Proposal to add 'angles' attribute to NXdata groups

rayosborn commented 2 years ago

NXdata groups that contain three-dimensional data representing S(Q) are often stored with axes that are parallel to the principal reciprocal lattice vectors (e.g., H, K, L). When transformed to create 3D pair-distribution functions (3D-PDF), the axes would then be parallel to the real-space lattice vectors (e.g., X, Y, Z). For many space groups, these vectors are not orthogonal, so when plotted, the axes should be skewed. In our data reduction package, NXRefine, we have demonstrated that it is possible to automate the process of plotting such data by adding a group-level attribute, called angles that contain the angles between the corresponding axes. This allows an application, like NeXpy, to plot the data with the appropriate skew angle.

This proposal is to make this feature a part of the NeXus standard. When combined with using the scaling_factor attribute on the axis fields to convert to reciprocal Ångstroms, this allows automatic plotting of single crystal data with aspect ratios that are correct in absolute units.

transform:NXdata
  @angles = [90.    70.373 90.   ]
  @axes = ['Ql', 'Qk', 'Qh']
  @signal = 'data'
  Qh = float64(1601)
    @long_name = 'H (rlu)'
    @scaling_factor = 0.4346933977453787
  Qk = float64(801)
    @long_name = 'K (rlu)'
    @scaling_factor = 1.738438275828753
  Ql = float64(1201)
    @long_name = 'L (rlu)'
    @scaling_factor = 0.6639555501603271
  data -> 30K/masked_transform.nxs['/entry/data/v']

Such a scheme would be confined to two- or three-dimensional data, since the attributes would need to store matrices for higher dimensions, e.g., S(Q, E).

woutdenolf commented 2 years ago

From my experience in crystallography this seems like a very unusual way of describing a non-standard basis (e.g. the unit cell basis or its reciprocal basis).

The common way is to provide the coordinates of the non-standard Euclidean basis with respect to the standard Euclidean basis. When concatenating these vectors as columns in a matrix, you have the transformation matrix which allows converting coordinates from space group to standard basis:

(x₁,x₂,...,x_n)=x₁e₁+x₂e₂+...+x_ne_n

Alternatively you could also use a metric tensor (called a Gram matrix in Euclidean space) to define the basis in which the coordinates are given. This calls for a separate base class imo. Either a class to define a basis in any metric space or perhaps be more practical and stick to Euclidean space. Then the NXdata group can have it if the plot axes need to be different from the data axes.

jonwright commented 2 years ago

Hi @woutdenolf, @rayosborn, You already have this in the UB matrix. The columns are the reciprocal space lattice vectors (a, b, c*) and they appear to be used as the basis here. For this case you might want to give the number of points sampling along each axis and refer to your UB that is in nxsample? Best, Jon

rayosborn commented 2 years ago

@woutdenolf , @jonwright, the point of this attribute is not to store crystallographic information. In other words, this is not proposed to be part of an application definition that is specific to single crystal diffraction. We already store the lattice parameters (a, b, c, α, β, γ) orientation_ matrix (i.e. U matrix), space group, etc, elsewhere as prescribed by the base class definitions. The purpose is to provide a generic way of plotting skew axes automatically for any kind of data stored in a NXdata group without requiring any prior knowledge of what the data means.

In NeXpy, if the option to plot with equal aspect ratio is activated, it then looks for this information to decide if there should be a skew angle applied. If a user switches plotting axes, NeXpy automatically plots with the new skew angle. The attribute just provides additional information about how to plot a NXdata group than contained in the signal and axes attributes. That is in the spirit of the NXdata definition - from the very beginning it was designed to allow generic plotting programs to determine how the data should be plotted, without having to look elsewhere in the file.

This has proved to be a significant time-saver over the past year since it was implemented. Before this, we had to look up the lattice parameters and manually set the aspect ratio and skew angles in the Customize Panel. Now, it is automatic by clicking a single button. I should think it's a useful convenience in other applications as well.

woutdenolf commented 2 years ago

the point of this attribute is not to store crystallographic information

Agreed. But the angles with the scaling factors basically define a Euclidean transformation in an unconventional way. I vote for using the more standard way of representing coordinate transformations (like the UB matrix does for a specific case).

rayosborn commented 2 years ago

I am worried that we have had problems in the past in NeXus with definitions that were, in principle, more rigorous, but in practice never implemented, either because they were unnecessarily complicated or nobody understood them or both. For this reason, I have argued repeatedly in the NIAC that we should never make changes to the standard without showing how they would be used in practical applications, preferably with actual code.

If you can define what such attributes would look like, perhaps amending my example above, and then provide at least pseudo-code that demonstrates how to use those attributes to plot the skewed data in a plotting package like Matplotlib, then I will happily look at it.

jonwright commented 2 years ago

@rayosborn , are you using this code to plot ? https://github.com/nexpy/nexpy/blob/3028a8fbf5860b0dc61726a4df04ca317861cdeb/src/nexpy/gui/plotview.py#L1094 In your picture, it seems the the "L (rlu)" axis is not coming out parallel to the lattice direction, and you are creating some vectors in the transform function. It seems plausible to write in terms of vectors directly, but perhaps there is some reason why not? Ideally you would want to skew the matplotlib axis too.

Someone else might have decided to make the plot using an affine transform of the image as described here and then add axes via a quiver: https://matplotlib.org/stable/gallery/images_contours_and_fields/affine_image.html

For that approach, they could decide to initialise the affine transform matrix using the axis vectors for the projection slice, or via an implied unit matrix and then call rotate, skew and scale to find the affine matrix vectors as suggested above? e.g.:

t0 = matplotlib.transforms.Affine2D( [[ x0, y0, 0], [ x1, y1, 0], [0, 0, 1]] )
t1 = matplotlib.transforms.Affine2D( ).scale(2,1.5).skew(30,5)
t2 = matplotlib.transforms.Affine2D( ).skew(5,30).scale(2,1.5)

For the generic plot question, it seems to correspond to "what is the vector along the array axis?". If we measure 2D image data by scanning motors underneath a sample, and they are not orthogonal then I prefer to know: what are the direction vectors in the laboratory frame? The lengths and angles between them are easy to find from vectors. The same question comes up for pixel directions in images (NXdetector -> [fast|slow]_pixel_direction, which appear to be specified as vectors currently.

Adding some recipes to show how to use axis vectors and transforms might be helpful?

rayosborn commented 2 years ago

@jonwright, thank you for taking the time to check. I would certainly welcome closer inspection of the code by outsiders. If I understand your point correctly, you are suggesting the L-axis itself should be skewed. This seemed to be the way Matplotlib implemented skewed axes and it never bothered me although I can see that some people might think that the L-axis is actually vertical. We have published images like the one above, and no referee has complained but I agree it could be a source of confusion. Perhaps I should update the documentation to make this clear.

The affine images you link to also seem to have orthogonal axes but I am guessing that Matplotlib does have options to skew the axes themselves. I implemented this a few years ago when it was part of the mpl_toolkits library, which I think represents experimental code. It is possible that I should be doing something different now. If you have suggestions, I am certainly willing to try them out.

jonwright commented 2 years ago

For the example I guess the beam stop is at (0,0) and the y axis should be inclined ? Counting the spots, I don't understand the axis scales ... was the horizontal axis doubled for a superstructure ?

Finding a matplotlib recipe seems like a puzzle worth solving ... I guess you would ideally want to plot arbitrary sections out of the 3D volume?

rayosborn commented 2 years ago

The plot shows data after the coordinate transform to reciprocal space, and since this is a slice at K=0, the beam stop is at the origin. There is no cell doubling. I'm showing a plot with a grid showing that only integer H,K,L values have peaks. NaxV2O5_transform Did I understand your question correctly?

We should probably move this discussion to the NeXpy Github, either opening a discussion or an issue, since it's not really a NIAC issue. As I said, I welcome feedback, and a PR would be even better.

jonwright commented 2 years ago

Sorry @rayosborn , I missed that the H=odd are all missing due to the space group (C2/m).

How to store S(Q) for a crystal in a nexus file seems more of a generic question (https://imaged11.readthedocs.io/en/latest/rsv_mapper.html).

This issue suggests an axis in NXdata could usefully include an NXtransformation description?

woutdenolf commented 2 years ago

I am worried that we have had problems in the past in NeXus with definitions that were, in principle, more rigorous, but in practice never implemented, either because they were unnecessarily complicated or nobody understood them or both.

Good point. So not metrix tensors but simple affine transformations then. I'll make a proposal when I find some time.

woutdenolf commented 2 years ago

What I propose is this (example in 3D can be used for nD)

myplot:NXdata
  change_of_frame = float64(4, 4)
    @transformation_type = 'affine'
  @axes = ['Ql', 'Qk', 'Qh']
  Qh = float64(1601)
    @long_name = 'H (rlu)'
  Qk = float64(801)
    @long_name = 'K (rlu)'
  Ql = float64(1201)
    @long_name = 'L (rlu)'

A detailed description of what this means can be found here. In the jupyter notebook I provide motivation, code examples, theory and comparison to other NeXus classes.

Adding this change-of-frame matrix to NXdata allows data with axis coordinates in the data frame

to be plotted in a different frame (the plotting frame)

The 2D example above cannot be handled by the angles representation due to the origin shift.

The benefits proposal 2 (change-of-frame matrix) has with respect to proposal 1 (angles and scaling):

the representation of the reference frame is better suited for plotting libraries
proposal 1 is biased towards crystallography
proposal 1 does not allow translations which is the most basic transformation
you can derive proposal 1 from proposal 2 but NOT vice versa (missing degrees of freedom)

rayosborn commented 2 years ago

@woutdenolf, thank you so much for putting in all this work. I will look through the Jupyter Notebook to make sure I fully understand what you are proposing, but if it looks practical to implement, I will start testing it out in NeXpy as soon as I can. We have quite a few files using the angles attribute, but I'm used to handling backward-compatibility issues in NeXus.

rayosborn commented 2 years ago

This looks very promising. I have been looking through the reference frames and am pleased with the plotted results, which you also show above. You show how to plot affine transformations in Matplotlib, but I don't think you actually create a NXdata group with the proposed new definition and show how it generates those plots. I think it will be helpful to see how the change_of_frames array is used in practice, particularly for the more general case of (n+1)x(n+1) transforms, rather than inferring it from the test function, which only shows a 3x3 example.

If you don't want to have to save the example data to a file, in the nexusformat package, you can generate NeXus groups on-the-fly. So an example would be as simple as:

from nexusformat.nexus import NXdata, NXfield

a = 70.0
signal = NXfield(np.array(...), name='data')
Qh = NXfield(np.linspace(...), name='h')
Qk = NXfield(np.linspace(...), name='k')
data = NXdata(signal, (Qk, Qh))
data.change_of_frame = [[]]

transformed axes = plot_transform(data)
plt.figure()
...

Is it possible to write a generic plot_transform function that returns a usable result, or is it too dependent on how the plotting package does the transformations?

I will look into producing a test implementation of this within the nexusformat package, although I might restrict myself for the time being to 3x3 change-of-basis transformations and leave translations until later. In the standard, I would guess that we could allow the change_of_frame matrix to be either (3x3) or (4x4), to make it simpler for users who don't need translations.

P.S. I'm not sure I like the name change_of_frame. My vote would be for something like frame_transform.

rayosborn commented 2 years ago

I would also appreciate an example where the change_of_frame matrix includes changes in the vector lengths. I realize this can be inferred from the examples, but it requires reverse-engineering the code to work out what is going on and so is not as useful to the general user. Apart from this, I think this could be a major addition to the standard so thanks again for all the work.

woutdenolf commented 2 years ago

but I don't think you actually create a NXdata group with the proposed new definition

Why not? The call plot_nxdata2d(signal, axis0, axis1) plots the NXdata with matplotlib without cof and plot_nxdata2d_transform(signal, axis0, axis1, cofxy) plots the NXdata with matplotlib with cof. I didn't actually save the data in HDF5 but it makes no difference. Those function parameters are the NXdata fields (signal, axes and cof).

Is it possible to write a generic plot_transform function

Yes I guess you could do a transformation+regridding based on the cof with NaN values on pixels/voxels that have no data. This might not be the fastest and memory efficient way though.

I would also appreciate an example where the change_of_frame matrix includes changes in the vector lengths

It can describe any geometric transform, changing vector lengths is included. The example is complex enough I would think: it contains non-isotropic scaling, shear and translation.

I'm not sure I like the name change_of_frame. My vote would be for something like frame_transform.

We can discuss this but note that there is often a confusion between the change-of-frame matrix (new frame with respect to old frame) and the coordinate transformation matrix (convert old coordinates to new frame). They are each other's inverse. The name frame_transform could refer to either. Of course we will properly document it but still.

I would guess that we could allow the change_of_frame matrix to be either (3x3) or (4x4),

That's what the transformation_type is for. You can specify that your transformation does not have a translation (the type with be linear, which is affine minus translation).

Note that you can easily convert the cof to your convention and then you can keep using your code which already deals with the convention your proposal as I understand.

Ultimately none of this is complicated: we are simply adding a coordinate transformation to NXdata. My proposal does this by giving the associated cof which is the most common way in which a reference frame is represented in data science, computer science and mathematics.

woutdenolf commented 2 years ago

If you give me a real HDF5 file that uses your representation I can give the equivalent in my representation.

rayosborn commented 2 years ago

Perhaps I phrased the comment clumsily. The point is not that the components of a NXdata group are not present, but that they are never explicitly defined as part of a NXdata group. I am assuming that this notebook will form the basis of documentation for an end user, but the connection to NeXus is not explicit because at no point can you show the NeXus file tree. You have variables called img, axis0, axis1, and cof, so I can guess how these form a NXdata group, but that's because I have spent much of my life working with them. A new user will have no idea how this relates to NeXus.

I know some people seem reluctant to use the nexusformat package, but in documentation, it has the advantage of directly instancing actual NeXus groups, fields, and attributes, which can then be printed as a tree, making the proposed structure of the NeXus file and the connection to any NXDL documentation completely transparent. This can be done without having to save anything to a file. I will produce a version of your notebook, which will, I think, make this clearer. It will also help me understand exactly what you have done.

Thanks for the clarification of transform types, i.e., affine vs linear. Again, I think it is worth showing these examples in NeXus trees. I think the proposal is looking very strong, so I think this is likely to be an issue of what documentation style works best for a new user.

woutdenolf commented 2 years ago

I modified the jupyter notebook in the hope it clarifies things: the cof describes the plotting frame (given by the two white arrows) in the data frame.

woutdenolf commented 2 years ago

I am assuming that this notebook will form the basis of documentation for an end user

No, definitely not (too elaborate). This notebook can be included in the Nexus documentation as a rigorous description of reference frames in NeXus (the part with proposal 1 and proposal 2 will be removed).

The NXdata documentation will contain more or less the section "Proposal 2: _change_offrame attribute" of the notebook, perhaps with the example I provided in the previous comment

woutdenolf commented 2 years ago

I messed up some of the documentation in the notebook (it is fixed now). It should say

T is the position of the origin of the plotting frame in the data frame and the columns of C are the coordinates of the basis vectors of the plotting frame with respect to the basis of the data frame

woutdenolf commented 2 years ago

Hopefully the example shows why it is easier to provide the change-of-frame matrix and not the coordinate transformation matrix (its inverse, btw the inverse might not even exist).

rayosborn commented 2 years ago

@woutdenolf, I have started to look into this again. The crystallography example is very nice, but it does the inverse of the use case that was behind this proposal, i.e., it assumes that the reciprocal lattice axes are skewed in the raw data and that the Affine2D transform then plots them as orthogonal. This is because you have chosen to plot axes that make the upper-left 2x2 matrix of cofxy diagonal. In the example that I gave at the start of this issue, it starts with a grid that is rectilinear in the reciprocal lattice but is then skewed by beta. For this case, the cofxy sub-matrix would have to be non-diagonal. Would it be possible to modify the crystallography example, since the purpose of this proposal was to accommodate this real-life use case and I think it is a problem that will be more commonly encountered by users.

woutdenolf commented 2 years ago

Sure, I'll make a proper MR with the example as you suggested (after I'm back from holidays). Or perhaps keep the two use cases? Skew straight axes or straighten skewed axes, in the end it's all the same principal of course.

woutdenolf commented 1 year ago

@rayosborn I made some progress: https://github.com/nexusformat/definitions/pull/1033

NXdata rendering of the new transformation matrix field:

https://hdf5.gitlab-pages.esrf.fr/nexus/nxdata_transformation/classes/base_classes/NXdata.html#nxdata-transformation-field

Additional documentation and code examples for reference frames:

https://hdf5.gitlab-pages.esrf.fr/nexus/nxdata_transformation/notebooks/index.html

I modified the example to be closer to your use case.

I'm not fully done but this should be more or less it.

nexusformat / NIAC

Proposal to add 'angles' attribute to NXdata groups #102