pyxem / diffsims

An open-source Python library providing utilities for simulating diffraction
https://diffsims.readthedocs.io
GNU General Public License v3.0

Better Defining Diffsims Purpose #170

Open CSSFrancis opened 3 years ago

CSSFrancis commented 3 years ago

I realize that this feature request is pretty vague, but I wanted to bring up some changes to diffsims that I am interested in making, most of which involve adding more tools for visualization and labeling of simulations.

I think diffsims performs simulations very well, but it struggles when you have to visualize these simulations or put them into context (partially because originally it was never really meant for visualization).

A couple of examples where some additional information might be helpful:

In general I think that diffsims could be particularly useful as a substitute for something like JEMS: simulating structures from CIF files, showing zone axis patterns, etc. But that would probably require leaning on pyxem/hyperspy pretty heavily for visualization and labeling (which I think was originally something that diffsims wanted to avoid).

There is also the option to do this within the scope of diffsims (and not add pyxem as a dependency), but I feel like that would mean recreating many of the features already in pyxem/hyperspy.

@din14970 and @hakonanes you might have insights into this as well.

din14970 commented 3 years ago

Thanks for pinging me for the discussion @CSSFrancis

For me, very simply, the scope of diffsims is to easily and flexibly calculate kinematical diffraction intensities. An expanded scope may be including dynamical diffraction in the calculation. I don't think multislice simulations for e.g. CBED patterns are in the scope, but I'm open to the discussion. Overall, the output of the simulation is a list of coordinates in reciprocal space (1D = profile, 2D = ED pattern), their Miller indices, and a list of associated intensities.
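To make "kinematical diffraction intensities" concrete, here is a minimal sketch of the textbook structure factor sum, F_hkl = Σ_j f_j exp(2πi (h x_j + k y_j + l z_j)), evaluated for an fcc cell. It is not the diffsims implementation: the function name `structure_factor` is hypothetical, and the scattering factors are assumed constant (f = 1) rather than taken from tabulated, angle-dependent values.

```python
import numpy as np


def structure_factor(hkl, fractional_positions, scattering_factors):
    """Kinematical structure factor F_hkl = sum_j f_j * exp(2*pi*i * hkl . r_j).

    Hypothetical helper for illustration only; f_j is assumed constant.
    """
    hkl = np.atleast_2d(hkl)                         # (n_reflections, 3)
    positions = np.asarray(fractional_positions)     # (n_atoms, 3)
    f = np.asarray(scattering_factors, dtype=float)  # (n_atoms,)
    phase = 2j * np.pi * hkl @ positions.T           # (n_reflections, n_atoms)
    return np.exp(phase) @ f


# Four atoms of a conventional fcc cell, in fractional coordinates
fcc_positions = [(0, 0, 0), (0.5, 0.5, 0), (0.5, 0, 0.5), (0, 0.5, 0.5)]
hkl = np.array([(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 0, 0)])

F = structure_factor(hkl, fcc_positions, np.ones(4))
intensities = np.abs(F) ** 2  # kinematical intensity is proportional to |F|^2
```

With these assumptions the mixed-parity reflections (100) and (110) cancel, while (111) and (200) survive, which is the usual fcc selection rule. The result is exactly the kind of output described above: a list of Miller indices with associated intensities.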

When I started working on diffsims I was also missing this quick and dirty visualization functionality and added the plot method (I first added it as a function and @pc494 moved it to a more appropriate location) https://github.com/pyxem/diffsims/blob/b239a7e83ee38751fdc58c9b9797ab58aeef4567/diffsims/sims/diffraction_simulation.py#L217, so perhaps you can work from this. For a couple of figures I made, I also wanted the option for labels to create something like this: [image]

Electron spot patterns including diffraction spot labels

So if this is what you had in mind I can share the code for the image above, perhaps it can be added into the plot method with another kwarg. Modifications may be necessary to deal with hexagonal indices though. What one must keep in mind is that this is somewhat slow and inefficient: it loops over all the spots with a Python for loop and adds an axis.text for each spot whose intensity is above a certain threshold. I don't know if there is a better way to do it in Python though. For a single pattern it doesn't matter much but if you want speed it may start being painful for complex patterns.
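The labeling loop described above could look roughly like the following. This is a sketch under assumptions, not the code used for the figure: the function name `plot_labeled_spots`, its arguments and the label formatting are all hypothetical, and only standard Matplotlib calls (`Axes.scatter`, `Axes.text`) are used.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def plot_labeled_spots(ax, coordinates, indices, intensities, threshold=0.1):
    """Scatter the spots and label those brighter than `threshold`.

    Hypothetical helper: one ax.text call per labeled spot, as in the
    loop described in the comment above.
    """
    coordinates = np.asarray(coordinates, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    ax.scatter(coordinates[:, 0], coordinates[:, 1], s=100 * intensities)
    labels = []
    for (x, y), hkl, intensity in zip(coordinates, indices, intensities):
        if intensity > threshold:
            text = " ".join(str(i) for i in hkl)
            labels.append(ax.text(x, y, text, ha="center", va="bottom"))
    ax.set_aspect("equal")
    return labels


# Made-up spot positions (nm^-1), Miller indices and relative intensities
coordinates = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (0.5, 0.5)]
indices = [(0, 0, 0), (2, 0, 0), (0, 2, 0), (2, 2, 0)]
intensities = [1.0, 0.6, 0.6, 0.05]

fig, ax = plt.subplots()
labels = plot_labeled_spots(ax, coordinates, indices, intensities, threshold=0.1)
```

The cost is one text artist per labeled spot, so the threshold doubles as a way to keep both the clutter and the drawing time down for dense patterns.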

Electron profile data including labels for peaks

Because I never really cared much about these I haven't looked into them.

Diffraction Simulations returning DiffractionSignals with proper metadata scales etc. @pc494 what is your opinion on adding in a pyxem dependency?

In my opinion this is maybe the most contentious issue. Correct me if I'm wrong in my interpretation of your comment. I would not be in favor of a PyXem dependency, firstly because it could make imports circular and probably difficult to maintain. Furthermore, I would not agree with making the diffraction simulation return a DiffractionSignal, as per my understanding these structures are meant for N-D datasets, whereas a spot pattern is really a labeled point cloud. Simulations of spot patterns can be abstracted from detector quantization and pixels; all you need to describe them are x-y coordinates in g-space and intensities. Only once you want to compare them to an image do you need a calibration value (pixels/nm^-1). Storing each simulated pattern as a very sparse "image" by default would be extremely wasteful and would break a lot of code.

What I would be in favor of would be improving the ways of converting the spot pattern as it exists now into images that look like more realistic simulations. Currently, there is the get_diffraction_pattern method https://github.com/pyxem/diffsims/blob/b239a7e83ee38751fdc58c9b9797ab58aeef4567/diffsims/sims/diffraction_simulation.py#L174 and I also implemented get_as_mask https://github.com/pyxem/diffsims/blob/b239a7e83ee38751fdc58c9b9797ab58aeef4567/diffsims/sims/diffraction_simulation.py#L128 which are two options for turning the pattern into an image representation for different purposes. In this case I would agree it would be preferable to return hyperspy-like objects with metadata, because scale information in these images is lost.

To avoid Hyperspy/Pyxem dependency, would it not be an option to add functionality in PyXem to accept a DiffractionSimulation object and construct a Diffraction2D object out of it?
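The step from a labeled point cloud to an image can be sketched in a few lines of NumPy. This is only an illustration of where the calibration value enters, not the implementation behind get_diffraction_pattern or get_as_mask; the name `spots_to_image` and the choice of putting the direct beam at the image centre are assumptions.

```python
import numpy as np


def spots_to_image(coordinates, intensities, shape, calibration):
    """Rasterise (x, y) spot coordinates in nm^-1 onto a pixel grid.

    `calibration` is in pixels per nm^-1. Hypothetical helper; the direct
    beam is assumed to sit at the centre pixel of the image.
    """
    coordinates = np.asarray(coordinates, dtype=float)
    intensities = np.asarray(intensities, dtype=float)
    rows, cols = shape
    col_idx = np.rint(coordinates[:, 0] * calibration + cols // 2).astype(int)
    row_idx = np.rint(coordinates[:, 1] * calibration + rows // 2).astype(int)
    # Drop spots that fall outside the detector
    inside = (col_idx >= 0) & (col_idx < cols) & (row_idx >= 0) & (row_idx < rows)
    image = np.zeros(shape)
    # np.add.at accumulates correctly if two spots land in the same pixel
    np.add.at(image, (row_idx[inside], col_idx[inside]), intensities[inside])
    return image


coordinates = [(0.0, 0.0), (1.0, 0.0), (0.0, -2.0), (10.0, 0.0)]  # nm^-1
intensities = [1.0, 0.5, 0.25, 0.9]
image = spots_to_image(coordinates, intensities, shape=(64, 64), calibration=10.0)
```

In this example only three of the 4096 pixels are non-zero (the fourth spot falls off the grid), which illustrates the point made above about sparse images being a wasteful default representation.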

Zone Axis Patterns returned with better navigation

Could you clarify what you mean by this?

hakonanes commented 3 years ago

Thanks for the ping @CSSFrancis!

Brief background on the use of diffsims in kikuchipy: we only use the diffsims.crystallography.ReciprocalLatticePoint to perform geometrical EBSD simulations. It only works for cubic structures, which I plan to improve. I also plan to move the EBSD simulations from kikuchipy to diffsims. I then want to extend the EBSD simulations to band profiles, and perhaps down the road add some intensity profiles to those bands.

Based on this use, I see diffsims as a package to implement all the physics in, while I do all pattern analysis in kikuchipy. In my opinion, this means that diffsims should only, as @din14970 says, return band indices, detector positions, and intensities. Generation of simulated EBSD patterns I can do in kikuchipy. Similarly, I don't want any crystallography or orientation analysis in kikuchipy, as we have orix for this.

I think pyxem using diffsims can be a replacement for JEMS, but not diffsims by itself. diffsims cannot depend on pyxem unfortunately, since pyxem depends on diffsims.

Quick plotting is nice when prototyping a longer workflow, and this is I think one of the greatest strengths of Python. HyperSpy is great at interactive plotting, and its marker functionality, which we use in kikuchipy (see the geometrical EBSD simulation link above), is powerful. I think a workflow in which a simulation is created in diffsims and plotted with pyxem (HyperSpy) is the best option. Therefore, I recommend that visualization code specific to spot pattern simulations should be in pyxem.

pc494 commented 3 years ago

I agree broadly with what has been said so far. My only comments would be

1) I think diffsims is the right place for code that converts 'simulations' (i.e. coordinates and intensities) into 'patterns' (i.e. the thing you see on the detector), as this allows us to store noise models all in one place

2) I think hyperspy-style operations are best placed in pyxem, where they can take diffsims objects as arguments
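As an illustration of what a "noise model" in point 1 might look like, here is a minimal sketch that applies Poisson shot noise and Gaussian readout noise to a noiseless pattern. It is an assumption about the kind of code meant, not an existing diffsims function; `add_detector_noise`, `counts_per_unit` and `readout_sigma` are hypothetical names.

```python
import numpy as np


def add_detector_noise(pattern, counts_per_unit=1000.0, readout_sigma=2.0, seed=None):
    """Apply shot noise and readout noise to a noiseless pattern.

    Hypothetical helper: intensities are scaled to expected counts,
    Poisson-sampled, then Gaussian readout noise is added.
    """
    rng = np.random.default_rng(seed)
    expected = np.asarray(pattern, dtype=float) * counts_per_unit
    counts = rng.poisson(expected).astype(float)
    counts += rng.normal(0.0, readout_sigma, size=counts.shape)
    return counts


# A toy pattern with a direct beam and one weaker reflection
pattern = np.zeros((32, 32))
pattern[16, 16] = 1.0
pattern[16, 24] = 0.3

noisy = add_detector_noise(pattern, seed=0)
```

Passing a seed makes the result reproducible, which is convenient when simulated patterns are used in tests or template libraries.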

hakonanes commented 3 years ago

1) I think diffsims is the right place for code that converts 'simulations' (i.e. coordinates and intensities) into 'patterns' (i.e. the thing you see on the detector), as this allows us to store noise models all in one place

Yes, I agree that numpy arrays of simulated patterns should be possible to get from diffsims.

CSSFrancis commented 3 years ago

Thanks for all of the responses. I was kind of naive in my thinking about how to better integrate pyxem and diffsims, so I will give a little more thought to how the two packages can work together.

@din14970

For me, very simply, the scope of diffsims is to easily and flexibly calculate kinematical diffraction intensities. An expanded scope may be including dynamical diffraction in the calculation. I don't think multislice simulations for e.g. CBED patterns are in the scope, but I'm open to the discussion. Overall, the output of the simulation is a list of coordinates in reciprocal space (1D = profile, 2D = ED pattern), their Miller indices, and a list of associated intensities.

I think eventually the idea would be to include something like multislice and prismatic-type simulations (see #6). There is already a framework for kinematic CBED patterns and at least the intention to add more.

https://github.com/pyxem/diffsims/blob/b239a7e83ee38751fdc58c9b9797ab58aeef4567/diffsims/generators/diffraction_generator.py#L534-L545

For me this is probably what I am most interested in adding. I think that it's nice to have the ability to build simple structures, simulate their kinematic diffraction and then their dynamical diffraction all in one place.

So if this is what you had in mind I can share the code for the image above, perhaps it can be added into the plot method with another kwarg. Modifications may be necessary to deal with hexagonal indices though.

What one must keep in mind is that this is somewhat slow and inefficient: it loops over all the spots with a Python for loop and adds an axis.text for each spot whose intensity is above a certain threshold. I don't know if there is a better way to do it in Python though. For a single pattern it doesn't matter much but if you want speed it may start being painful for complex patterns.

I've actually done something quite similar! I think for the most part speed shouldn't be that big of an issue. If it is mostly for visualization purposes then it doesn't need to be incredibly fast. Maybe just adding that to the quick and dirty plot method as another argument would be useful.

In my opinion this is maybe the most contentious issue. Correct me if I'm wrong in my interpretation of your comment. I would not be in favor of a PyXem dependency, firstly because it could make imports circular and probably difficult to maintain. Furthermore, I would not agree with making the diffraction simulation return a DiffractionSignal, as per my understanding these structures are meant for N-D datasets, whereas a spot pattern is really a labeled point cloud. Simulations of spot patterns can be abstracted from detector quantization and pixels; all you need to describe them are x-y coordinates in g-space and intensities. Only once you want to compare them to an image do you need a calibration value (pixels/nm^-1). Storing each simulated pattern as a very sparse "image" by default would be extremely wasteful and would break a lot of code.

What I would be in favor of would be improving the ways of converting the spot pattern as it exists now into images that look like more realistic simulations. Currently, there is the get_diffraction_pattern method https://github.com/pyxem/diffsims/blob/b239a7e83ee38751fdc58c9b9797ab58aeef4567/diffsims/sims/diffraction_simulation.py#L174

and I also implemented get_as_mask https://github.com/pyxem/diffsims/blob/b239a7e83ee38751fdc58c9b9797ab58aeef4567/diffsims/sims/diffraction_simulation.py#L128

which are two options for turning the pattern into an image representation for different purposes. In this case I would agree it would be preferable to return hyperspy-like objects with metadata, because scale information in these images is lost. To avoid Hyperspy/Pyxem dependency, would it not be an option to add functionality in PyXem to accept a DiffractionSimulation object and construct a Diffraction2D object out of it?

Yes, I completely agree that the hyperspy/pyxem dependency isn't what we want. I don't think I was thinking things all the way through last night.

Zone Axis Patterns returned with better navigation

Could you clarify what you mean by this?

I don't think I really did a good job of explaining this and it is mostly because I haven't used this part of diffsims that often and haven't really taken a dive into orix. Mostly this is inspired by a desire to make simulated images like this one from you @din14970 ...

[image]

Or take it a step further and make a signal whose navigation axes are the stereographic projection and whose signal axes show the diffraction pattern.

Mostly I am just tired of using paid software like JEMS for things like that as it makes it difficult to share with collaborators, use on home computers and teach students with. My adviser teaches a class which uses JEMS heavily that I would like to convert to using pyxem/diffsims/hyperspy (including some detailed jupyter notebooks on how to use diffsims).

What I propose we should do is:

din14970 commented 3 years ago

For me this is probably what I am most interested in adding. I think that it's nice to have the ability to build simple structures, simulate their kinematic diffraction and then their dynamical diffraction all in one place.

Maybe you are already aware but if full scale diffraction simulations are of interest I think it is probably wisest to take a look at abTEM and see whether it is possible to integrate/create wrappers around that, although I'm wondering whether it is even instructive to do so. Performance is excellent, I'm using this for high throughput multislice simulations on the cluster. Much easier to get it running than prismatic, since it's pure Python (numba + cupy). They also did a very nice job with documentation. It can already simulate pretty much anything with multislice or the prismatic algorithm including 4D-STEM datasets. I think for more advanced simulations in diffsims we will have to introduce concepts of actual cells of atoms with a thickness and size as well as probably add an ASE dependency.

I've actually done something quite similar! I think for the most part speed shouldn't be that big of an issue. If it is mostly for visualization purposes then it doesn't need to be incredibly fast. Maybe just adding that to the quick and dirty plot method as another argument would be useful.

I can try to make a PR soonish

Or take it a step further and make a signal whose navigation axes are the stereographic projection and whose signal axes show the diffraction pattern.

Aha so basically on one side a stereographic plot that is interactive and on the other the simulated pattern. Note that this only works when one simulates a library of patterns based on beam directions with the in-plane angle constrained - in the general case orientations exist on SO(3) which you can't visualize with a stereographic projection. So it would only apply for a specific kind of diffraction pattern library.
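For reference, the stereographic projection of beam directions mentioned here can be written as X = x / (1 + z), Y = y / (1 + z) for unit vectors on the upper hemisphere. The sketch below assumes a cubic crystal, so that [uvw] directions coincide with Cartesian vectors; the function name is hypothetical, and orix provides its own, more complete tools for this.

```python
import numpy as np


def stereographic_projection(directions):
    """Project upper-hemisphere directions onto the equatorial plane.

    Hypothetical helper using X = x / (1 + z), Y = y / (1 + z) after
    normalising each direction to unit length.
    """
    v = np.asarray(directions, dtype=float)
    v = v / np.linalg.norm(v, axis=-1, keepdims=True)
    if np.any(v[..., 2] < 0):
        raise ValueError("only upper-hemisphere directions (z >= 0) are handled")
    return v[..., :2] / (1.0 + v[..., 2:3])


# Cubic zone axes [001], [100], [111] and [011] as Cartesian vectors
zone_axes = np.array([(0, 0, 1), (1, 0, 0), (1, 1, 1), (0, 1, 1)])
xy = stereographic_projection(zone_axes)
```

Each beam direction maps to only two numbers, which is why an in-plane rotation about the beam cannot be represented on such a plot: a full orientation on SO(3) needs three.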

What would be generalizable and interactive would be a 3D crystal plot on one side with the ASE ngl viewer or so and the ability to rotate it with the mouse (or set the orientation with some fields), then the simulated pattern plot on the other side and continuously updated. This seems more of a mini-project on its own though.

I think most of the other visualisations that you may be thinking of are in progress in orix, see https://github.com/pyxem/orix/issues/166 or https://github.com/pyxem/orix/pull/158. Some of the code in there I used to create the image you link.

Mostly I am just tired of using paid software like JEMS for things like that as it makes it difficult to share

I agree that frustration is a strong driving force for just making your own stuff. That's why I basically re-implemented the ASTAR indexing algorithm.

Every Simulation class has a to_pyxem() method which creates the axes, metadata and data needed to create the DiffractionSignal

If you intend to implement this in diffsims, how do you avoid the hyperspy dependency? In my view it may be cleaner to add stuff in pyxem that just accepts the diffsims simulation objects. The diffsims objects may then have methods that turn their data into an image somehow. What I would perhaps be in favor of is somehow adding more metadata into diffraction simulation objects. For example, the crystal information and orientation do not get stored in there. When simulating an entire library it might be wasteful to do so however.

In general add in some quick plotting abilities and maybe create a more involved visualization tool in pyxem for visualizing stereographic projections etc.

I would also be in favor of more hidden orix and diffsims use directly from pyxem to make visualisations or do calculations; at the moment notebooks get very messy with all the different imports. But I think at this point most of the visualization functionality is quite new and exists mostly thanks to the efforts of @hakonanes. So it will just need some time to mature I think.

CSSFrancis commented 3 years ago

Maybe you are already aware but if full scale diffraction simulations are of interest I think it is probably wisest to take a look at abTEM and see whether it is possible to integrate/create wrappers around that, although I'm wondering whether it is even instructive to do so. Performance is excellent, I'm using this for high throughput multislice simulations on the cluster. Much easier to get it running than prismatic, since it's pure Python (numba + cupy). They also did a very nice job with documentation. It can already simulate pretty much anything with multislice or the prismatic algorithm including 4D-STEM datasets. I think for more advanced simulations in diffsims we will have to introduce concepts of actual cells of atoms with a thickness and size as well as probably add an ASE dependency.

This is actually exactly what I have been looking for, thanks for the link! I'll play around with it a little bit and maybe we can think about adding it or integrating it in some way?

Aha so basically on one side a stereographic plot that is interactive and on the other the simulated pattern. Note that this only works when one simulates a library of patterns based on beam directions with the in-plane angle constrained - in the general case orientations exist on SO(3) which you can't visualize with a stereographic projection. So it would only apply for a specific kind of diffraction pattern library.

What would be generalizable and interactive would be a 3D crystal plot on one side with the ASE ngl viewer or so and the ability to rotate it with the mouse (or set the orientation with some fields), then the simulated pattern plot on the other side and continuously updated. This seems more of a mini-project on its own though. I agree that frustration is a strong driving force for just making your own stuff. That's why I basically re-implemented the ASTAR indexing algorithm.

Yeah, maybe I'll play around with this as more of a GUI or notebook-widget extension of diffsims. I've been meaning to do something like that just to help with teaching.

Every Simulation class has a to_pyxem() method which creates the axes, metadata and data needed to create the DiffractionSignal

If you intend to implement this in diffsims, how do you avoid the hyperspy dependency? In my view it may be cleaner to add stuff in pyxem that just accepts the diffsims simulation objects. The diffsims objects may then have methods that turn their data into an image somehow. What I would perhaps be in favor of is somehow adding more metadata into diffraction simulation objects. For example, the crystal information and orientation do not get stored in there. When simulating an entire library it might be wasteful to do so however.

Sorry, I don't think I explained that very well. The to_pyxem method would just return the necessary inputs to build a Signal2D or Signal1D. So it would return a dictionary, i.e. signal_dict = {"data": data, "axes": axes_dict, "metadata": metadata_dict}, and then you could just create a DiffractionSignal with DiffractionSignal2D(**signal_dict) or something like that.
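A sketch of what such a dictionary could contain, built with NumPy only so that no hyperspy import is needed on the diffsims side. Everything here is an assumption for illustration: the function name `to_signal_dict`, the axis names, and the `Simulation` metadata node are hypothetical. The axis keys (`name`, `size`, `scale`, `offset`, `units`, `navigate`) follow the axis dictionaries HyperSpy signals accept, but the exact class to instantiate on the pyxem side is left open.

```python
import numpy as np


def to_signal_dict(image, calibration, phase_name=None):
    """Bundle an image with axes and metadata as plain Python objects.

    Hypothetical helper. `calibration` is in pixels per nm^-1 and the
    direct beam is assumed to sit at the centre pixel.
    """
    image = np.asarray(image)
    rows, cols = image.shape
    scale = 1.0 / calibration  # nm^-1 per pixel
    axes = [
        {"name": "ky", "size": rows, "scale": scale,
         "offset": -(rows // 2) * scale, "units": "nm^-1", "navigate": False},
        {"name": "kx", "size": cols, "scale": scale,
         "offset": -(cols // 2) * scale, "units": "nm^-1", "navigate": False},
    ]
    metadata = {
        "General": {"title": "Simulated diffraction pattern"},
        "Simulation": {"phase": phase_name},  # hypothetical metadata node
    }
    return {"data": image, "axes": axes, "metadata": metadata}


signal_dict = to_signal_dict(np.zeros((64, 64)), calibration=10.0, phase_name="Al")

# On the pyxem/hyperspy side one would then do something like the following
# (assumed usage, not executed here to keep the sketch dependency-free):
#   import hyperspy.api as hs
#   signal = hs.signals.Signal2D(**signal_dict)
```

The design point is that the dictionary holds only built-in types and a NumPy array, so diffsims stays free of a hyperspy dependency while the scale information survives the conversion.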

I would also be in favor of more hidden orix and diffsims use directly from pyxem to make visualisations or do calculations; at the moment notebooks get very messy with all the different imports. But I think at this point most of the visualization functionality is quite new and exists mostly thanks to the efforts of @hakonanes. So it will just need some time to mature I think.

That is a good point about adding things to pyxem. A more hidden diffsims/orix is probably the way to go as you can fairly easily get lost in the weeds which I think dissuades users. From a higher level perspective it might be interesting to look at https://github.com/hyperspy/hyperspy/issues/2398 and possibly use or expand that to use vectors as navigation signals.

CSSFrancis commented 3 years ago

After looking through ASE a little more I wonder if it would be a good dependency. Could it be used to replace some of the places where diffpy is used (and maybe completely replace it)?

I think diffpy does a lot of things well, but there is a bit of a learning curve when using it, it isn't very actively supported, and the documentation is patchy at best...

We could fairly easily slot in ASE and abTEM and I think it would add a fair bit of functionality, but it does raise the question: are we actually providing any help to a user, or are we just making things unnecessarily complicated?

Maybe a better option would be to see if we can load and integrate abTEM outputs effectively into hyperspy/pyxem and then link to those libraries as options for dynamical simulations.

din14970 commented 3 years ago

After looking through ASE a little more I wonder if it would be a good dependency. Could it be used to replace some of the places where diffpy is used (and maybe completely replace it)?

As far as I've seen it's mainly used to load CIF files and if I'm not mistaken also to construct reciprocal space grids. ASE is probably a more reliable library indeed as it's used very intensively in the atomistics community. So if ASE can completely replace diffpy it makes sense to do it.

We could fairly easily slot in ASE and abTEM

I think ASE makes things easier, also to visualise the crystal in the jupyter notebook. I'm not sure whether slotting in abTEM will do anything - it will make diffsims more complicated and larger, and I'm not sure why one would need a wrapper around an already quite easy to use package. Again, since with diffsims the focus is on spot patterns or lists of reflections but abTEM produces images, I would wager PyXem would benefit more from an abTEM dependency in order to simulate + analyze 4D-STEM datasets for example.

Maybe a better option would be to see if we can load and integrate abTEM outputs effectively into hyperspy/pyxem and then link to those libraries as options for dynamical simulations.

I once had a chat with the creator where I urged him to make the outputs of abTEM hyperspy compliant. I don't think it would be such a big challenge because he outputs the data in hdf5 format anyway. He was receptive to the idea, but I'm not sure how far along he is on it or whether he is working on it at all. But basically if he would output .hspy files with all the right metadata this would already be a great step in the right direction - anyway these simulations tend to take a long time so you usually want to write to disk before you do anything else.

CSSFrancis commented 3 years ago

As far as I've seen it's mainly used to load CIF files and if I'm not mistaken also to construct reciprocal space grids. ASE is probably a more reliable library indeed as it's used very intensively in the atomistics community. So if ASE can completely replace diffpy it makes sense to do it.

@pc494 would know more here...

I think ASE makes things easier, also to visualise the crystal in the jupyter notebook. I'm not sure whether slotting in abTEM will do anything - it will make diffsims more complicated and larger, and I'm not sure why one would need a wrapper around an already quite easy to use package. Again, since with diffsims the focus is on spot patterns or lists of reflections but abTEM produces images, I would wager PyXem would benefit more from an abTEM dependency in order to simulate + analyze 4D-STEM datasets for example.

I'm not sure. On one hand it might be nice to look at a spot pattern and then move towards looking at kinematic CBED simulations and then dynamical simulations. On the other hand we could try to focus diffsims more on spot patterns and leave the other stuff to other packages.

I once had a chat with the creator where I urged him to make the outputs of abTEM hyperspy compliant. I don't think it would be such a big challenge because he outputs the data in hdf5 format anyway. He was receptive to the idea, but I'm not sure how far along he is on it or whether he is working on it at all. But basically if he would output .hspy files with all the right metadata this would already be a great step in the right direction - anyway these simulations tend to take a long time so you usually want to write to disk before you do anything else.

I guess the other thing to do would be just to offer to add hyperspy compatibility for them; I've done it a couple of times and it isn't terribly difficult. There is just a fairly steep learning curve, so it might be easier for me to do.

hakonanes commented 3 years ago

I think diffpy does a lot of things well, but there is a bit of a learning curve when using it, it isn't very actively supported, and the documentation is patchy at best...

In orix we have the Phase class which stores a diffpy.structure Structure (with a lattice and atoms) together with the crystal symmetry (space group and point group). I find the diffpy.structure API easy to navigate, and so far I haven't encountered a use case where the package didn't provide what I wanted, apart from the direct structure matrix, which I've had to implement myself in orix and diffsims, but plan to make a PR to diffpy.structure to include there.
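For readers unfamiliar with the direct structure matrix mentioned here, the sketch below builds one from the six lattice parameters and derives the reciprocal matrix from it. It uses one common convention (a along x, b in the x-y plane), which is an assumption; the alignment used in orix and diffsims may differ, and the function names are hypothetical.

```python
import numpy as np


def direct_structure_matrix(a, b, c, alpha, beta, gamma):
    """Matrix whose columns are the lattice vectors a, b, c in a Cartesian frame.

    Hypothetical helper. Angles in degrees. Convention assumed here:
    a along x, b in the x-y plane.
    """
    al, be, ga = np.radians([alpha, beta, gamma])
    cx = c * np.cos(be)
    cy = c * (np.cos(al) - np.cos(be) * np.cos(ga)) / np.sin(ga)
    cz = np.sqrt(c**2 - cx**2 - cy**2)
    return np.array([
        [a, b * np.cos(ga), cx],
        [0.0, b * np.sin(ga), cy],
        [0.0, 0.0, cz],
    ])


def reciprocal_structure_matrix(direct):
    """Columns are a*, b*, c* with a_i . a*_j = delta_ij (no factor of 2*pi)."""
    return np.linalg.inv(direct).T


# Hexagonal example with made-up lattice parameters (nm, degrees)
A = direct_structure_matrix(0.3, 0.3, 0.5, 90, 90, 120)
B = reciprocal_structure_matrix(A)
metric_tensor = A.T @ A  # entries are the dot products a_i . a_j
```

A reciprocal lattice vector for indices (h, k, l) is then `B @ (h, k, l)`, which is the building block for the reciprocal space grids discussed earlier in the thread.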

What does ASE offer that you need that diffpy.structure doesn't, @CSSFrancis? I agree that the package isn't optimal, but it is lightweight, and I find their documentation and code easy to navigate.

pc494 commented 3 years ago

Atom packages are a nuisance. We started using pymatgen and had to drop it in favour of diffpy. I would be cautious about rewriting too much as it could quickly spiral into a huge project

CSSFrancis commented 3 years ago

What does ASE offer that you need that diffpy.structure doesn't, @CSSFrancis? I agree that the package isn't optimal, but it is lightweight, and I find their documentation and code easy to navigate.

Sorry, I was maybe projecting some of my own frustrations onto diffpy. It works very well for what we are doing: defining crystal structures, reading CIF files, getting the reciprocal space lattice points etc. For building and modeling more complicated structures (in my case building glass structures) it doesn't always work nicely.

Atom packages are a nuisance. We started using pymatgen and had to drop it in favour of diffpy. I would be cautious about rewriting too much as it could quickly spiral into a huge project

I would definitely agree with @pc494 though! I think that there are definitely some better things to spend time on rather than making a third switch :) but it was just something to think about. My only hesitation with diffpy is that it doesn't seem like it is widely used/developed, which might cause issues down the road. On the other hand, ASE seems like it has a very active and healthy group of developers.

From my perspective I think this goes back to the question of: Do we want to support CBED pattern simulations/4D-STEM, or are we just focused on spot patterns? If we want to support 4D-STEM pattern simulations, then using ASE alongside diffpy is probably the best solution.

pc494 commented 3 years ago

From my perspective I think this goes back to the question of: Do we want to support CBED pattern simulations/4D-STEM, or are we just focused on spot patterns? If we want to support 4D-STEM pattern simulations, then using ASE alongside diffpy is probably the best solution.

The simple answer is that it's a question of developer time. I have found that dynamical simulations are fiddly enough that if I want them I will do them myself (i.e. install the software etc. myself), and I fear that maintaining an "easy to use" version would be a lot for an already thin developer base. That said, ASE seems fairly lightweight (in terms of dependencies), so if you are interested in adding functionality that depends on it I'm not going to complain.

din14970 commented 2 years ago

Recently a big company in the TEM instrumentation industry commissioned me to create an improved version of my alphabeta software that also includes kinematical diffraction pattern simulations. To avoid GPL conflicts I implemented everything from scratch, using primarily numpy-quaternion for dealing with orientations/rotations and ASE for atomic model integration. The project gave me some interesting ideas that I think could be useful to diffsims, which I would implement when I find time:

hakonanes commented 2 years ago

Recently a big company in the TEM instrumentation industry commissioned me to create an improved version of my alphabeta software that also includes kinematical diffraction pattern simulations.

Cool! Congratulations.

include the concept of a physical planar detector that can be tilted/rotated from the absolute x-y plane and is fixed at some working distance from the crystal (=|K0| for closest matching calibration). If diffraction simulations also store K (= K0 + G) beams, projection of patterns onto the detector can be made more physically accurate. Tilting the beam will then also shift the pattern on the detector.

This is exactly what we have in kikuchipy.detectors.EBSDDetector for projecting parts of a square Lambert projection of the Kikuchi sphere, simulated with for example EMsoft, onto a 2D EBSD/TKD detector. The projection we have implemented is from the Lambert projection to the gnomonic projection based on the one in EMsoft. All possible flat detector shapes (rows, columns), and position relative to the sample should be supported (see figure in docs).

I don't know the simplest way to generalize the EBSDDetector to work with simulations for TEM in diffsims, but could give pointers to anyone who would want to try.
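The planar detector idea quoted above can be sketched as a ray-plane intersection: each diffracted beam K = K0 + G is followed from the crystal until it hits a flat detector at a given distance. This is a minimal illustration under assumptions (beam travelling roughly along +z, detector described by its unit normal and an in-plane x axis), not the kikuchipy EBSDDetector and not din14970's implementation; all names are hypothetical.

```python
import numpy as np


def project_onto_detector(k0, g_vectors, distance, normal=(0, 0, 1), x_axis=(1, 0, 0)):
    """Detector-plane coordinates where the beams K = K0 + G hit a flat detector.

    Hypothetical helper. The detector plane is {r : n . r = distance} with
    the crystal at the origin. Returned coordinates have the units of
    `distance` and are measured from the foot of the detector normal.
    Beams that never reach the detector give NaN.
    """
    k0 = np.asarray(k0, dtype=float)
    g = np.atleast_2d(np.asarray(g_vectors, dtype=float))
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    # Build an orthonormal in-plane basis (ex, ey) for the detector
    ex = np.asarray(x_axis, dtype=float)
    ex = ex - (ex @ n) * n
    ex = ex / np.linalg.norm(ex)
    ey = np.cross(n, ex)

    k = k0 + g                       # diffracted beam directions
    k_dot_n = k @ n
    hits = k_dot_n > 0
    t = np.full(len(k), np.nan)
    t[hits] = distance / k_dot_n[hits]
    points = k * t[:, None]          # intersection of each ray with the plane
    relative = points - distance * n
    return np.column_stack([relative @ ex, relative @ ey])


g_vectors = [(0, 0, 0), (5, 0, 0)]   # made-up reciprocal vectors, nm^-1
untilted = project_onto_detector((0, 0, 50), g_vectors, distance=100.0)
tilted = project_onto_detector((1, 0, 50), g_vectors, distance=100.0)
```

Comparing `untilted` and `tilted` shows the behaviour described in the quote: tilting the incident beam shifts the whole pattern on the detector, here by 2 units along x.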

hakonanes commented 2 years ago

ASE for dealing with atomic models - it seems to be the de-facto standard and improves interoperability with other packages

This point is discussed in https://github.com/pyxem/orix/issues/270.

For the uninitiated: diffsims uses diffpy.structure.Structure for electron diffraction simulations where ASE could be a replacement. kikuchipy uses Structure indirectly via diffsims.crystallography.ReciprocalLatticePoint via orix.crystal_map.Phase (which stores point group and space group in addition to Structure) for geometrical (Kikuchi band position) EBSD simulations. In the (long distant?) future, I think diffsims should use Phase from orix as well.