pyxem / orix

Analysing crystal orientations and symmetry in Python
https://orix.readthedocs.io
GNU General Public License v3.0
78 stars 45 forks source link

Replace diffpy.structure with ASE (Atomic Simulation Environment)? #270

Open hakonanes opened 2 years ago

hakonanes commented 2 years ago

On a tangentially related note now that I see it again, it would be handy to switch to using ase instead of diffpy in my opinion.

Originally posted by @din14970 in https://github.com/pyxem/pyxem/issues/794#issuecomment-1017700211

pc494 commented 2 years ago

At the risk of putting my foot in it, why?

update: Ah, the various opening and closing got me confused (thought this was in diffsims), I'm more receptive to such a refactor in orix than I would be in diffsims.

hakonanes commented 2 years ago

Thanks for mentioning this, @din14970. If you have the time, could you show how to create this Al6Mn structure with ASE, which I'm here doing with diffpy.structure:

>>> from diffpy.structure import Atom, Lattice, Structure
>>> lattice = Lattice(0.75551, 0.64994, 0.88724, 90, 90, 90)
>>> debye_waller_factor = 0.005
>>> atom_list = [
...     Atom("Al", (0.32602, 0, 0), occupancy=1, Uisoequiv=debye_waller_factor),
...     Atom("Al", (0, 0.13917, 0.10039), occupancy=1, Uisoequiv=debye_waller_factor),
...     Atom("Al", (0.31768, 0.28622, 0.25), occupancy=1, Uisoequiv=debye_waller_factor),
...     Atom("Mn", (0, 0.45686, 0.25), occupancy=1, Uisoequiv=debye_waller_factor),
... ]
>>> structure = Structure(atoms=atom_list, lattice=lattice)
>>> print(structure)
lattice=Lattice(a=0.75551, b=0.64994, c=0.88724, alpha=90, beta=90, gamma=90)
Al   0.326020 0.000000 0.000000 1.0000
Al   0.000000 0.139170 0.100390 1.0000
Al   0.317680 0.286220 0.250000 1.0000
Mn   0.000000 0.456860 0.250000 1.0000

orix uses this structure attached to the orix.crystal_map.Phase.structure attribute. The atoms aren't used in orix at the moment, just the structure.lattice. Other uses of Phase.structure that I'm aware of (wrote myself):

If ASE can accomodate these things while adding something substantial, like good visualization, I'd be interested in looking to switch.

pyxem initially used pymatgen, but switched to diffpy.structure because the former was deemed too dependency heavy. Looking at ASE it seems like it is lightweight, with options to use external calculators for other stuff? I assume the calculators have to be installed separately?

hakonanes commented 2 years ago

Ah, the various opening and closing got me confused (thought this was in diffsims)

Sorry for that!

I'm more receptive to such a refactor in orix than I would be in diffsims.

I'd be as well if there isn't too much ~work~ stuff that needs rewriting downstream (diffsims, kikuchipy, ?) per my above comment.

din14970 commented 2 years ago

could you show how to create this Al6Mn structure with ASE

There is the spacegroup module in ASE for this.

from ase import spacegroup

crystal = spacegroup.crystal(
    symbols=["Al", "Al", "Al", "Mn"],
    basis=[(0.32602, 0, 0), (0, 0.13917, 0.10039), (0.31768, 0.28622, 0.25), (0, 0.45686, 0.25)],
    occupancies=[1, 1, 1, 1],
    spacegroup=63,
    cellpar=[7.5551, 6.4994, 8.8724, 90, 90, 90],
)

Importing of atomic structures from all kinds of formats like CIF is also easy with ASE.

The only information that is not included in the ASE representation are the debye waller factors. There is some support for X-ray diffraction simulation, which includes a model for DWF see https://wiki.fysik.dtu.dk/ase/ase/xrdebye.html?highlight=debye%20waller, but it seems one doesn't have much control over this. One could make the argument though that DWF are not really inherent to the crystal structure, as they depend on external factors like temperature.

hakonanes commented 2 years ago

Thanks, initialization is simpler with ASE than with diffpy.structure, it seems.

I couldn't immediately find in the docs or via inspecting the crystal instance how to compute reciprocal/direct lattice vector lengths and similar operations, like with diffpy.structure.Lattice.rnorm((h, k, l)) etc. Could you give me a hint? These convenience methods are exactly the functionality I want, personally.

din14970 commented 2 years ago

Hmmm, seems indeed not as native to do such things with ASE. I think the main difference is that ASE started from the principle of arbitrary atomic models, not necessarily from crystallography. But what concerns me a bit is that diffpy does not seem like a very active project, whereas ASE is. One way to do it is like this:

import numpy as np
rcell = crystal.cell.reciprocal()
vecs = rcell.cartesian_positions([(1, 1, 2), (2, 3, 4), (1, 0, 0), (1, 1, 0)])
np.linalg.norm(vecs, axis=1)

For real space distances skip the reciprocal().

There is the "lengths" function but this only returns the lenghts of the basis vectors. Perhaps a PR can be made to alleviate the necessity for this extra step.

hakonanes commented 2 years ago

Hmmm, seems indeed not as native to do such things with ASE. I think the main difference is that ASE started from the principle of arbitrary atomic models, not necessarily from crystallography.

Thank you for clarifying. Seems like diffpy.structure offers me what I need in a simpler way than ASE at the moment. Based on this I won't pursue replacing diffpy.structure myself. I won't stand in the way of anyone who wants to, though, as long as current functionality is kept.

But what concerns me a bit is that diffpy does not seem like a very active project, whereas ASE is.

Yes, it seems like diffpy.structure does what users wants it to do. However, the developers respond fast to issues (example) and PRs (example).

Note that the code in the referenced PR isn't part of a release even though it was merged 1.5 years ago. I made the PR because I thought I would use symmetry operations from the space group from diffpy.structure more, but it turned out I only needed operations from orix' point groups. If I implemented something in diffpy.structure and really wanted it released, I'm sure they would make a release for it without too much delay.

harripj commented 2 years ago

Just wanted to add a comment here as I have also been trying both libraries.

Importing of atomic structures from all kinds of formats like CIF is also easy with ASE.

Completely agree, and ASE is useful for creating atomic models etc., but the CIF reading capabilities are great. Although one or two CIF files I have did not open with ASE but did open in diffpy.structure.

I was using the loadStructure command from diffpy and it has problems working with pathlib.Path in my experience as it is doing some strange path manipulation under the hood. This becomes even worse on Windows and the files can be very finicky to open.

That said the lattice basis of the two libraries are different when opening the CIF files. For cubic or even tetragonal cells this isn't a problem (and so I suspect orthorhomic too), but for hexagonal and triclinc cells there is a difference between the two libraries. For example the hexagonal Mg unit cell:

>>> from diffpy.structure import loadStructure
>>> from ase import io as aseio
>>> from pathlib import Path
>>> import numpy as np

>>> cif = Path.cwd().joinpath(r'Mg.cif')
>>> cif.exists()

>>> # need to do some path manip here to get it to open...
>>> struct = loadStructure(str(cif.relative_to(Path.cwd())))
>>> atoms = aseio.read(cif)

>>> # relative atomic positions within the respective cells are fine
>>> np.allclose(atoms.get_scaled_positions(), struct.xyz)
True

>>> atoms.cell.array
array([[ 3.20302773,  0.        ,  0.        ],
       [-1.60151386,  2.77390338,  0.        ],
       [ 0.        ,  0.        ,  5.126691  ]])

>>> struct.lattice.base
array([[ 2.77390338, -1.60151387,  0.        ],
       [ 0.        ,  3.20302773,  0.        ],
       [ 0.        ,  0.        ,  5.126691  ]])

For the triclinic ReS2:

>>> cif = Path.cwd().joinpath(r'ReS2.cif')

>>> # need to do some path manip here to get it to open...
>>> struct = loadStructure(str(cif.relative_to(Path.cwd())))
>>> atoms = aseio.read(cif)

>>> # relative atomic positions within the respective cells are fine
>>> np.allclose(atoms.get_scaled_positions(), struct.xyz)
True

>>> atoms.cell.array
array([[ 6.4191  ,  0.      ,  0.      ],
       [-3.146162,  5.71419 ,  0.      ],
       [-1.701185, -1.186383,  6.732439]])

>>> struct.lattice.base
array([[ 5.37649483, -3.14567548, -1.55012062],
       [ 0.        ,  6.51991329, -0.20256673],
       [ 0.        ,  0.        ,  7.04466251]])

I think it becomes clear that ASE is forcing a parallel x-vector whereas diffpy is forcing a parallel z-vector. Something to be aware of. It would be nice to be able to use either library but they must be made to be consistent first, eg. template generation.

hakonanes commented 2 years ago

I think it becomes clear that ASE is forcing a parallel x-vector whereas diffpy is forcing a parallel z-vector.

This ties in with our discussion on crystal axes alignment in #249, which we should keep in mind.

It would be nice to be able to use either library

Perhaps to get a orix.crystal_map.Phase instance (like Phase.from_ase_atoms() or Phase.from_diffpy_structure()), but not beyond that, I believe. We use diffpy.structure for lattice operations in orix at the moment, and I see this discussion as whether or not ASE should replace diffpy.structure.

harripj commented 2 years ago

Perhaps to get a orix.crystal_map.Phase instance (like Phase.from_ase_atoms() or Phase.from_diffpy_structure()), but not beyond that, I believe. We use diffpy.structure for lattice operations in orix at the moment, and I see this discussion as whether or not ASE should replace diffpy.structure.

I think I was getting ahead of myself for the reaches of orix, but maybe worth noting for diffsims for example.

hakonanes commented 2 years ago

Right, I think I'm the one who only discusses this in context of orix, as I think @din14970's original comment was in terms of pyxem packages in general.

It's good to have this in mind for diffsims, but I think replacing functionality in diffsims with similar function from orix should be looked at first before potentially using ASE.

harripj commented 2 years ago

After @hakonanes implemented #322 it looks like the realigned crystal axes are consistent with ase. For reference:

from diffpy.structure import loadStructure
from orix.crystal_map import Phase
from ase import io as aseio
import numpy as np
import pandas as pd

cif = ['Ni.cif', 'Fe alpha.cif', # cubic
       'Mg.cif', # hexagonal
       'Ni4W.cif', # tetragonal
       'ReS2.cif', # triclinic
       ]

is_same_diffpy_orix_lattice = []
is_same_ase_orix_lattice = []
is_same_diffpy_orix_positions = []
is_same_ase_orix_positions = []

for c in cif:
    d = loadStructure(c)
    o = Phase(structure=d)
    a = aseio.read(c)

    is_same_diffpy_orix_lattice.append(np.allclose(o.structure.lattice.base, d.lattice.base))
    is_same_ase_orix_lattice.append(np.allclose(o.structure.lattice.base, a.cell.array))

    is_same_diffpy_orix_positions.append(np.allclose(o.structure.xyz_cartn, d.xyz_cartn))
    is_same_ase_orix_positions.append(np.allclose(o.structure.xyz_cartn, a.get_positions()))

pd.DataFrame(data={'cif': [c[:-4].split()[0] for c in cif],
                   'diffpy lattice': is_same_diffpy_orix_lattice,
                   'diffpy positions': is_same_diffpy_orix_positions,
                   'ase lattice': is_same_ase_orix_lattice,
                   'ase positions': is_same_ase_orix_positions})
cif diffpy lattice diffpy positions ase lattice ase positions
Ni True True True True
Fe True True True True
Mg False False True True
Ni4W True True True True
ReS2 False False True True
hakonanes commented 2 years ago

This is good to know, and a note stating this should perhaps be added to the crystal reference frame user guide? ASE specifies their crystal axes alignment in ase.geometry.cellpar_to_cell(). This function signature

ase.geometry.cellpar_to_cell(cellpar, ab_normal=(0, 0, 1), a_direction=None)

and this docstring explanation

The returned cell is orientated such that a and b are normal to ab-normal and a is parallel to the projection of a-direction in the a-b plane.

tells me ASE aligns e1 || a and e3 || c* (given by a x b), which is what orix does.

Note that you could have replaced o = Phase(structure=d) with o = Phase.from_cif(c) (docs).

CSSFrancis commented 1 month ago

Just to maybe open up this disucssion again @jacobjma would this be something that might be useful with relation to adding bloc wave simualtions etc to abtem? If there is a motivation for this we might be able to push for refactoring orix to handle this change.

@hakonanes Do you have any thoughts on this?

hakonanes commented 1 month ago

I would be open to such a refactoring.

But, it has to be done with care! Both orix and diffpy.structure are used alongside one another in diffsims and kikuchipy (and now pyxem?), so changes in orix have to account for this. We would need deprecation warnings, updates in downstream packages, and finally removal.

jacobjma commented 1 month ago

Hi,

I am not working in this space, so I have not had any experience with your library, but your documentation is beautiful (:

The recent implementation of Bloch wave in abTEM is mostly finished. Bloch wave simulations only really requires knowing the Bravais crystal centering. This currently have to be provided by the user, otherwise it is assumed to be primitive.

It would be great to have an automated function to detect the crystal centering.

Do you need a fast Bloch wave simulator?

harripj commented 1 month ago

As @hakonanes mentioned, I think this may be a fairly significant task given the dependencies in other pyxem libraries. Is there an immediate gain to be had in making the switch?

Would a possible option here be to create an interface function or class that coverts Phase to eg. ase.Atoms?

CSSFrancis commented 1 month ago

As @hakonanes mentioned, I think this may be a fairly significant task given the dependencies in other pyxem libraries. Is there an immediate gain to be had in making the switch?

I think the main goal would be better integration into abTEM for simulations and better integration with ASE as maybe a more standard library for strucuture discription. You could think of some workflow where given a phase map you could build a 3D atomic model in ASE, run a few steps to minimize the energy of the system and maybe get a good idea of the energetics at the grain boundaries? This paper I think does something similar https://www.sciencedirect.com/science/article/pii/S1359646216300173#bb0090

Personally, most of the use is related to the Rotations and Orientations classes in orix. I guess those classes don't actually use the phase information, only the symmetry of the underlying crystal. Maybe better testing of integration is better rather than trying to force a large switch.

Some things that would be nice:

  1. Sampling of rotations in ASE is a little difficult and it would be nice to be able to use orix's sampling features with ASE. For example we often rotate atom models to simulate different orientations in glasses.
    • This could be done by simpling adding a rotate_quaternion function in ASE and then we could just pass a Rotation object to rotate the simulation.
  2. Bloch wave simulations in abTEM would allow us to do template matching from dynamical sources.
  3. Kinmatic simulatinos in diffsims could be streamlined in such a way that they are very similar in syntax to abTEM. This would help you "drop in" different types of simulations when doing template matching.
  4. ASE has nice visualization tools.

Would a possible option here be to create an interface function or class that coverts Phase to eg. ase.Atoms?

Yea I think that would be the easiest way to go. I might play around with it and see if there is an easy way to slot this in.