Making `unique_shifts` more robust

lan496 commented 2 years ago

@lmhale99 This is ported PR from https://github.com/lmhale99/atomman/pull/28 because I do not have permission to add commits to it. After I have tested current unique_shifts with fcc, bcc, and hcp prototypes in [1], I have noticed the below line in FreeSurface.__init__ drops the precision of atomic coordinates and it causes the detection failures with a default tolerance atol. https://github.com/lmhale99/atomman/blob/3bf08cefb423908f40cac343240dd90c1fb924c9/atomman/defect/FreeSurface.py#L102

So, this PR fixes to keep the precision of atomic coordinates and I have checked outputs with [1]. Now, unique_shifts correctly detects shifts for fcc, bcc, and hcp.

Also, spglib.standardize_cell automatically rotates a given cell, so I think we cannot use it here. For now, I go back to use dataset['primitive_lattice']... Maybe we can use dataset[’std_rotation_matrix’] to rotate standardized cell, instead?

I have question about the surface structures in [1]. Are they tabulated in some textbooks or articles? They are amazing because I cannot find any curated list of distinct surface structures for famous prototype structures.

Though, I guess some surface structures are missing in [1]. For example, unique_shifts returns two shifts for (322) surface of A4 structure, but there is only one structure in [1]. The below figures are the obtained surface structures: Because the z-axis coordinates of two red atoms in the left is different from others, I think there are two nonequivalent surfaces in this case.

Code for reproducing the above example

```python import atomman as am from atomman.defect import FreeSurface if __name__ == '__main__': prototype_id = 'A4--C--dc' a = 4 vacuumwidth = 2 * a hkl = [3, 2, 2] ucell = am.load("prototype", id=prototype_id, a=a) hkl_str = "".join(map(str, hkl)) surface = FreeSurface(hkl, ucell) unique_shifts = surface.unique_shifts(trial_image_range=1) print(hkl, len(surface.shifts), len(unique_shifts)) for i, shift in enumerate(unique_shifts): srf = surface.surface(shift, vacuumwidth=vacuumwidth) with open(f"{prototype_id}--{hkl_str}--{i}.poscar", 'w') as f: f.write(srf.dump('poscar')) ```

[1] https://github.com/lmhale99/potentials-library/tree/master/free_surface

lan496 commented 2 years ago

In some situations, two unique shifts appear in the same slab, which is periodic along one axis. For example, there are two (0001) planes for A3'(alpha-La): In the left figure, the upper surface is occupied by La(2a). In the right figure, the upper surface is occupied by La(2c). Thus, if we ignore the bottom terminations in the slab models, they are distinct. But, if we treat c-axis as periodic, these two structures are equivalent up to a mirror operation not belonging to the space group of A3'.

How should unique_shifts handle such a case? I am considering two options for now:

returning both shifts
returning only one of them and also retuning number of unique surfaces in the corresponding slab

Code to reproduce the figures

```python import atomman as am from atomman.defect import FreeSurface from pymatgen.analysis.structure_matcher import StructureMatcher def get_alpha_La(): prototype_id = "A3'--alpha-La--double-hcp" symbols = 'La' a = 3.77 c = 12.13 ucell = am.load("prototype", id=prototype_id, symbols=symbols, a=a, c=c) return ucell, prototype_id if __name__ == '__main__': ucell, prototype_id = get_alpha_La() hkl = [0, 0, 0, 1] print(prototype_id) hkl_str = "".join(map(str, hkl)) surface = FreeSurface(hkl, ucell) unique_shifts = surface.unique_shifts() print(hkl, len(surface.shifts), len(unique_shifts)) structures = [] for i, shift in enumerate(unique_shifts): srf = surface.surface(shift, vacuumwidth=10) structures.append(srf.dump('pymatgen_Structure')) with open(f"{prototype_id}--{hkl_str}--{i}.poscar", 'w') as f: f.write(srf.dump('poscar')) stm = StructureMatcher() if len(stm.group_structures(structures)) != len(structures): print("[Failed] Something went wrong!") else: print("[Accepted] Detecting unique structures!") ```

lmhale99 commented 2 years ago

@lan496,

Sorry about taking so long to get back to this. So much to do. But the good news is the newest version of atomman is out.

I updated my conda environment altogether, which let me finally get a recent spglib version with 'primitive_lattice' being included in the dataset. This, along with the rounding change you made is giving me more reasonable unique shift counts.

The free surface database was one that I created manually based on looking at atomic spacings and visually looking at the structures. I would expect that discrepancies in the manual database likely lean more towards me missing a surface rather than having extras. The fact that the unique_shifts algorithm is returning slightly higher values is what I would expect. I vaguely remember that I might have noticed the diamond cubic difference, but maybe I threw one out because they were so close? Or I was just getting lazy at that point.

I made a few minor changes:

I renamed rotated_lattice to rotated_vects just for consistency.
I added a try/except around dataset['primitive_lattice'], which uses rotated_vects if it fails. This should make it compatible with spglib < 1.15 as I was seeing the same results using the rotated cell rather than the primitive.
I deleted my bad attempt...

Based on what is done here, it is probably okay for there to be some remaining duplicate structures that are beyond what the symmetry operations can find like in the case of alpha-La. I guess the question is can these be identified easily using the current method? If so, maybe use a bool parameter to turn the extra checks on/off? It seems like they are not technically the same structure but would return the same energy?

lan496 commented 2 years ago

I have added test cases for A1, A2, A3, A4, A5, A6, A7, and Ah prototypes. Without your manual database, I would not notice these rounding changes. Thanks!

I guess the question is can these be identified easily using the current method? If so, maybe use a bool parameter to turn the extra checks on/off? It seems like they are not technically the same structure but would return the same energy?

Yes, as slab models, these structures will return the same energy. I think it is difficult for the current method to identify these structures up to slab models because it depends on not only the symmetry of the original prototype but also how to create slab models in practice. This situation occurs at A15 (beta-W) and A3' (alpha-La) prototypes. In practice, if we want to remove the same-energy structures, we still need to call pymatgen's StructureMatcher after calling FreeSurface.surface for all shifts by FreeSurface.unique_shifts...(though unique_shifts reduces the number of shifts to be checked)

lmhale99 commented 2 years ago

@lan496,

Looks good. I made some more minor tweaks.

I reversed the search order so that the low index shifts are in the unique_shifts rather than the high index shifts.
A return_indices option is now included that also returns the indices of shifts that corresponds to unique_shifts.
My linter was complaining about the "any" function, so I broke it out. This could be slightly faster as it limits the checks performed for that step.

Changing the order and returning indices makes it easier to fix and extend the free surface database as I started with the low indices first for that.

Honestly, the structure matcher thing is beyond what this algorithm is doing so I'm not that concerned. When I add the method to the documentation I'll be sure to mention that the shifts may still produce some energetically equivalent free surface configurations.

Can you check that reversing the operations didn't accidentally break something? I'm getting the same numbers of unique shifts per structure, but you were looking at the structures themselves. If its good, I'll merge

lan496 commented 2 years ago

Looks good to me! I have also confirmed the current unique_shifts passes all test cases.

lmhale99 / atomman

Making `unique_shifts` more robust #29