How to ensure consistent indices when reading from POSCAR?

pyiron / pyiron_atomistics

pyiron_atomistics - an integrated development environment (IDE) for atomistic simulation in computational materials science.

https://pyiron-atomistics.readthedocs.io

BSD 3-Clause "New" or "Revised" License

44 stars 15 forks source link

How to ensure consistent indices when reading from POSCAR? #335

Closed pmrv closed 3 years ago

pmrv commented 3 years ago

I'm reading a number of POSCAR files to run some lammps on it. The issue is that even though the elements are always defined in the same order in the POSCAR, the association index <-> element symbols is seemingly random, when I read the files with either pr.create.structure.ase.read or pyiron_atomistics.vasp.structure.read_atoms.

If this is coming from ASE then I suppose there's not a lot we can do about it, but is there a way to change what indices are associated with what symbol after an Atoms instance is created? I've tried setting atoms.indices, but this changes the the symbol of that changed atom accordingly (as it should).

liamhuber commented 3 years ago

I think I ran into this and hacked my way around it: I had a binary in which I knew what species I should have more of and did something like

Load structure
Find ids matching lesser indices with np.unique returning counts
Map the entire structure to the major species
Slice the structure with the previously logged ids and map these to the minor species

Definitely not a real solution, but a fast workaround under the right conditions.

sudarsan-surendralal commented 3 years ago

@pmrv can you add a small example POSCAR file that you'd like to parse over here?

pmrv commented 3 years ago

POSCAR file written by Ovito Basic 3.0.0
   1.00000000000000     
    12.4524250200999997    0.0000000000000000    0.0000157408058710
    -3.1131062549999999    5.3920451819000004   -0.0000039352014677
     0.0000530996979940    0.0000000000000000   40.4697188976000035
   Ca   Mg   Al
    1    1    1
Cartesian
6.2262054545 8.8300000128e-06 1.9039669849
0.3113441054 0.5391959194 18.3309003343
3.1131124744 5.3920363513 12.0213927741

If you then run

s = pyiron_atomistics.vasp.structure.read_atoms(f'POSCAR')
print(nx, ny, set(list(zip(s.indices, s.get_chemical_symbols()))))

you'll find that it changes every time you run it.

EDIT: added full path

pmrv commented 3 years ago

I've found this snippet in some of @jan-janssen old code for the Mlip classes

original_dict = {el: ind for ind, el in enumerate(sorted(ham['input/structure/species']))}
species_dict = {ind: original_dict[el] for ind, el in enumerate(ham['input/structure/species'])}
...
new_indices =  [species_dict[el] for el in ham['input/structure/indices']]

That basically reorders the indices to follow the lexical sorting of the chemical symbols. Maybe we should do that generally on initializing a Atoms object?

EDIT: This is slightly shorter and easier to read[citation needed]

index_map = np.argsort(np.argsort(ham['input/structure/species']))
new_indices =  index_map[ham['input/structure/indices']]