materialsproject / pymatgen

Python Materials Genomics (pymatgen) is a robust materials analysis code that defines classes for structures and molecules with support for many electronic structure codes. It powers the Materials Project.
https://pymatgen.org
Other
1.48k stars 850 forks source link

major issue in get_primitive_standard_structure() #321

Closed computron closed 8 years ago

computron commented 8 years ago

System

The structure changes very drastically upon calling "get_primitive_standard_structure".

Example code

from pymatgen.analysis.structure_matcher import StructureMatcher
from pymatgen.io.vasp import Poscar
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer

old_structure = Poscar.from_file("CONTCAR.gz").structure
sym_finder = SpacegroupAnalyzer(old_structure)
new_structure = sym_finder.get_primitive_standard_structure()

# easy sanity check
print("Is the volume per atom the same in old and new structures?")
print(old_structure.volume/old_structure.num_sites, new_structure.volume/new_structure.num_sites)
print("No! Clearly these are very different structures...")
print("")
print("Does the structurematcher think these structures are the same?")
sm = StructureMatcher()
print(sm.fit(old_structure, new_structure))

Error message

This is what is printed out when running the above code:

Is the volume per atom the same in old and new structures?
(10.64341518083199, 4.561477853966009)
No! Clearly these are very different structures...

Does the structurematcher think these structures are the same?
False

Suggested solution (if any)

I didn't debug the underlying code, but this is probably the 2nd or 3rd time there has been a problem in this method. I would strongly recommend adding an assert statement at the end of the code that asserts that volume per atom is the same in the original and final structures. This should not affect performance and will help us find errors. In this case, I thought the error was stemming from a new workflow code but it looks like the culprit is probably this structure change. It would have been much easier debug if I had seen an assertion failure at this step.

Files (if any)

CONTCAR.gz

computron commented 8 years ago

@hautierg can you look into this? We are using this code for tens of thousands of band structures, and a structure change would obviously be disastrous. In the case I posted, the new primitive standard structure even has positive energy even though the initial structure is correct.

computron commented 8 years ago

Also, the error appears to be somewhere between line 454-470 of analyzer.py.

i.e., starting with conventational_standard_structure() and refined_structure() is correct. But modifying the refined_structure() has an error somewhere.

computron commented 8 years ago

A further hint -

Probably the transformation needs to take into account the 2nd lattice vector length

hautierg commented 8 years ago

OK. I have looked more into this.

I think the problem comes from the refined structure for a rhombohedral cell being returned as hexagonal settings. In fact, while for cubic systems the conventional cell is clearly and uniquely defined, the rhombohedral systems can have both a rhombohedral or hexagonal representation. As far as I understood the IUCr did not choose one (see table of crystallography p. 14-15 http://ahrenkiel.sdsmt.edu/courses/Spring2016/NANO704/International_Tables_For_Crystallography_A.pdf http://ahrenkiel.sdsmt.edu/courses/Spring2016/NANO704/International_Tables_For_Crystallography_A.pdf)

In itself, this is fine. It is just a definition from spglib. BUT the convention we took (for band structures) from Setyawan assumes a rhombohedral (and not hexagonal) representation. The mess comes from the code expecting a rhombohedral cell and getting an hexagonal one.

If it had been like this directly when we coded this, we would have seen it. In fact, we have some rhombohedral runs in the MP data that look perfectly fine (meaning their primitive standard, static run is exactly what we expect). So, my guess is that the convention to give the hexagonal cell changed with time in spglib. Although I cannot formally prove it this seems the most likely explanation at the moment.

What I suggest to do:

-keep the conventional to be the hexagonal in our crystal symmetry analysis package as in spglib -change the code generating primitive standard cell to take this into account and to solve the mismatch we have now. This should be pretty easy. -add some checks as you suggested on volume for instance

Follow-up on how this affect the MP.

Best,

Geoffroy

On 08 Apr 2016, at 19:45, Anubhav Jain notifications@github.com wrote:

A further hint -

The code starts with the refined structure (correct) and makes further changes. The changes attempt to modify the refined lattice matrix into a new setting. But, this modification code seems to only use a single lattice vector length of the old setting in the calculation of the new matrix. Thus, the new lattice only has a single lattice vector length. But the rhombohedral cell has 2 unique lattice vector lengths. It looks like this is not preserved... Probably the transformation needs to take into account the 2nd lattice vector length

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub https://github.com/materialsproject/pymatgen/issues/321#issuecomment-207533707

wenhaosun commented 8 years ago

Hi all,

I'm not sure if this is related, but I have witnessed a very strange error on the MaterialsProject when doing the primitive unit cell function on some rhombohedral unit cells. Consider the following:

https://materialsproject.org/materials/mp-674478/

The conventional standard shows a rhombohedral R-3c structure. But if you click on Primitive, it returns a cubic structure, with completely messed up internal coordinates.

I showed this to Shyue Ping a while back and he tried to reproduce in Pymatgen, but got the correct primitive cell. I didn't pursue the issue, but perhaps it is related to the bug discussed here.

Best, Wenhao Sun

On Sun, Apr 10, 2016 at 2:09 PM, Geoffroy Hautier notifications@github.com wrote:

OK. I have looked more into this.

I think the problem comes from the refined structure for a rhombohedral cell being returned as hexagonal settings. In fact, while for cubic systems the conventional cell is clearly and uniquely defined, the rhombohedral systems can have both a rhombohedral or hexagonal representation. As far as I understood the IUCr did not choose one (see table of crystallography p. 14-15 http://ahrenkiel.sdsmt.edu/courses/Spring2016/NANO704/International_Tables_For_Crystallography_A.pdf < http://ahrenkiel.sdsmt.edu/courses/Spring2016/NANO704/International_Tables_For_Crystallography_A.pdf

)

In itself, this is fine. It is just a definition from spglib. BUT the convention we took (for band structures) from Setyawan assumes a rhombohedral (and not hexagonal) representation. The mess comes from the code expecting a rhombohedral cell and getting an hexagonal one.

If it had been like this directly when we coded this, we would have seen it. In fact, we have some rhombohedral runs in the MP data that look perfectly fine (meaning their primitive standard, static run is exactly what we expect). So, my guess is that the convention to give the hexagonal cell changed with time in spglib. Although I cannot formally prove it this seems the most likely explanation at the moment.

What I suggest to do:

-keep the conventional to be the hexagonal in our crystal symmetry analysis package as in spglib -change the code generating primitive standard cell to take this into account and to solve the mismatch we have now. This should be pretty easy. -add some checks as you suggested on volume for instance

Follow-up on how this affect the MP.

Best,

Geoffroy

On 08 Apr 2016, at 19:45, Anubhav Jain notifications@github.com wrote:

A further hint -

The code starts with the refined structure (correct) and makes further changes. The changes attempt to modify the refined lattice matrix into a new setting. But, this modification code seems to only use a single lattice vector length of the old setting in the calculation of the new matrix. Thus, the new lattice only has a single lattice vector length. But the rhombohedral cell has 2 unique lattice vector lengths. It looks like this is not preserved... Probably the transformation needs to take into account the 2nd lattice vector length

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub < https://github.com/materialsproject/pymatgen/issues/321#issuecomment-207533707

— You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub https://github.com/materialsproject/pymatgen/issues/321#issuecomment-208071686

Ceder Research Group http://web.mit.edu/ceder/ | Massachusetts Institute of Technology Department of Materials Science and Engineering 77 Massachusetts Ave | Room 13-5065 | Cambridge, MA, 02139

shyuep commented 8 years ago

I would recommend a more robust solution that assuming either rhom or hex setting. In general, the code should validate what latttice it is getting, before standardizing it. If it is in rhom, it applies a rhom conversion. If it is a hex, it applies a hex conversion. Validating isn't difficult since it is pretty clear from the lattice lengths and angles.

In fact, the rest of the cells should have a similar validation.

hpleva commented 8 years ago

For me version 3.0.13 works as expected. The bug is introduced in the next version 3.1.0, which is where spglib was updated to version 1.7.3.