google-deepmind / materials_discovery

Apache License 2.0

Two CIFs cannot be converted to pymatgen objects #12

Open esoteric-ephemera opened 6 months ago

esoteric-ephemera commented 6 months ago

A very minor issue, but two CIFs (IDs 0e2d8f26d6 and cdc06a1a2a) cannot be converted to pymatgen Structure objects as of pymatgen==2023.12.18.

They can still be parsed with pymatgen.io.cif.CifParser, but calling get_structures on the CifParser object, or calling pymatgen.core.Structure.from_file directly on these CIFs, raises ValueError: Invalid CIF file with no structures!
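For anyone who wants to check a local copy of the dataset for the same failure, here is a minimal sketch. It assumes only that pymatgen.core.Structure.from_file raises ValueError for the affected files, as described above. The helper names (load_with_pymatgen, find_unconvertible) and the directory in the usage note are hypothetical, not part of this repository.

```python
from pathlib import Path


def load_with_pymatgen(path):
    """Load one CIF as a pymatgen Structure (assumes pymatgen is installed)."""
    # Imported lazily so find_unconvertible can be used with another loader.
    from pymatgen.core import Structure

    return Structure.from_file(str(path))


def find_unconvertible(cif_paths, loader=load_with_pymatgen):
    """Return {path: error message} for every file the loader rejects.

    Only ValueError is caught, since that is the exception reported in
    this issue; any other exception propagates to the caller.
    """
    failures = {}
    for path in cif_paths:
        try:
            loader(path)
        except ValueError as exc:
            failures[str(path)] = str(exc)
    return failures
```

Usage would look something like find_unconvertible(Path("by_id").glob("*.CIF")), where "by_id" stands in for wherever the CIFs were extracted.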

ml-evs commented 5 months ago

I also ran into these: they correspond to structures that have overlapping atoms in the CIF (and thus incorrectly presented formulae) and pathological energies. See, e.g., https://optimade-gnome.odbx.science/v1/structures?filter=id=%22data/gnome_data/by_id.zip/data/gnome_data/by_id/0e2d8f26d6.CIF%22

esoteric-ephemera commented 5 months ago

Thanks @ml-evs! Any chance you've noticed other CIFs with overlapping atomic sites?

ml-evs commented 5 months ago

Not off-hand, but if you filter for things with unrealistic formation energies (like lower than -10 eV/atom) you'll probably find some: https://optimade-gnome.odbx.science/v1/structures?filter=_gnome_formation_energy_per_atom<-8
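The same filter can be built programmatically. This is a rough sketch, not a tested client. The base URL and the _gnome_formation_energy_per_atom field come from the link above; the function names are hypothetical. The top-level "data" list with an "id" per entry is assumed from the general OPTIMADE response layout, and pagination is ignored.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://optimade-gnome.odbx.science/v1/structures"


def build_energy_filter_url(threshold_ev_per_atom, base_url=BASE_URL):
    """Build a query URL for entries below a formation-energy threshold."""
    filter_expr = f"_gnome_formation_energy_per_atom<{threshold_ev_per_atom}"
    # urlencode percent-encodes the "<" so the URL is safe to pass around.
    return f"{base_url}?{urlencode({'filter': filter_expr})}"


def fetch_suspect_ids(threshold_ev_per_atom):
    """Return the IDs on the first page of results (needs network access)."""
    url = build_energy_filter_url(threshold_ev_per_atom)
    with urlopen(url) as response:
        payload = json.load(response)
    # Assumes the standard OPTIMADE layout: a "data" list of entries with "id".
    return [entry["id"] for entry in payload.get("data", [])]
```

Calling fetch_suspect_ids(-8) would mirror the link above; the threshold itself is a judgment call.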