materialsproject / pymatgen

Python Materials Genomics (pymatgen) is a robust materials analysis code that defines classes for structures and molecules with support for many electronic structure codes. It powers the Materials Project.
https://pymatgen.org

Seg fault from SpacegroupAnalyzer on cif file #399

Closed leeaburton closed 8 years ago

leeaburton commented 8 years ago

System

Segmentation fault when using SpacegroupAnalyzer on some CIF files.

Example code

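import glob

from pymatgen.core.structure import Structure
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer

# Try to build a symmetry analyzer for every CIF file in the current directory.
for filename in glob.glob('*.cif'):
    try:
        s = Structure.from_file(filename)
        a = SpacegroupAnalyzer(s)
    except Exception:
        # Python-level errors are skipped; a segfault cannot be caught here.
        pass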

Error message

"python quit unexpectedly" or "Segmentation fault: 11"

Suggested solution (if any)

Previous versions of pymatgen exited with a helpful message or warning, such as 'site occupation > 1', but the current version kills the job mid-run and gives no indication of the cause. Unfortunately, I have not been able to identify the problem.

Files (if any)

Just one of several identified culprit cif files:

data_245140-ICSD
#c2014 by Fachinformationszentrum Karlsruhe, and the U.S. Secretary of 
#Commerce on behalf of the United States.  All rights reserved.
_database_code_ICSD                245140
_audit_creation_date               2007/08/01
_chemical_name_systematic
;
Chromium Selenide Telluride (5.08/2/6)
;
_chemical_formula_structural       'Cr5.08 Se2 Te6'
_chemical_formula_sum              'Cr5.08 Se2 Te6'
_publ_section_title
;
Anion substitution effects on structure and magnetism of the chromium
chalcogenide Cr5 Te8 - Part II: Cluster-glass and spin-glass behavior
in trigonal Cr(1+x) Q2 with basic cells and trigonal Cr(5+x) Q8 with
superstructures (Q = Te, Se; Te:Se = 6:2)
;
loop_
_citation_id
_citation_journal_abbrev
_citation_year
_citation_journal_volume
_citation_journal_issue
_citation_page_first
_citation_page_last
_citation_journal_id_ASTM
primary 'Journal of Solid State Chemistry' 2006 179 7 2067 2078 JSSCBI
_publ_author_name
;
Huang Zhongle;Bensch, W.;Mankovsky, S.;Polesya, S.;Ebert, H.;Kremer,
R.K.
;
_cell_length_a                     3.8226(1)
_cell_length_b                     3.8226(1)
_cell_length_c                     5.9793(2)
_cell_angle_alpha                  90.
_cell_angle_beta                   90.
_cell_angle_gamma                  120.
_cell_volume                       75.67
_cell_formula_units_Z              2
_symmetry_space_group_name_H-M     'P -3 m 1'
_symmetry_Int_Tables_number        164
_refine_ls_R_factor_all            0.0467
loop_
_symmetry_equiv_pos_site_id
_symmetry_equiv_pos_as_xyz
  1     'x-y, -y, -z'
  2     '-x, -x+y, -z'
  3     'y, x, -z'
  4     'x-y, x, -z'
  5     'y, -x+y, -z'
  6     '-x, -y, -z'
  7     '-x+y, y, z'
  8     'x, x-y, z'
  9     '-y, -x, z'
 10     '-x+y, -x, z'
 11     '-y, x-y, z'
 12     'x, y, z'
loop_
_atom_type_symbol
_atom_type_oxidation_number
Cr0+    0
Se0+    0
Te0+    0
loop_
_atom_site_label
_atom_site_type_symbol
_atom_site_symmetry_multiplicity
_atom_site_Wyckoff_symbol
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
_atom_site_attached_hydrogens
_atom_site_B_iso_or_equiv
Cr1 Cr0+ 1 b 0 0 0.5 0.576(12) 0 4.79(5)
Cr2 Cr0+ 6 i 0.5022(6) 0.4978(6) 0.2515(13) 1. 0 4.79(5)
Cr3 Cr0+ 2 c 0 0 0.2480(16) 1. 0 4.79(5)
Cr4 Cr0+ 3 e 0 0.5 0 0.528(12) 0 4.79(5)
Te1 Te0+ 2 d 0.3333 0.6667 0.3902(7) 0.75 0 3.56(2)
Te2 Te0+ 2 d 0.3333 0.6667 0.1254(7) 0.75 0 3.56(2)
Te3 Te0+ 6 i 0.1617(4) 0.8383(4) 0.1208(5) 0.75 0 3.56(2)
Te4 Te0+ 6 i 0.8315(4) 0.1685(4) 0.3695(5) 0.75 0 3.56(2)
Se1 Se0+ 2 d 0.3333 0.6667 0.3903(7) 0.25 0 3.56(2)
Se2 Se0+ 2 d 0.3333 0.6667 0.1254(7) 0.25 0 3.56(2)
Se3 Se0+ 6 i 0.1617(4) 0.8382(4) 0.1208(6) 0.25 0 3.56(2)
Se4 Se0+ 6 i 0.8315(4) 0.1685(4) 0.3695(5) 0.25 0 3.56(2)
shyuep commented 8 years ago

For this problem, just call structure.merge_sites() before determining symmetry.

shyuep commented 8 years ago

The reason we don't do this by default is that checking the validity of structures is a very expensive operation. The user should be responsible for checking structure validity when working with such files.
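
To illustrate the kind of check a user could run up front, here is a rough pure-Python sketch that flags pairs of sites sitting on top of each other and reports their summed occupancy. It is not pymatgen's implementation: the function names are hypothetical, the cell is assumed to be orthogonal for simplicity (the hexagonal cell in the file above would need a proper lattice matrix), and the pairwise loop scales quadratically with the number of sites, which hints at why such checks get costly.

```python
import itertools
import math


def frac_distance(a, b, lengths):
    """Minimum-image distance between two fractional coordinates.

    Assumes an orthogonal cell with edge lengths `lengths` (an assumption
    made only to keep the sketch short).
    """
    total = 0.0
    for fa, fb, length in zip(a, b, lengths):
        d = (fa - fb) % 1.0
        d = min(d, 1.0 - d)          # wrap across the periodic boundary
        total += (d * length) ** 2
    return math.sqrt(total)


def overlapping_pairs(sites, lengths, tol=0.01):
    """Return (i, j, summed occupancy) for every pair closer than tol."""
    hits = []
    for (i, (ci, oi)), (j, (cj, oj)) in itertools.combinations(enumerate(sites), 2):
        if frac_distance(ci, cj, lengths) < tol:
            hits.append((i, j, oi + oj))
    return hits


# Each site is (fractional coordinates, occupancy); values are made up.
sites = [
    ((0.0, 0.0, 0.0), 0.75),
    ((0.0, 0.0, 0.9999), 0.5),
    ((0.5, 0.5, 0.5), 1.0),
]
hits = overlapping_pairs(sites, lengths=(4.0, 4.0, 4.0))
# Pairs whose summed occupancy exceeds 1 are the suspicious ones.
suspicious = [h for h in hits if h[2] > 1.0]
print(suspicious)
```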