dkriegner / xrayutilities

xrayutilities - a package with useful scripts for X-ray diffraction physicists
http://xrayutilities.sourceforge.io
GNU General Public License v2.0

Bug in the materials/cif.py parser #103

Closed: muxlmunich closed this issue 4 years ago

muxlmunich commented 4 years ago

Hello Community,

Recently I tried to load a CIF file with the CIFFile class of xrayutilities. Unfortunately, this ends in an error message (the traceback is below). The CIF file is also included below. The error indicates that in materials/cif.py the variable f in the marked line is None, so nothing was found by the search that produces it.

After systematically deleting parts of the CIF file, I found that the error disappears when the lines Wat1 ... to Wat3 ... are removed.

Could you please explain what I have to change in the CIF file or in the parser?

The Code:


import xrayutilities as xru
from xrayutilities.materials.cif import CIFFile
from xrayutilities.materials.material import Crystal

xu_cif = CIFFile(filestr='Crystal.cif')

The error message:

> /usr/local/lib/python3.5/dist-packages/xrayutilities/materials/cif.py in __init__(self, filestr, digits)
>     205         self._default_dataset = None
>     206         self.data = {}
> --> 207         self.Parse()
>     208 
>     209     def __del__(self):
> 
> /usr/local/lib/python3.5/dist-packages/xrayutilities/materials/cif.py in Parse(self)
>     231                 self.fid.seek(fidpos)
>     232                 name = line[m.end():].strip()
> --> 233                 self.data[name] = CIFDataset(self.fid, name, self.digits)
>     234                 if self.data[name].has_atoms and not self._default_dataset:
>     235                     self._default_dataset = name
> 
> /usr/local/lib/python3.5/dist-packages/xrayutilities/materials/cif.py in __init__(self, fid, name, digits)
>     293         if config.VERBOSITY >= config.INFO_ALL:
>     294             print('XU.material: parsing cif dataset %s' % self.name)
> --> 295         self.Parse(fid)
>     296         self.SymStruct()
>     297 
> 
> /usr/local/lib/python3.5/dist-packages/xrayutilities/materials/cif.py in Parse(self, fid)
>     475                 asplit = line.split()
>     476                 try:
> --> 477                     atom = get_element(asplit[alab_idx])
>     478                     apos = (floatconv(asplit[ax_idx]),
>     479                             floatconv(asplit[ay_idx]),
> 
> /usr/local/lib/python3.5/dist-packages/xrayutilities/materials/cif.py in get_element(cifstring)
>     328                     element = elements.dummy
>     329                 else:
> --> 330                     elname = el[:f.start()]
>     331                     if hasattr(elements, elname):
>     332                         # here one might want to find a closer alternative than
> 
> AttributeError: 'NoneType' object has no attribute 'start'

The CIF file being opened:

...
loop_
...
Wat1 0.01410 0.01380 0.01870 -0.00460 0.00080 0.00470
Wat2 0.01200 0.01160 0.03620 0.00120 -0.00080 -0.00140
Wat3 0.02060 0.01290 0.03850 0.00040 0.00220 0.00910

dkriegner commented 4 years ago

What version of xrayutilities are you using? I assume it's not the latest, since the Git master does not work with Python 3.5. It may, however, be that your error also appears with the latest version. In the part of the code you mention, xrayutilities tries to identify the type of atom, i.e. the chemical element, given in the CIF file. Your atom 'Wat1' naturally can't be identified as a chemical element, and this causes the error. For fictitious/unknown atoms the code so far only recognises '?'. I am not sure what the CIF standard says about this, but to give a recommendation it would be important to know what you actually expect to get. In xrayutilities, 'Crystal' objects are typically used to calculate optical constants, which need real elements to yield physical values. So what would be your use case?

muxlmunich commented 4 years ago

The version used is 1.6.0. The use case is to calculate an XRD pattern from the mentioned CIF file. It describes a hydrated crystal, so I think the Wat... entries represent the hydrate part of it. Maybe there is a workaround to replace the Wat part.
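One possible workaround of the kind asked about here is to relabel the Wat sites before the file is parsed. The following is only a minimal sketch: it assumes that each Wat entry marks the oxygen position of a water molecule, which the thread does not confirm, and the function name `relabel_water_sites` and its `first_index` parameter are made up for illustration. It is plain text processing and does not use any xrayutilities call.

```python
import re

# Hypothetical pattern: a label "Wat<number>" at the start of a data line.
_WATER_LABEL = re.compile(r"^([ \t]*)Wat(\d+)\b", flags=re.MULTILINE)


def relabel_water_sites(cif_text: str, first_index: int = 5) -> str:
    """Rename 'Wat1', 'Wat2', ... to 'O<first_index>', 'O<first_index + 1>', ...

    ASSUMPTION: every Wat site is the oxygen atom of a water molecule.
    The caller must choose first_index so that the new labels do not
    collide with oxygen labels already present in the file.
    """
    def _replace(match: re.Match) -> str:
        number = int(match.group(2)) + first_index - 1
        return f"{match.group(1)}O{number}"

    return _WATER_LABEL.sub(_replace, cif_text)


if __name__ == "__main__":
    sample = (
        "O4   0.46430   0.44840  -0.31980\n"
        "Wat1   0.13700   0.12750   0.36080\n"
    )
    print(relabel_water_sites(sample))
```

With the default `first_index=5`, the labels Wat1 to Wat3 of the file in this thread become O5 to O7, continuing after the existing O1 to O4. Because the pattern is applied to the whole text, the labels in the anisotropic displacement loop are renamed consistently. The result would have to be written to a new file and checked against the original publication before it is used for any calculation.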

dkriegner commented 4 years ago

Xrayutilities does not understand compound parts in cif files. It needs the position of every atom.

Is the position somehow encoded in a different part of the CIF file? The CIF parser in xrayutilities is rather primitive. There are likely many valid CIF files which cannot be parsed.

I would need to see the full cif file to see if this can be fixed in xrayutilities.

On Tue, 11 Aug 2020, 16:14 muxlmunich, notifications@github.com wrote:

> The version used is 1.6.0. The use case is to calculate an XRD pattern from the mentioned CIF file. It describes a hydrated crystal, so I think the Wat... entries represent the hydrate part of it. Maybe there is a workaround to replace the Wat part.


muxlmunich commented 4 years ago

The full CIF file is from the AMS database: http://rruff.geo.arizona.edu/AMS/authors/Deganello%20S

data_global
_amcsd_formula_title 'Ca(H2O)3(C2O4)'
loop_
_publ_author_name
'Deganello S'
'Kampf A R'
'Moore P B'
_journal_name_full 'American Mineralogist'
_journal_volume 66 
_journal_year 1981
_journal_page_first 859
_journal_page_last 865
_publ_section_title
;
 The crystal structure of calcium oxalate trihydrate Ca(H2O)3(C2O4)
;
_database_code_amcsd 0000844
_chemical_formula_sum 'Ca C2 O7 H12'
_cell_length_a 7.145
_cell_length_b 8.600
_cell_length_c 6.099
_cell_angle_alpha 112.30
_cell_angle_beta 108.87
_cell_angle_gamma 89.92
_cell_volume 324.935
_exptl_crystal_density_diffrn      1.923
_symmetry_space_group_name_H-M 'P -1'
loop_
_space_group_symop_operation_xyz
  'x,y,z'
  '-x,-y,-z'
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
Ca   0.28100   0.30790   0.21610
C1  -0.40150   0.05010   0.10820
C2   0.42850   0.44390  -0.13310
O1  -0.41130   0.19360   0.26120
O2   0.24790   0.02250  -0.11090
O3   0.28460   0.35580  -0.14210
O4   0.46430   0.44840  -0.31980
Wat1   0.13700   0.12750   0.36080
Wat2   0.12400  -0.44390   0.36880
Wat3   0.08150  -0.24520   0.05500
H1   0.01400   0.43400   0.64600
H2   0.80300   0.34500   0.48700
H3   0.86500   0.32600   0.85300
H4   0.98900   0.10100   0.31300
H5   0.21100   0.16500   0.54000
H6   0.82300   0.13600   0.80800
loop_
_atom_site_aniso_label
_atom_site_aniso_U_11
_atom_site_aniso_U_22
_atom_site_aniso_U_33
_atom_site_aniso_U_12
_atom_site_aniso_U_13
_atom_site_aniso_U_23
Ca 0.01100 0.00580 0.01270 0.00020 0.00500 0.00260
C1 0.01240 0.01000 0.02970 -0.00110 0.00640 0.00250
C2 0.01050 0.00820 0.01840 -0.00080 0.00370 0.00380
O1 0.01380 0.00860 0.03340 0.00010 0.00490 -0.00440
O2 0.01190 0.01060 0.03420 0.00100 0.00600 -0.00100
O3 0.02050 0.01980 0.02020 -0.01090 0.00380 0.00610
O4 0.01530 0.01010 0.01570 -0.00260 0.00580 0.00420
Wat1 0.01410 0.01380 0.01870 -0.00460 0.00080 0.00470
Wat2 0.01200 0.01160 0.03620 0.00120 -0.00080 -0.00140
Wat3 0.02060 0.01290 0.03850 0.00040 0.00220 0.00910

Thank you for your help so far.

dkriegner commented 4 years ago

Looking at the CIF file, these entries stand for water molecules. At least it seems the chemical formula can then be reproduced.

I do not understand how this should work when parsing. Water is a molecule, and since only one x,y,z coordinate is given one cannot know the orientation of the molecule, right? I wonder how other software reads this file, VESTA for visualization, for example... Either I am missing something or the information is somehow incomplete. Putting all three atoms of water at one and the same position does not seem correct... The hydrogen positions likely do not influence the structure factors much, but it would be important for me to know how one should proceed in this case. I could not find any information about this in the CIF standard either.

So in order to fix this in xrayutilities I need to know what one actually wants to get in such cases.
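The remark that hydrogen contributes little to the structure factors can be made plausible with a rough estimate. In the forward direction the X-ray scattering amplitude of a neutral atom equals its number of electrons, so the share of electrons carried by hydrogen gives an upper-bound feeling for its weight. The sketch below is only this back-of-the-envelope count, not a structure factor calculation. It uses the composition from `_amcsd_formula_title`, Ca(H2O)3(C2O4), which is an assumption here because `_chemical_formula_sum` in the same file lists a different hydrogen count. The helper name `electron_fraction` is made up for illustration.

```python
# Atomic numbers, equal to the electron count of the neutral atom.
ATOMIC_NUMBER = {"H": 1, "C": 6, "O": 8, "Ca": 20}

# ASSUMPTION: one formula unit Ca(H2O)3(C2O4), taken from _amcsd_formula_title.
FORMULA_UNIT = {"Ca": 1, "C": 2, "O": 7, "H": 6}


def electron_fraction(element: str, composition: dict) -> float:
    """Fraction of all electrons in the formula unit that belong to `element`."""
    total = sum(ATOMIC_NUMBER[el] * count for el, count in composition.items())
    return ATOMIC_NUMBER[element] * composition[element] / total


if __name__ == "__main__":
    # 6 hydrogen electrons out of 94 in total
    print(f"{electron_fraction('H', FORMULA_UNIT):.3f}")  # prints 0.064
```

Hydrogen therefore accounts for roughly 6 % of the electrons in the formula unit, while the water oxygens carry about a quarter (24 of 94), so the oxygen positions matter far more for a simulated pattern than the hydrogen positions.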

On Wed, 12 Aug 2020, 16:40 muxlmunich, notifications@github.com wrote:

> The full CIF file is from the AMS database: http://rruff.geo.arizona.edu/AMS/authors/Deganello%20S


muxlmunich commented 4 years ago

Thank you for your answer and the detailed explanation. In my particular case I found another CIF file which gives the positions of the hydrogen and the oxygen atoms. Maybe a short message would be helpful here, like "the element is not known/not precise enough".

This could be achieved by:

if f is None:
    print("The element %s could not be interpreted. Please make the element in the CIF file more precise." % element)

at this line: https://github.com/dkriegner/xrayutilities/blob/c02ded2b7cfeac69ad4a103d97cb804b5b81419c/lib/xrayutilities/materials/cif.py#L230
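As a variation on this suggestion, the check could raise an exception with a descriptive message instead of printing, so that the failure cannot go unnoticed. The sketch below is self-contained and is not the xrayutilities implementation: the pattern `_ELEMENT_PREFIX` and the function `element_from_label` are hypothetical, and the only assumption about labels is that they start with a one- or two-letter symbol followed by a digit, a charge sign, or nothing.

```python
import re

# Hypothetical pattern: one capital letter, optionally one lowercase letter,
# followed by a digit, a charge sign, or the end of the label.
_ELEMENT_PREFIX = re.compile(r"[A-Z][a-z]?(?=[0-9+\-]|$)")


def element_from_label(label: str) -> str:
    """Return the element-like prefix of a CIF atom label such as 'O1'.

    Raises ValueError with a readable message when no such prefix exists,
    instead of failing later with an AttributeError on a None match.
    """
    match = _ELEMENT_PREFIX.match(label)
    if match is None:
        raise ValueError(
            f"The element in label {label!r} could not be interpreted. "
            "Please use labels that start with a chemical element symbol."
        )
    return match.group(0)


if __name__ == "__main__":
    for label in ("Ca", "O1", "Wat1"):
        try:
            print(label, "->", element_from_label(label))
        except ValueError as err:
            print(label, "->", err)
```

Note that the pattern only checks the form of the label. It does not verify that the prefix is a real chemical element, which would still need a lookup in an element table.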