openbabel / openbabel

Open Babel is a chemical toolbox designed to speak the many languages of chemical data.
http://openbabel.org/
GNU General Public License v2.0

CIF: behaviour inconsistency between Hermann–Mauguin short and full notations if symmetry operations are insufficient #2586

Open e-kwsm opened 1 year ago

e-kwsm commented 1 year ago

Environment Information

Open Babel version: 2b211d6acfc0f9e1c6746a984da3483b729aa32b
Operating system and version: EndeavourOS

Expected Behavior

CIF has a Hermann–Mauguin symbol entry, _space_group_name_H-M_alt:

_space_group_name_H-M_alt allows any Hermann-Mauguin symbol to be given. The way in which this item is used is determined by the user and in general is not intended to be interpreted by computer. It may, for example, be used to give one of the extended Hermann-Mauguin symbols given in Table 4.3.2.1 of International Tables for Crystallography Vol. A (2002) or a Hermann-Mauguin symbol for a conventional or unconventional setting.

(emphasis mine)

Usually the short forms of H-M symbols are used instead of the full ones, e.g., for space group No. 225, F m -3 m rather than F 4/m -3 2/m.

CIF also has a symmetry operation entry, _space_group_symop_operation_xyz:

When a list of symmetry operations is given, it must contain a complete set of coordinate representatives which generates all the operations of the space group by the addition of all primitive translations of the space group. Such representatives are to be found as the coordinates of the general-equivalent position in International Tables for Crystallography Vol. A (2002), to which it is necessary to add any centring translations shown above the general-equivalent position.

That is to say, it is necessary to list explicitly all the symmetry operations required to generate all the atoms in the unit cell defined by the setting used.

(emphasis mine)

The behaviour of obabel must be independent of the form of the Hermann–Mauguin notation and of the symmetry operations.

Actual Behavior

The geometry is converted to P 1 when the symmetry operations are insufficient, but no messages are issued for the full HM symbol:

| _space_group_name_H-M_alt in input CIF | _space_group_symop_operation_xyz in input CIF | Error messages | _space_group_name_H-M_alt in generated CIF |
|---|---|---|---|
| F m -3 m (short) | null | N | F m 3 m |
| F m -3 m (short) | insufficient | Y | P 1 |
| F m -3 m (short) | all | N | F m 3 m |
| F 4/m -3 2/m (full) | null | N | F m 3 m |
| F 4/m -3 2/m (full) | insufficient | N | P 1 |
| F 4/m -3 2/m (full) | all | N | F m 3 m |

Steps to Reproduce

Prepare the following file, named NaCl.cif:

data_NaCl
_cell_length_a 5.6
_cell_length_b 5.6
_cell_length_c 5.6
_cell_angle_alpha 90.0
_cell_angle_beta  90.0
_cell_angle_gamma 90.0
_chemical_name_common NaCl
_space_group_IT_number 225
_space_group_name_H-M_alt 'F m -3 m'
loop_
    _space_group_symop_operation_xyz
    x,y,z
loop_
    _atom_site_type_symbol
    _atom_site_fract_x
    _atom_site_fract_y
    _atom_site_fract_z
    Na  0.0  0.0  0.0
    Na  0.5  0.5  0.0
    Na  0.5  0.0  0.5
    Na  0.0  0.5  0.5
    Cl  0.5  0.0  0.0
    Cl  0.0  0.5  0.0
    Cl  0.0  0.0  0.5
    Cl  0.5  0.5  0.5

Here, the _space_group_symop_operation_xyz loop is clearly insufficient: it lists only the identity operation.

All 192 operations for F m -3 m:
    x,y,z
    -x,-y,z
    -x,y,-z
    x,-y,-z
    z,x,y
    z,-x,-y
    -z,-x,y
    -z,x,-y
    y,z,x
    -y,z,-x
    y,-z,-x
    -y,-z,x
    y,x,-z
    -y,-x,-z
    y,-x,z
    -y,x,z
    x,z,-y
    -x,z,y
    -x,-z,-y
    x,-z,y
    z,y,-x
    z,-y,x
    -z,y,x
    -z,-y,-x
    -x,-y,-z
    x,y,-z
    x,-y,z
    -x,y,z
    -z,-x,-y
    -z,x,y
    z,x,-y
    z,-x,y
    -y,-z,-x
    y,-z,x
    -y,z,x
    y,z,-x
    -y,-x,z
    y,x,z
    -y,x,-z
    y,-x,-z
    -x,-z,y
    x,-z,-y
    x,z,y
    -x,z,-y
    -z,-y,x
    -z,y,-x
    z,-y,-x
    z,y,x
    x,1/2+y,1/2+z
    -x,1/2-y,1/2+z
    -x,1/2+y,1/2-z
    x,1/2-y,1/2-z
    z,1/2+x,1/2+y
    z,1/2-x,1/2-y
    -z,1/2-x,1/2+y
    -z,1/2+x,1/2-y
    y,1/2+z,1/2+x
    -y,1/2+z,1/2-x
    y,1/2-z,1/2-x
    -y,1/2-z,1/2+x
    y,1/2+x,1/2-z
    -y,1/2-x,1/2-z
    y,1/2-x,1/2+z
    -y,1/2+x,1/2+z
    x,1/2+z,1/2-y
    -x,1/2+z,1/2+y
    -x,1/2-z,1/2-y
    x,1/2-z,1/2+y
    z,1/2+y,1/2-x
    z,1/2-y,1/2+x
    -z,1/2+y,1/2+x
    -z,1/2-y,1/2-x
    -x,1/2-y,1/2-z
    x,1/2+y,1/2-z
    x,1/2-y,1/2+z
    -x,1/2+y,1/2+z
    -z,1/2-x,1/2-y
    -z,1/2+x,1/2+y
    z,1/2+x,1/2-y
    z,1/2-x,1/2+y
    -y,1/2-z,1/2-x
    y,1/2-z,1/2+x
    -y,1/2+z,1/2+x
    y,1/2+z,1/2-x
    -y,1/2-x,1/2+z
    y,1/2+x,1/2+z
    -y,1/2+x,1/2-z
    y,1/2-x,1/2-z
    -x,1/2-z,1/2+y
    x,1/2-z,1/2-y
    x,1/2+z,1/2+y
    -x,1/2+z,1/2-y
    -z,1/2-y,1/2+x
    -z,1/2+y,1/2-x
    z,1/2-y,1/2-x
    z,1/2+y,1/2+x
    1/2+x,y,1/2+z
    1/2-x,-y,1/2+z
    1/2-x,y,1/2-z
    1/2+x,-y,1/2-z
    1/2+z,x,1/2+y
    1/2+z,-x,1/2-y
    1/2-z,-x,1/2+y
    1/2-z,x,1/2-y
    1/2+y,z,1/2+x
    1/2-y,z,1/2-x
    1/2+y,-z,1/2-x
    1/2-y,-z,1/2+x
    1/2+y,x,1/2-z
    1/2-y,-x,1/2-z
    1/2+y,-x,1/2+z
    1/2-y,x,1/2+z
    1/2+x,z,1/2-y
    1/2-x,z,1/2+y
    1/2-x,-z,1/2-y
    1/2+x,-z,1/2+y
    1/2+z,y,1/2-x
    1/2+z,-y,1/2+x
    1/2-z,y,1/2+x
    1/2-z,-y,1/2-x
    1/2-x,-y,1/2-z
    1/2+x,y,1/2-z
    1/2+x,-y,1/2+z
    1/2-x,y,1/2+z
    1/2-z,-x,1/2-y
    1/2-z,x,1/2+y
    1/2+z,x,1/2-y
    1/2+z,-x,1/2+y
    1/2-y,-z,1/2-x
    1/2+y,-z,1/2+x
    1/2-y,z,1/2+x
    1/2+y,z,1/2-x
    1/2-y,-x,1/2+z
    1/2+y,x,1/2+z
    1/2-y,x,1/2-z
    1/2+y,-x,1/2-z
    1/2-x,-z,1/2+y
    1/2+x,-z,1/2-y
    1/2+x,z,1/2+y
    1/2-x,z,1/2-y
    1/2-z,-y,1/2+x
    1/2-z,y,1/2-x
    1/2+z,-y,1/2-x
    1/2+z,y,1/2+x
    1/2+x,1/2+y,z
    1/2-x,1/2-y,z
    1/2-x,1/2+y,-z
    1/2+x,1/2-y,-z
    1/2+z,1/2+x,y
    1/2+z,1/2-x,-y
    1/2-z,1/2-x,y
    1/2-z,1/2+x,-y
    1/2+y,1/2+z,x
    1/2-y,1/2+z,-x
    1/2+y,1/2-z,-x
    1/2-y,1/2-z,x
    1/2+y,1/2+x,-z
    1/2-y,1/2-x,-z
    1/2+y,1/2-x,z
    1/2-y,1/2+x,z
    1/2+x,1/2+z,-y
    1/2-x,1/2+z,y
    1/2-x,1/2-z,-y
    1/2+x,1/2-z,y
    1/2+z,1/2+y,-x
    1/2+z,1/2-y,x
    1/2-z,1/2+y,x
    1/2-z,1/2-y,-x
    1/2-x,1/2-y,-z
    1/2+x,1/2+y,-z
    1/2+x,1/2-y,z
    1/2-x,1/2+y,z
    1/2-z,1/2-x,-y
    1/2-z,1/2+x,y
    1/2+z,1/2+x,-y
    1/2+z,1/2-x,y
    1/2-y,1/2-z,-x
    1/2+y,1/2-z,x
    1/2-y,1/2+z,x
    1/2+y,1/2+z,-x
    1/2-y,1/2-x,z
    1/2+y,1/2+x,z
    1/2-y,1/2+x,-z
    1/2+y,1/2-x,-z
    1/2-x,1/2-z,y
    1/2+x,1/2-z,-y
    1/2+x,1/2+z,y
    1/2-x,1/2+z,-y
    1/2-z,1/2-y,x
    1/2-z,1/2+y,-x
    1/2+z,1/2-y,-x
    1/2+z,1/2+y,x
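
As a cross-check on the list above, the following Python sketch closes a small generator set under composition (modulo lattice translations) and counts the resulting operations. The generator choice (a four-fold rotation about z, a three-fold rotation about [111], inversion, and two F-centring translations) and the helper names parse_symop, compose, close_group and is_complete are assumptions made for illustration; this is not how Open Babel itself matches space groups.

```python
import re
from fractions import Fraction

# Assumed generator set for Fm-3m (No. 225), chosen for this illustration.
FM3M_GENERATORS = [
    "-y,x,z",          # four-fold rotation about z
    "z,x,y",           # three-fold rotation about [111]
    "-x,-y,-z",        # inversion
    "x,y+1/2,z+1/2",   # F-centring translation
    "x+1/2,y,z+1/2",   # F-centring translation
]


def parse_symop(text):
    """Parse e.g. 'x,1/2+y,-z' into (rotation rows, translation mod 1)."""
    rot, trans = [], []
    for comp in text.replace(" ", "").lower().split(","):
        row = [0, 0, 0]
        shift = Fraction(0)
        # Split each component into signed terms such as '-x' or '+1/2'.
        for sign, term in re.findall(r"([+-]?)([xyz]|\d+(?:/\d+)?)", comp):
            factor = -1 if sign == "-" else 1
            if term in ("x", "y", "z"):
                row["xyz".index(term)] += factor
            else:
                shift += factor * Fraction(term)
        rot.append(tuple(row))
        trans.append(shift % 1)
    return tuple(rot), tuple(trans)


def compose(a, b):
    """Return the operation 'apply b, then a', with translation reduced mod 1."""
    (ra, ta), (rb, tb) = a, b
    rot = tuple(
        tuple(sum(ra[i][k] * rb[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )
    trans = tuple(
        (sum(ra[i][k] * tb[k] for k in range(3)) + ta[i]) % 1 for i in range(3)
    )
    return rot, trans


def close_group(generators):
    """Repeatedly multiply all pairs until no new operation appears."""
    ops = {parse_symop(g) for g in generators}
    ops.add(parse_symop("x,y,z"))
    while True:
        new = {compose(a, b) for a in ops for b in ops} - ops
        if not new:
            return ops
        ops |= new


def is_complete(listed, generators):
    """True if the listed xyz strings equal the full closure of the generators."""
    return {parse_symop(s) for s in listed} == close_group(generators)
```

With these definitions, len(close_group(FM3M_GENERATORS)) can be compared against the number of entries in a CIF symmetry loop, and is_complete(["x,y,z"], FM3M_GENERATORS) tests the single-operation loop used in NaCl.cif.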

Then

$ obabel NaCl.cif -o cif > /dev/null
==============================
*** Open Babel Error  in Find
  Unknown space group error (H-M symbol:F m -3 m), cannot match the list of transforms, please file a bug report.
==============================
*** Open Babel Warning  in Do
  Converting to P 1 cell using available symmetry transformations.
1 molecule converted

If F m -3 m is replaced with F 4/m -3 2/m, the messages disappear:

$ obabel NaCl.cif -o cif > /dev/null
1 molecule converted

But the space group of the output is P 1.
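
To make the six combinations from the table under "Actual Behavior" easier to reproduce, here is a minimal Python sketch that builds the corresponding CIF texts. The function names make_cif and make_variants are hypothetical, and "null" is taken here to mean that the symmetry loop is omitted entirely, which is an assumption about how that row was produced. The complete operation list must be supplied by the caller.

```python
# Atom sites copied from the NaCl.cif example in this report.
ATOMS = [
    "Na  0.0  0.0  0.0", "Na  0.5  0.5  0.0",
    "Na  0.5  0.0  0.5", "Na  0.0  0.5  0.5",
    "Cl  0.5  0.0  0.0", "Cl  0.0  0.5  0.0",
    "Cl  0.0  0.0  0.5", "Cl  0.5  0.5  0.5",
]


def make_cif(hm_symbol, symops):
    """Return CIF text; symops=None omits the symmetry loop (assumed 'null')."""
    lines = [
        "data_NaCl",
        "_cell_length_a 5.6",
        "_cell_length_b 5.6",
        "_cell_length_c 5.6",
        "_cell_angle_alpha 90.0",
        "_cell_angle_beta  90.0",
        "_cell_angle_gamma 90.0",
        "_chemical_name_common NaCl",
        "_space_group_IT_number 225",
        f"_space_group_name_H-M_alt '{hm_symbol}'",
    ]
    if symops is not None:
        lines += ["loop_", "    _space_group_symop_operation_xyz"]
        lines += [f"    {op}" for op in symops]
    lines += [
        "loop_",
        "    _atom_site_type_symbol",
        "    _atom_site_fract_x",
        "    _atom_site_fract_y",
        "    _atom_site_fract_z",
    ]
    lines += [f"    {atom}" for atom in ATOMS]
    return "\n".join(lines) + "\n"


def make_variants(full_symops):
    """Build all (symbol form, symop completeness) combinations as CIF text."""
    symbols = {"short": "F m -3 m", "full": "F 4/m -3 2/m"}
    symop_sets = {
        "null": None,
        "insufficient": ["x,y,z"],
        "all": list(full_symops),
    }
    return {
        (form, kind): make_cif(symbol, ops)
        for form, symbol in symbols.items()
        for kind, ops in symop_sets.items()
    }
```

Each returned text can be written to its own file and passed to the obabel command shown above, so that the messages and the output space group can be compared row by row.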

e-kwsm commented 1 year ago

Is F 4/m -3 2/m invalid and silently converted to P 1 in Open Babel?

husakm commented 6 months ago

I am afraid you must go to the source code of the CIF import in OB and investigate yourself.

nbehrnd commented 6 months ago

@e-kwsm Judging by the lattice parameters (the lengths a, b, c and the enclosed angles \alpha, \beta, \gamma), the model belongs to the cubic crystal system, i.e. a = b = c and \alpha = \beta = \gamma = 90°. The single x, y, z entry in the loop of symmetry operators is sufficient for the triclinic space group P 1, but not for describing higher symmetry.

For a contemporary crystallographic model of NaCl in space group F m -3 m, with lattice parameters not too far from those in your example, see the public .cif file for COD entry 2108652 attached below. The loop in question starts on line 106 and ends on line 229. With this file, obabel (version 3.1.1 as packaged by Debian 13/trixie) does not report a problem, i.e.

$ obabel 2108652.cif -ocif > /dev/null
1 molecule converted

What was the source of the .cif file you used in your example? Curators of ICSD, CCDC/CSD, COD, etc. usually welcome user reports that help them improve their databases.

2108652.cif.zip

nbehrnd commented 6 months ago

@e-kwsm An addition: for NaCl, this page by TU Graz/Austria compiles all 192 symmetry operators of space group 225.

husakm commented 6 months ago

OB uses a database of H-M symbols. This database covers only a subset of the possible symbols (all standard ones and some non-standard ones); it certainly does not cover every possible symbol combination. You can find this database in the OB source code. I am not sure whether OB is able to ignore the H-M symbol and interpret the list of symmetry operations instead (which would be the correct behaviour). In any case, I suggest studying crystallography and having a look at the IUCr tables of symmetry operations.