molovol / MoloVol

MoloVol is a free, cross-platform scientific software for volume and surface computations of single molecules and crystallographic unit cells.
https://molovol.com
MIT License

Recognize variants of space groups #100

Closed rlavendomme closed 3 years ago

rlavendomme commented 3 years ago

Current status

When analyzing unit cells, the program reads the space group of the input structure in H-M notation from the PDB file, then compares it with the list of space groups provided with the program in space_groups.txt.
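
As a rough illustration of the lookup described above, the following Python sketch reads the space group from the CRYST1 record of a PDB file (columns 56 to 66 in the PDB format) and compares it with a table of known groups. The `KNOWN_GROUPS` dictionary and both function names are hypothetical stand-ins: the actual layout of space_groups.txt and MoloVol's own implementation are not shown in this issue.

```python
# Hypothetical stand-in for the contents of space_groups.txt.
KNOWN_GROUPS = {"P 1": 1, "P 21 21 21": 19, "P 4/n": 85}


def read_hm_symbol(pdb_lines):
    """Return the H-M symbol from the first CRYST1 record, or None."""
    for line in pdb_lines:
        if line.startswith("CRYST1"):
            # Columns 56-66 of a CRYST1 record hold the space group symbol.
            return line[55:66].strip()
    return None


def lookup_space_group(pdb_lines):
    """Return the space group number, or None if missing or unknown."""
    symbol = read_hm_symbol(pdb_lines)
    return KNOWN_GROUPS.get(symbol)


# Build a CRYST1 record from fixed-width fields so the columns stay exact.
record = "CRYST1{:9.3f}{:9.3f}{:9.3f}{:7.2f}{:7.2f}{:7.2f} {:<11s}{:4d}".format(
    9.0, 9.0, 7.5, 90.0, 90.0, 90.0, "P 4/n", 2
)
print(lookup_space_group([record]))
```

A symbol that is absent from the table yields `None`, which corresponds to the case where a structure cannot be analyzed because its variant is not in the list.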

Issue

The current list of space groups (230) only includes one set of symmetry elements per space group, but there are variants of space groups that use different settings and therefore have different symmetry elements (at least 530 sets of symmetry elements in total; see http://cci.lbl.gov/sginfo/hall_symbols.html ). It was originally decided not to include variants that, by convention, are deemed unlikely to be found in crystal structures. Unfortunately, some of the variants that were not included are actually used in practice.

While most variants can be distinguished by their H-M notation in PDB files and can technically be added to the list used in MoloVol, some cannot: the space groups 48, 50, 59, 68, 70, 85, 86, 88, 125, 126, 129, 130, 133, 134, 137, 138, 141, 142, 201, 203, 222, 224, 227 and 228 can have two origin settings with no variation in their H-M notation.
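
To make the list above easier to use, here is a minimal Python sketch that stores those space group numbers and checks whether a given group is affected. The names `TWO_ORIGIN_GROUPS` and `has_two_origin_settings` are made up for this example and are not part of MoloVol.

```python
# Space groups listed in this issue as having two origin settings
# that share a single H-M notation.
TWO_ORIGIN_GROUPS = frozenset({
    48, 50, 59, 68, 70, 85, 86, 88,
    125, 126, 129, 130, 133, 134, 137, 138, 141, 142,
    201, 203, 222, 224, 227, 228,
})


def has_two_origin_settings(number):
    """Return True if the H-M symbol alone cannot identify the setting."""
    return number in TWO_ORIGIN_GROUPS


print(has_two_origin_settings(85))  # P 4/n is in the list
print(has_two_origin_settings(19))  # P 21 21 21 is not
```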

This problem has more to do with the PDB file format, which contains incomplete crystallographic information, than with MoloVol itself. The same problem arises with other software, such as Mercury, that is designed by experts for crystallography.

Proposed change

The problematic space group variants could be added to the space_groups.txt file with a custom H-M notation. Since the input PDB file will not contain the information required to distinguish between the two variants, this information will have to be provided by the user through an option in the UI.

This solution will cause another issue: I expect that most users will not know about the variants of space groups and will not be able to choose the right option. If one of the troublesome space groups is detected, a message should prompt the user to check the output structure and, if necessary, change the option to use the other variant.
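
The proposed flow could look roughly like the Python sketch below. It is only an assumption about how the pieces might fit together: the `:1` / `:2` suffix stands in for the "custom H-M notation", which this issue does not define, and the `resolve_setting` function, its parameters and the message text are all hypothetical.

```python
def resolve_setting(hm_symbol, ambiguous, origin_choice=1):
    """Return (lookup_key, warning) for a space group read from a PDB file.

    hm_symbol     -- H-M notation as read from the input file
    ambiguous     -- True if the group has two origin settings
    origin_choice -- 1 or 2, as selected by the user in the UI (assumed option)
    """
    if not ambiguous:
        # The H-M symbol is enough; no warning is needed.
        return hm_symbol, None
    if origin_choice not in (1, 2):
        raise ValueError("origin_choice must be 1 or 2")
    # Hypothetical custom notation: append the origin choice to the symbol.
    key = "{} :{}".format(hm_symbol, origin_choice)
    warning = (
        "Space group {} has two possible origin settings. Please check the "
        "output structure and switch to the other setting if it looks wrong."
    ).format(hm_symbol)
    return key, warning


key, warning = resolve_setting("P 4/n", ambiguous=True, origin_choice=2)
print(key)
print(warning)
```

Returning the warning alongside the lookup key keeps the decision in one place, so the UI only has to display the message when it is not `None`.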

Improvement

Allow more crystal structures to be analyzed by the program.

jmaglic commented 3 years ago

I will comment on your behalf for posterity: As we have learned, adding an option for the user to distinguish between space group variants is impractical because almost no users would understand the option. The only real solution is to add support for a file type that handles the problematic space groups better than PDB files do. You have decided to add CIF file support.
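
For context on why CIF avoids the ambiguity: CIF files usually list the symmetry operations explicitly (for example under `_symmetry_equiv_pos_as_xyz` or `_space_group_symop_operation_xyz`), so the origin setting does not have to be guessed from the H-M symbol. The Python sketch below parses one such operation string into a rotation matrix and a translation vector. It is a simplified illustration, not MoloVol's CIF reader; `parse_symop` is a hypothetical name, and terms with numeric coefficients on an axis (such as `2x`) are not handled.

```python
import re
from fractions import Fraction

# One term is an optional sign followed by an axis letter,
# a decimal number, or an integer/fraction such as 1/2.
TERM = re.compile(r"([+-]?)([xyz]|\d*\.\d+|\d+(?:/\d+)?)")


def parse_symop(op):
    """Turn a string like '-x+1/2, y, z' into (rotation, translation)."""
    rotation, translation = [], []
    for component in op.replace(" ", "").lower().split(","):
        row = [0, 0, 0]
        shift = Fraction(0)
        for sign, term in TERM.findall(component):
            factor = -1 if sign == "-" else 1
            if term in ("x", "y", "z"):
                row["xyz".index(term)] += factor
            else:
                shift += factor * Fraction(term)
        rotation.append(row)
        # Keep translations inside the unit cell, in the range [0, 1).
        translation.append(shift % 1)
    return rotation, translation


rotation, translation = parse_symop("-x+1/2, y, z")
print(rotation)
print(translation)
```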