Open drlemmus opened 5 years ago
When first written, this list was not intended to be comprehensive.
a) The 'X-ray 2FOFC map coefficients' are clearly redundant and will be removed.
b) FOM, X-ray batch: You are correct - they do not make sense and should be removed.
c) With regard to sigmas - presumably you would not have a data block of only sigmas without F's or I's. Think of this content_type as describing the type of data in such a block - you would still need to examine the columns present to know whether sigmas are included.
d) pdbx_diffrn_data_section_contents is designed to be a table of contents. Imagine a joint X-ray/neutron case where the contents said "map coefficients" - you would not know which was X-ray and which was neutron.
One possibility is to include in pdbx_diffrn_data_section a mandatory scattering type. Then your content type enumeration is reduced - and you still can find out what is in the file without having to parse each data block.
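As a rough illustration of that idea, here is a minimal Python sketch of a table of contents built from a section table that carries a scattering type and a contents table with a reduced enumeration. The item name `scattering_type`, the lowercase values, and the radiation-free content type strings are all assumptions made for illustration, not existing dictionary items.

```python
from collections import defaultdict

# Hypothetical pdbx_diffrn_data_section rows: data section id -> scattering type.
# "scattering_type" is the item proposed above, not an existing one.
SECTIONS = {"1": "x-ray", "2": "neutron"}

# Hypothetical pdbx_diffrn_data_section_contents rows: (data_section_id, content_type).
# The content types no longer carry the radiation type (assumed spellings).
CONTENTS = [
    ("1", "map coefficients"),
    ("1", "structure factor amplitudes"),
    ("2", "map coefficients"),
]


def table_of_contents(sections, contents):
    """Group content types by (data section id, scattering type)."""
    toc = defaultdict(list)
    for section_id, content_type in contents:
        toc[(section_id, sections[section_id])].append(content_type)
    return dict(toc)


if __name__ == "__main__":
    for (section_id, scattering), types in table_of_contents(SECTIONS, CONTENTS).items():
        print(section_id, scattering, types)
```

With this layout, "map coefficients" in a joint X-ray/neutron file is unambiguous because the scattering type comes from the section row, and no data block has to be parsed.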
I would welcome others' thoughts on this.
I think adding the scattering type would be a good solution.
Some comments:
a) Maybe using the fully spelt-out "coefficients" instead of "coeff" would be nicer? We don't need to save a few characters when it means potential confusion, right?
b) This looks like a missing LF/CR, i.e. these are two different items ("X-Ray FOM" and "X-ray batch flag from mtz"). "X-Ray FOM" should be kept, I think. "X-ray batch flag from mtz" could be kept (since it has a mostly historic meaning in MTZ format content), but I'm not sure it helps anything with modern data collections (since we have a 1:1 correspondence between BATCH and image number).
Additional requests:
(1) Use the plural consistently:
(2) Add item for merged intensities:
(3) Use consistent conventions to mark data as "unmerged":
(4) Add items for anomalous differences:
(5) Clarify the distinction between "merged" and "unmerged" data in the definition. Something like
The value of _pdbx_diffrn_data_section_contents.content_type describes the type of reflection data that a data section in a diffraction data file holds. Multiple types can be associated with a given data section. Data sections are assumed to contain merged reflection data (i.e. reduced to a reciprocal space asymmetric unit) by default, unless explicitly described as being "unmerged".
What is more, the current definition is plain wrong:
The value of _pdbx_diffrn_data_section_contents.content_type uniquely identifies
a data section in a diffraction data file.
This is the description of _pdbx_diffrn_data_section_contents.data_section_id with the name of the item changed.
Several of the allowed values for _pdbx_diffrn_data_section_contents.content_type seem to be duplicates. 'X-ray 2FOFC map coefficients' and 'X-ray 2FO-FC map coeff' seem to be the same thing, and so do 'X-ray structure factor intensities, unmerged' and 'X-ray unmerged intensities'. If there are subtle differences, these should be made very clear; if not, only one value of each pair should remain. The value 'X-ray FOM, X-ray batch flag from mtz' is a bit weird, as FOM and batch flag are independent values.
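To make the duplication concrete, here is a small Python sketch that collapses the apparent duplicates named above onto a single spelling. Which spelling survives is purely an assumption for illustration; the mapping is hypothetical and not part of the dictionary.

```python
# Hypothetical mapping from the apparently duplicated values to one spelling each.
# The surviving spellings are an assumption, not a decision recorded in this thread.
CANONICAL = {
    "X-ray 2FOFC map coefficients": "X-ray 2FO-FC map coefficients",
    "X-ray 2FO-FC map coeff": "X-ray 2FO-FC map coefficients",
    "X-ray structure factor intensities, unmerged": "X-ray unmerged intensities",
    "X-ray unmerged intensities": "X-ray unmerged intensities",
}


def canonicalize(values):
    """Map each value to its canonical spelling, dropping repeats but keeping order."""
    result = []
    for value in values:
        canonical = CANONICAL.get(value, value)  # unknown values pass through unchanged
        if canonical not in result:
            result.append(canonical)
    return result
```

Applied to the four values quoted above, this leaves two distinct entries, which is the size the enumeration would have if the pairs really are synonyms.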
The set also seems rather incomplete for neutron and electron scattering, and for X-ray data there are still quite a few values (sigmas, for instance) that can occur in a data section but are not listed.
Since the experiment type for a data_section_id is already described in _pdbx_diffrn_data_section.id, as is the status of the data with respect to being merged, it seems superfluous to encode these in _pdbx_diffrn_data_section_contents.content_type values. Leaving them out would solve some of the issues above and also keep the list of possible values reasonably small.
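A rough Python sketch of what removing that redundancy could look like: it splits a current-style value into a scattering type, a merged flag, and the remaining content description. It assumes, for illustration only, that every value starts with the scattering type as its first word and marks unmerged data with the literal word "unmerged"; the function name is hypothetical.

```python
def split_content_type(value):
    """Split a current-style value into (scattering type, merged flag, content).

    Assumes the first word is the scattering type and that unmerged data
    is flagged by the literal word "unmerged" somewhere in the rest.
    """
    scattering, _, rest = value.partition(" ")
    merged = "unmerged" not in rest
    # Strip the merge marker in both spellings seen in the current enumeration.
    content = rest.replace(", unmerged", "").replace("unmerged ", "").strip()
    return scattering.lower(), merged, content
```

For example, `split_content_type("X-ray unmerged intensities")` returns `("x-ray", False, "intensities")`, so only the last element would need to remain in the content type enumeration.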