Create directory of example CIFs

rowlesmr commented 1 year ago

From https://github.com/COMCIFS/cif_core/pull/430#issuecomment-1605442143

Full CIF file examples of various concepts.

Initial commit for QPA by external standard.

rowlesmr commented 1 year ago

From @vaitkus

As a side-note, I noticed that some items from the powder dictionary have values that violate the current restrictions imposed by that dictionary. I will give a few examples here of the offending items, but I think we should raise a PR with this example file in the powder dictionary repository and continue the full discussion there. Note, I do not claim that the values are bad, just that they do not fit with the current dictionary definitions which may be too limiting or incorrect:

The DIFFRN_RADIATION_WAVELENGTH loop has 4 item columns but only 3 value columns (the _diffrn_radiation_wavelength.type values are missing).

The value of _exptl_absorpt_coefficient_mu is often given with a standard uncertainty (using the parenthesis notation, e.g. 17.7460(14)), although in the dictionary this item is currently defined as a Number and thus cannot have an SU. However, I find this a bit strange, since according to the human-readable definition it is calculated from other measurand items and could thus potentially have an SU. Any thoughts on this?

Data item _pd_meas.scan_method value 'scan' is not one of the currently known enumeration values for this item. Should it be included?

Data item _pd_calc.component_intensities_total does not seem to be currently defined in the dictionary.
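
The loop mismatch in the first item above can be illustrated with a minimal Python sketch. It assumes the loop has already been tokenised into a list of tags and a flat list of values; the function name is hypothetical, the `.id` and `.wt` tag names are assumptions (only `.value` and `.type` are named in this thread), and the values are placeholders rather than data from the example file.

```python
def loop_fills_whole_rows(tags, values):
    """Return True if the flat list of values divides evenly into rows."""
    if not tags:
        raise ValueError("a loop needs at least one tag")
    return len(values) % len(tags) == 0


# .value and .type are named in the thread; .id and .wt are assumed here.
tags = [
    "_diffrn_radiation_wavelength.id",
    "_diffrn_radiation_wavelength.value",
    "_diffrn_radiation_wavelength.wt",
    "_diffrn_radiation_wavelength.type",
]

# Placeholder strings, not values from the example file.
two_rows_missing_type = ["1", "w1", "wt1", "2", "w2", "wt2"]
two_rows_complete = ["1", "w1", "wt1", "t1", "2", "w2", "wt2", "t2"]

print(loop_fills_whole_rows(tags, two_rows_missing_type))  # False
print(loop_fills_whole_rows(tags, two_rows_complete))      # True
```

A count check like this only catches the case where the total number of values is not a multiple of the number of tags. Four rows with three values each would still pass it, so a validator would also need per-item type checks.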

rowlesmr commented 1 year ago

The DIFFRN_RADIATION_WAVELENGTH loop has 4 item columns but only 3 value columns

I just forgot to put the values back. I was copying from a Cu loop, and this is a Co loop.

Data item _pd_meas.scan_method value 'scan'

It should have been 'cont'.

Data item _pd_calc.component_intensities_total does not seem to be currently defined in the dictionary

It's currently hiding in a PR (#155)

The value of _exptl_absorpt_coefficient_mu

This should definitely have an SU associated with it. All three of the things mentioned in the description can be refined: _exptl_crystal.density_diffrn, _atom_site.occupancy, and _diffrn_radiation_wavelength.value are all Measurand. I can do a PR on this.
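
For reference, the parenthesis notation discussed above (e.g. 17.7460(14)) can be unpacked as in the following rough Python sketch. The function name and regular expression are illustrative assumptions, not part of any CIF library, and exponent notation is deliberately not handled.

```python
import re

# Plain decimal number with an optional SU in parentheses; no exponent support.
_SU_RE = re.compile(r"^([+-]?\d*\.?\d+)(?:\((\d+)\))?$")


def parse_su(text):
    """Split a string such as '17.7460(14)' into (value, su).

    The SU digits apply to the last decimal places of the value.
    Returns (value, None) when no SU is given.
    """
    match = _SU_RE.match(text)
    if match is None:
        raise ValueError(f"not a plain number with optional SU: {text!r}")
    mantissa, su_digits = match.groups()
    value = float(mantissa)
    if su_digits is None:
        return value, None
    decimals = len(mantissa.split(".")[1]) if "." in mantissa else 0
    return value, int(su_digits) / 10 ** decimals


value, su = parse_su("17.7460(14)")
print(value, su)
```

Here the SU digits "14" are scaled by the four decimal places of the value, giving an SU of 0.0014 on 17.7460.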

vaitkus commented 1 year ago

Could the directory be renamed from Examples to examples to match the name in other dictionary repositories?

vaitkus commented 1 year ago

Ok, I have a few more technical questions about the example:

Data block DIFFRACTOGRAM_0020 contains a loop with the _pd_meas.2theta_scan, _pd_proc.ls_weight and other data items, but no point id data item, e.g. _pd_meas.point_id. Is this allowed?
Some loops contain data items from the PD_PROC, PD_CALC and PD_MEAS categories. However, I guess that this is technically allowed since they are all children of a looped PD_DATA category? I might need to slightly update the validator because currently it only allows parent-child combined loops, but not sibling-combined loops.
The _pd_qpa_external_std.diffractogram_id data item is linked to the _pd_diffractogram.id data item. In the most basic case this means that the _pd_qpa_external_std.diffractogram_id data item will have a value that matches one of the values of the _pd_diffractogram.id data item in the same data block. However, since these are powder diffraction files, I guess that multi-block interpretation starts being applied here and the _pd_diffractogram.id values are checked across all data blocks? Should this be somehow marked in the example file (e.g. by setting the appropriate _audit.schema value or including items from the AUDIT_CONFORM category)?
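
The sibling-combined loop check mentioned in the second question could look roughly like the Python sketch below. It is not the actual validator code: the `PARENT` table is a hand-written stand-in for what would be read from the dictionary, only one level of category hierarchy is considered, and the function names are hypothetical.

```python
# Hand-written stand-in for dictionary metadata (assumption): child -> parent.
PARENT = {
    "pd_meas": "pd_data",
    "pd_proc": "pd_data",
    "pd_calc": "pd_data",
}


def category_of(tag):
    """Return the lowercase category part of a DDLm-style tag."""
    return tag.lstrip("_").split(".", 1)[0].lower()


def may_share_loop(tags):
    """True if every tag's category resolves to the same looped parent.

    A category without an entry in PARENT is treated as its own parent,
    so both parent-child and sibling combinations are accepted.
    """
    roots = {PARENT.get(category_of(t), category_of(t)) for t in tags}
    return len(roots) == 1


loop_tags = [
    "_pd_meas.2theta_scan",
    "_pd_meas.counts_total",
    "_pd_proc.ls_weight",
    "_pd_calc.intensity_total",
]
print(may_share_loop(loop_tags))                             # True
print(may_share_loop(loop_tags + ["_pd_diffractogram.id"]))  # False
```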

rowlesmr commented 1 year ago

Data block DIFFRACTOGRAM_0020 contains a loop with the _pd_meas.2theta_scan, _pd_proc.ls_weight and other data items, but no point id data item, e.g. _pd_meas.point_id. Is this allowed?

Technically no. I could have sworn I added it in... I'll add it in; my auto-TOPAS output doesn't include a point id.

Some loops contain data items from the PD_PROC, PD_CALC and PD_MEAS categories. However, I guess that this is technically allowed since they are all children of a looped PD_DATA category?

I'm assuming this is

loop_
    _pd_meas.2theta_scan
    _pd_meas.counts_total
    _pd_proc.ls_weight
    _pd_calc.intensity_total
    _pd_proc.intensity_bkg_calc
    _pd_calc.component_intensities_total

Yes, this is the intent behind having PD_PROC, PD_CALC, and PD_MEAS all as children of PD_DATA. If you have some combination of measured, processed, and/or calculated data where there is a one-to-one correspondence between data points (i.e. the same _pd_data.point_id value), then it makes sense (and saves space) to put them together in the same loop. Each row describes the same point.

I guess that multi-block interpretation starts being applied here and the _pd_diffractogram.id values are checked across all data blocks?

Yes. Powder experiments are almost always going to be multi-block, and looking for id-values is kind of governed by https://github.com/COMCIFS/comcifs.github.io/blob/master/accepted/multi-block-principles.md; I've never actually used CIF in a neat one-block-is-one-experiment/structure way. I don't know how to properly denote that using _audit* data items. This also brushes up against https://github.com/COMCIFS/comcifs.github.io/blob/master/draft/block_collections.md.

What I am wanting to say with _pd_qpa_external_std.diffractogram_id SRM676A is "When I quantified the current diffractogram (DIFFRACTOGRAM_0020), I used the information from the diffractogram identified as SRM676A. When I go there, I find values of k_factor and MAC which I use in my calculations."

Is there a better way to say that? Maybe _pd_qpa_external_std.ref_diffractogram_id, which is Encode, and not Link?
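
The cross-block lookup described above can be sketched in Python as follows. This is only an illustration: the data blocks are modelled as plain dictionaries rather than parsed with a real CIF library, the function name is hypothetical, and the name of the block holding the SRM676A diffractogram is an assumption.

```python
def blocks_defining_id(blocks, id_tag, wanted):
    """Return the names of all data blocks whose id_tag column contains wanted."""
    return [name for name, items in blocks.items()
            if wanted in items.get(id_tag, [])]


# Toy stand-in for a multi-block file: block name -> {tag: list of values}.
blocks = {
    "DIFFRACTOGRAM_0020": {
        "_pd_diffractogram.id": ["DIFFRACTOGRAM_0020"],
        "_pd_qpa_external_std.diffractogram_id": ["SRM676A"],
    },
    # Block name assumed for illustration.
    "SRM676A": {
        "_pd_diffractogram.id": ["SRM676A"],
    },
}

for ref in blocks["DIFFRACTOGRAM_0020"]["_pd_qpa_external_std.diffractogram_id"]:
    print(ref, "->", blocks_defining_id(blocks, "_pd_diffractogram.id", ref))
```

A same-block-only check would fail for this reference, since SRM676A is not a _pd_diffractogram.id value in DIFFRACTOGRAM_0020; searching all blocks resolves it.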

rowlesmr commented 1 year ago

If you want to have a look at some other pdCIFs I made (before I really knew what I was doing), check out https://journals.iucr.org/j/issues/2022/03/00/yr5087/

rowlesmr commented 1 year ago

Data block DIFFRACTOGRAM_0020 contains a loop with the _pd_meas.2theta_scan, _pd_proc.ls_weight and other data items, but no point id data item, e.g. _pd_meas.point_id. Is this allowed?

You could make a rule that if there is one loop in a block with one diff_id, then you could autogenerate the point ids. But making too many exceptions isn't really a good thing.
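
As a rough illustration of that rule, the Python sketch below fills in sequential point ids for a loop that lacks them. The loop is modelled as a dictionary of columns, the function name is hypothetical, the choice of 1-based string ids is an assumption, and the column values are placeholders.

```python
POINT_ID = "_pd_data.point_id"


def with_point_ids(loop):
    """Return a copy of the loop with sequential point ids prepended.

    loop maps each tag to its column of values. A loop that already has
    a point id column is returned unchanged.
    """
    lengths = {len(column) for column in loop.values()}
    if len(lengths) != 1:
        raise ValueError("loop must be non-empty with equal-length columns")
    if POINT_ID in loop:
        return dict(loop)
    n_rows = lengths.pop()
    ids = [str(i) for i in range(1, n_rows + 1)]
    return {POINT_ID: ids, **loop}


# Placeholder values, not taken from the example file.
loop = {
    "_pd_meas.2theta_scan": ["a", "b", "c"],
    "_pd_meas.counts_total": ["x", "y", "z"],
}
print(with_point_ids(loop)[POINT_ID])  # ['1', '2', '3']
```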

COMCIFS / Powder_Dictionary

Create directory of example CIFs #161