Open rowlesmr opened 1 year ago
From @vaitkus
As a side-note, I noticed that some items from the powder dictionary have values that violate the current restrictions imposed by that dictionary. I will give a few examples here of the offending items, but I think we should raise a PR with this example file in the powder dictionary repository and continue the full discussion there. Note, I do not claim that the values are bad, just that they do not fit with the current dictionary definitions which may be too limiting or incorrect:
- The DIFFRN_RADIATION_WAVELENGTH loop has 4 item columns and 3 value columns (the _diffrn_radiation_wavelength.type values are missing).
- The value of _exptl_absorpt_coefficient_mu is often given with standard uncertainties (using the parenthesis notation, e.g. 17.7460(14)), although in the dictionary this item is currently defined as a Number and thus cannot have standard uncertainties. However, I find it a bit strange, since according to the human-readable definition it is calculated from other measurand items and could thus potentially have an SU. Any thoughts on this?
- Data item _pd_meas.scan_method value 'scan' is not one of the currently known enumeration values for this item. Should it be included?
- Data item _pd_calc.component_intensities_total does not seem to be currently defined in the dictionary.
> The DIFFRN_RADIATION_WAVELENGTH loop has 4 item and 3 value columns
I just forgot to put the values back. I was copying from a Cu loop, and this is a Co loop.
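A validator could flag this kind of slip by checking that the values in a loop fill whole rows. The following is a minimal sketch, assuming the loop has already been tokenised into a list of data names and a flat list of values. The function name, the .id and .wt data names, and the values are made up for illustration.

```python
def count_loop_rows(tags, values):
    """Return the number of complete rows in a loop.

    Raises ValueError when the values cannot be divided evenly among the tags.
    """
    if not tags:
        raise ValueError("a loop needs at least one data name")
    rows, leftover = divmod(len(values), len(tags))
    if leftover:
        raise ValueError(
            f"{len(values)} values do not fill whole rows of "
            f"{len(tags)} columns ({leftover} left over)"
        )
    return rows


# Illustrative data names: only .type and .value are named in this thread,
# the other two are assumptions.
tags = [
    "_diffrn_radiation_wavelength.id",
    "_diffrn_radiation_wavelength.value",
    "_diffrn_radiation_wavelength.wt",
    "_diffrn_radiation_wavelength.type",
]

# Made-up row with the .type value left out, and the same row completed.
short_row = ["1", "1.78900", "1.0"]
complete_row = short_row + ["Ka1"]
```

Calling count_loop_rows(tags, short_row) raises ValueError, while the completed row counts as one row. Note that this check only catches the problem when the total number of values is not a multiple of the number of data names; four rows of three values would slip through.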
> Data item _pd_meas.scan_method value 'scan'
It should have been 'cont'.
> Data item _pd_calc.component_intensities_total does not seem to be currently defined in the dictionary
It's currently hiding in a PR (#155)
> The value of _exptl_absorpt_coefficient_mu
This should definitely have an SU associated with it. All three of the things mentioned in the description can be refined: _exptl_crystal.density_diffrn, _atom_site.occupancy, and _diffrn_radiation_wavelength.value are all Measurand. I can do a PR on this.
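For reference, the parenthesis notation puts the SU in units of the last quoted digits, so 17.7460(14) means 17.7460 with an SU of 0.0014. Here is a minimal Python sketch of that conversion. The function name is hypothetical, and the pattern covers only plain decimal values, not the full CIF number grammar (no exponents, for instance).

```python
import re
from decimal import Decimal

# Plain decimal number, optionally followed by SU digits in parentheses.
_SU_PATTERN = re.compile(r"^([+-]?\d*\.?\d+)(?:\((\d+)\))?$")


def parse_value_with_su(text):
    """Split a string such as '17.7460(14)' into (value, su).

    The SU is None when no parenthesised digits are present.
    """
    match = _SU_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a plain number with optional SU: {text!r}")
    value = Decimal(match.group(1))
    su_digits = match.group(2)
    if su_digits is None:
        return value, None
    # The SU digits sit at the same decimal place as the last digit of the value.
    exponent = value.as_tuple().exponent
    return value, Decimal(su_digits).scaleb(exponent)


value, su = parse_value_with_su("17.7460(14)")
print(value, su)
```

Decimal is used instead of float so that the number of quoted decimal places, which determines the scale of the SU, is not lost.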
Could the directory be renamed from Examples to examples to match the name in other dictionary repositories?
Ok, I have a few more technical questions about the example:
- Data block DIFFRACTOGRAM_0020 contains a loop with the _pd_meas.2theta_scan, _pd_proc.ls_weight and other data items, but no point id data item, e.g. _pd_meas.point_id. Is this allowed?
- Some loops contain data items from the PD_PROC, PD_CALC and PD_MEAS categories. However, I guess that this is technically allowed since they are all children of a looped PD_DATA category? I might need to slightly update the validator because currently it only allows parent-child combined loops, but not sibling-combined loops.
- The _pd_qpa_external_std.diffractogram_id data item is linked to the _pd_diffractogram.id data item. In the most basic case this means that the _pd_qpa_external_std.diffractogram_id data item will have a value that matches one of the values of the _pd_diffractogram.id data item in the same data block. However, since these are powder diffraction files, I guess that multi-block interpretation starts being applied here and the _pd_diffractogram.id values are checked across all data blocks? Should this be somehow marked in the example file (e.g. by setting the appropriate _audit.schema value or including items from the AUDIT_CONFORM category)?
> Data block DIFFRACTOGRAM_0020 contains a loop with the _pd_meas.2theta_scan, _pd_proc.ls_weight and other data items, but not point id data item, e.g. _pd_meas.point_id. Is this allowed?
Technically no. I could have sworn I added it in... I'll add it in, my auto-TOPAS output doesn't include a point id.
> Some loops contain data items from PD_PROC, PD_CALC and PD_MEAS categories. However, I guess that this is technically allowed since they are all children of a looped PD_DATA category?
I'm assuming this is
loop_
_pd_meas.2theta_scan
_pd_meas.counts_total
_pd_proc.ls_weight
_pd_calc.intensity_total
_pd_proc.intensity_bkg_calc
_pd_calc.component_intensities_total
Yes, this is the intent behind having PD_PROC, PD_CALC, and PD_MEAS all as children of PD_DATA. If you have some combination of measured, processed, and/or calculated data where there is a one-to-one correspondence between data points (i.e. the same _pd_data.point_id value), then it makes sense (and saves space) to put them together in the same loop. Each row describes the same point.
> I guess that multi block interpretation starts being applied here and the _pd_diffractogram.id values are checked across all data blocks?
Yes. Powder experiments are almost always going to be multi-block, and looking for id values is kind of governed by https://github.com/COMCIFS/comcifs.github.io/blob/master/accepted/multi-block-principles.md; I've never actually used CIF in a neat one-block-is-one-experiment/structure way. I don't know how to properly denote that using _audit* data items. This also brushes up against https://github.com/COMCIFS/comcifs.github.io/blob/master/draft/block_collections.md.
What I am wanting to say with _pd_qpa_external_std.diffractogram_id SRM676A is "When I quantified the current diffractogram (DIFFRACTOGRAM_0020), I used the information from the diffractogram identified as SRM676A. When I go there, I find values of k_factor and MAC which I use in my calculations."
Is there a better way to say that? Maybe _pd_qpa_external_std.ref_diffractogram_id, which is Encode, and not Link?
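The cross-block lookup described above can be sketched in Python as follows, assuming the data blocks have already been read into a plain dict of block name to items. The function name, the block names, and that in-memory layout are assumptions for illustration only; they are not taken from any CIF library.

```python
def find_block_by_diffractogram_id(blocks, wanted_id):
    """Return the name of the single block whose _pd_diffractogram.id matches."""
    matches = [
        name
        for name, items in blocks.items()
        if items.get("_pd_diffractogram.id") == wanted_id
    ]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one block with id {wanted_id!r}, found {len(matches)}"
        )
    return matches[0]


# Hypothetical two-block collection; block names and layout are made up.
blocks = {
    "DIFFRACTOGRAM_0020": {
        "_pd_diffractogram.id": "DIFFRACTOGRAM_0020",
        "_pd_qpa_external_std.diffractogram_id": "SRM676A",
    },
    "external_standard": {"_pd_diffractogram.id": "SRM676A"},
}

reference = blocks["DIFFRACTOGRAM_0020"]["_pd_qpa_external_std.diffractogram_id"]
standard_block = find_block_by_diffractogram_id(blocks, reference)
```

Here the search runs over every block in the collection rather than only the current one, which is the multi-block behaviour being asked about. Raising when zero or several blocks match keeps an ambiguous reference from being resolved silently.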
If you want to have a look at some other pdCIFs I made (before I really knew what I was doing), check out https://journals.iucr.org/j/issues/2022/03/00/yr5087/
> Data block DIFFRACTOGRAM_0020 contains a loop with the _pd_meas.2theta_scan, _pd_proc.ls_weight and other data items, but not point id data item, e.g. _pd_meas.point_id. Is this allowed?
You could make a rule that if there is one loop in a block with one diff_id, then you could autogenerate the point ids. But making too many exceptions isn't really a good thing.
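If such a rule were adopted, the autogeneration itself would be simple. Here is a minimal Python sketch, assuming a loop is held as a list of data names plus a list of rows. The function name, the sequential "1", "2", ... numbering, the choice of _pd_data.point_id as the generated item, and the sample values are all assumptions.

```python
def add_point_ids(tags, rows, id_tag="_pd_data.point_id"):
    """Return (tags, rows) with a sequential point id column prepended.

    The loop is returned unchanged if it already carries the id column.
    """
    if id_tag in tags:
        return list(tags), [list(row) for row in rows]
    new_tags = [id_tag] + list(tags)
    new_rows = [[str(number)] + list(row) for number, row in enumerate(rows, start=1)]
    return new_tags, new_rows


# Made-up two-point loop without a point id column.
tags = ["_pd_meas.2theta_scan", "_pd_meas.counts_total"]
rows = [["10.00", "153"], ["10.02", "161"]]

new_tags, new_rows = add_point_ids(tags, rows)
```

After the call, new_tags starts with _pd_data.point_id and each row starts with its 1-based position in the loop.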
From https://github.com/COMCIFS/cif_core/pull/430#issuecomment-1605442143
Full CIF file examples of various concepts.
Initial commit for QPA by external standard.