COMCIFS / Powder_Dictionary

CIF definitions for powder diffraction

update the QPA categories #89

Closed rowlesmr closed 1 year ago

rowlesmr commented 1 year ago

I started writing up how to report QPA, and realised I had overthought things.

Key changes:


Possible other considerations for the future:

jamesrhester commented 1 year ago

Something that we endeavour to do in CIF is to make sure that the way in which a value is used is not changed by the values of other data names. An extreme example would be a data name called _pd.value and another data name _pd.flag with alternative values like absorption, monitor or version, with the interpretation of _pd.value depending on the value of _pd.flag - it could be the absorption coefficient, the monitor counts to scale to, or the version of the software used.

What is OK is for the derivation of the value to be different: so _refln.F_meas for powder is derived in a completely different way to _refln.F_meas for single crystal, and that is OK, as the values are used in the same way e.g. to create a difference density map.

So I'm concerned that the PD_QPA_CALIB_FACTOR category has one of these magic values, but if the data are treated in identical fashion regardless of the origin of the value then it is not a problem. Can you explain @rowlesmr ?

rowlesmr commented 1 year ago

Probably not directly answering your question, but I'm addressing what I'm trying to do.

As you say here, I'd like to include information that allows the quantification to be rederived from values in the collection.

Looking at the ZMV formalism, this requires the scale factor. This isn't currently recorded anywhere, and depends on phase, diffractogram, and the exact analysis program and version used. In the above nomenclature, the inverse of a phase's ZMV is _pd_qpa_calib_factor.value, the scale factor is _pd_qpa_intensity_factor.value, and the algorithm is given by _pd_qpa_overall.method.

For RIR, this requires a peak intensity, sum of peak intensities, or a scale-type factor. Again, this isn't currently recorded anywhere, and depends on phase, diffractogram, and the exact analysis program and version used. In the above nomenclature, a phase's RIR is _pd_qpa_calib_factor.value, the peak intensity is _pd_qpa_intensity_factor.value, and the algorithm is given by _pd_qpa_overall.method.

For PONKCS, this is a little different. The peak intensities assigned to the phase, when combined with a 'synthetic' ZMV-like value and a scale factor, can give you quant. In this case, the peak intensities have a meaningful value only when combined with the synthetic ZMV, but they aren't F_squared_meas values, or counts, or intensities. They can also exist as reflections or arbitrary peaks. In the above nomenclature, the inverse of a phase's 'synthetic' ZMV is _pd_qpa_calib_factor.value, the scale factor is _pd_qpa_intensity_factor.value, and the algorithm is given by _pd_qpa_overall.method.

The same goes with absorption-diffraction; the calibration value depends on the exact intensities used, and so we need to ensure that the properly calibrated intensities are either recorded, or some _pd_qpa_intensity_factor.value-like data item can capture the actual value used in the quantification.


It is trivial to make _pd_qpa_calib_factor.____ data items for the various methods, which then breaks the link on how a data item is interpreted. I think that _pd_qpa_intensity_factor.value can be defined in such a way that it is always treated the same way with respect to any _pd_qpa_calib_factor.____ data item. This just leaves _pd_qpa_overall.method to define how to do the calculation.

rowlesmr commented 1 year ago

So I'm concerned that the PD_QPA_CALIB_FACTOR category has one of these magic values, but if the data are treated in identical fashion regardless of the origin of the value then it is not a problem. Can you explain @rowlesmr ?

Here's a summary of how it currently works in this PR:

Categories and data items

PD_QPA_CALIB_FACTOR (Set, keyed on _pd_qpa_calib_factor.phase_id). The relevant data item is _pd_qpa_calib_factor.value. (The .value acts like the RIR value.)

PD_QPA_INTENSITY_FACTOR (Loop, keyed on _pd_qpa_intensity_factor.phase_id and _pd_qpa_intensity_factor.diffractogram_id). The relevant data item is _pd_qpa_intensity_factor.value. The .value acts like the peak intensity, or the Rietveld scale factor; the 'thing' which is acted on by the calibration factor.

PD_QPA_OVERALL (Set, keyed on _pd_qpa_overall.diffractogram_id). The relevant data item is _pd_qpa_overall.method.

How it's supposed to work

A given diffractogram is marked as being quantified by having _pd_phase_mass.percent values. This is enough to say it has been quantified, but gives no indication as to how this was done. A value can be assigned to _pd_qpa_overall.method, saying how it was quantified. The enumeration was taken from §3.9, Vol H.

The various methods are:

where W is the weight fraction, p represents the p-th phase, I or S is the intensity or scale factor used to quantify that phase, P is the total number of phases, and μ*_m is the mass absorption coefficient of the entire specimen. C_p is the calibration factor which puts the intensities/scale factors of the constituent phases onto a common scale to allow for quantification.

I_p and S_p would be recorded using _pd_qpa_intensity_factor.value.

The definition of C_p changes, depending on the _pd_qpa_overall.method.

The various definitions are:

Conclusions

In all cases, _pd_qpa_intensity_factor.value is divided by _pd_qpa_calib_factor.value. This result is then treated in varying ways according to the QPA method in order to arrive at the final quant answer.
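The shared first step can be sketched in Python. This is only a minimal sketch: the function name `weight_fractions`, the dictionary layout and the numbers are illustrative assumptions, not part of the dictionary. The final normalisation assumes a method in which every phase is quantified and the fractions sum to 1; other methods would treat the ratios differently, as indicated by _pd_qpa_overall.method.

```python
def weight_fractions(intensity_factor, calib_factor):
    """Divide each phase's intensity factor by its calibration factor,
    then normalise the ratios so that they sum to 1.

    Both arguments are dicts keyed on a phase id (hypothetical layout).
    """
    # Common step for every method: intensity factor / calibration factor.
    ratios = {
        phase: intensity_factor[phase] / calib_factor[phase]
        for phase in intensity_factor
    }
    # Method-specific step (assumed here): normalise over all phases.
    total = sum(ratios.values())
    return {phase: ratio / total for phase, ratio in ratios.items()}


# Hypothetical values, standing in for _pd_qpa_intensity_factor.value
# and _pd_qpa_calib_factor.value for two phases in one diffractogram.
intensity = {"phase_a": 2.0, "phase_b": 1.0}
calib = {"phase_a": 4.0, "phase_b": 1.0}
fractions = weight_fractions(intensity, calib)
```

For a ZMV-type analysis the calibration factor would be the inverse of the phase's ZMV, so the division is equivalent to multiplying the scale factor by ZMV.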

If you want _pd_qpa_calib_factor.value to have a unique definition (ie as given in the second list, which I think is what you're after), then we'll need one data item per method. If you want _pd_qpa_calib_factor.value to have a unique way of being used, then I've shown that that is the case.

Epilogue

But when doing quant on a diffractogram, the _pd_qpa_calib_factor.value for each phase must be of the same type (except that you can mix ZMV and PONKCS).
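That constraint could be checked along these lines. This is a sketch under assumptions: the type labels ("ZMV", "PONKCS", "RIR") and the function name `calib_types_consistent` are hypothetical, and are not enumeration values defined in the dictionary.

```python
# Calibration-factor types that may be mixed within one diffractogram
# (hypothetical labels).
MIXABLE = {"ZMV", "PONKCS"}


def calib_types_consistent(types_by_phase):
    """Return True if all phases use one calibration-factor type,
    or if the only types present are ZMV and PONKCS."""
    kinds = set(types_by_phase.values())
    return len(kinds) <= 1 or kinds <= MIXABLE


ok_mixed = calib_types_consistent({"phase_a": "ZMV", "phase_b": "PONKCS"})
bad_mixed = calib_types_consistent({"phase_a": "RIR", "phase_b": "ZMV"})
```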

So, after writing all of that out, I think that there should be many data items in PD_QPA_CALIB_FACTOR, each one corresponding to a QPA method. _pd_qpa_overall.method still informs the user as to how the QPA was done, and how to combine the values, but not what the individual values mean (hopefully, you can't confuse, for example, RIR values with PONKCS values). I still think that a single _pd_qpa_intensity_factor.value data item is OK, as this value will be quite dependent on the analysis program used and what normalisation constants and such are used. As long as they are consistent, you're OK, but transferring values between TOPAS, GSASII, FullProf, Rietan... will end in tears, if trying to directly recalculate the fit.

jamesrhester commented 1 year ago

I still think that a single _pd_qpa_intensity_factor.value data item is OK, as this value will be quite dependent on the analysis program used and what normalisation constants and such are used. As long as they are consistent, you're OK, but transferring values between TOPAS, GSASII, FullProf, Rietan... will end in tears, if trying to directly recalculate the fit.

I like your analysis above, which shows very clearly that the values are used in the same way, even if they are derived in different ways, and so I agree that a single data name _pd_qpa_intensity_factor.value is appropriate.

jamesrhester commented 1 year ago

Happy for this to be merged after that single query on "CIF container" is sorted.