COMCIFS / Powder_Dictionary

CIF definitions for powder diffraction
Create a new _units.code for pdCIF intensities #25

Status: Closed (jamesrhester closed 1 year ago)

jamesrhester commented 1 year ago

pdCIF distinguishes between "counts" (for which the square root gives the SU) and "intensities" (where some sort of scaling/merging/processing has been done, or the observations are not counts). A pdCIF data name is even available to separately provide the "units of intensity". DDLm, however, would like to attach a _units.code value to each definition that includes intensities. As far as I can tell, something like "intensity" might actually work as the _units.code value - as this conveys that the value is proportional to counts. Other options include "custom" or "arbitrary". So the definition (in _templ_enum.cif) might be "intensity: a value proportional to counts per second". We would need to enhance our definition of _units.code to note that any items having units with an unspecified proportionality constant have the same notional constant for all values in the same loop.

I'm particularly interested to hear what Brian T has to say about this as clearly this has been thought about during creation of the original pdCIF.
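As a minimal sketch of the counts/intensity distinction described above: for raw counts the SU is taken as the square root, but once a value has been scaled that rule no longer applies to the scaled number. The function names and the scale factor `k` are hypothetical, chosen only for illustration; nothing here is taken from pdCIF software.

```python
import math

def su_from_counts(counts):
    """Poisson estimate: the SU of a raw count is its square root."""
    return math.sqrt(counts)

def scale_counts(counts, k):
    """Turn counts into an 'intensity' k * counts and propagate the SU."""
    return k * counts, k * math.sqrt(counts)

counts = 400
k = 0.25  # hypothetical, unspecified-in-practice proportionality constant

intensity, su = scale_counts(counts, k)

# Applying the square-root rule to the scaled value gives a different
# (wrong) answer, which is why counts and intensities are kept apart.
naive_su = math.sqrt(intensity)

print(su_from_counts(counts), intensity, su, naive_su)
```

Here the propagated SU of the intensity differs from the square root of the intensity, so a reader cannot recover the SU from an intensity value alone.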

briantoby commented 1 year ago

I am not sure how to speak to this. Where intensities are scaled counts, inclusion of a scaling factor certainly makes sense, but my guess is that where an instrument takes counts and scales them to report data as intensities, the scaling factor will vary across the pattern to account for differing numbers of detectors, pixel counts, detector efficiency, etc. so the single scalar value in _units.code probably is of little value. If _units.code had existed when pdCIF had been created, perhaps I would not have created the _counts data item and would have used _units.code to distinguish between intensities and counts, but that ship has sailed.
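A small sketch of the point about a varying scale factor, assuming (purely for illustration) that each point in the pattern is the average over however many detectors contributed to it. The data values and the `merge` helper are made up.

```python
import math

# (total counts summed over detectors, number of contributing detectors)
# Hypothetical values for three points in a pattern.
points = [(900, 1), (1600, 4), (400, 4)]

def merge(total_counts, n_detectors):
    """Average the summed counts over detectors and propagate the SU."""
    scale = 1.0 / n_detectors
    intensity = scale * total_counts
    su = scale * math.sqrt(total_counts)
    return intensity, su, scale

merged = [merge(c, n) for c, n in points]
scales = [m[2] for m in merged]

for intensity, su, scale in merged:
    print(f"intensity={intensity} su={su} scale={scale}")
```

Because `scales` holds more than one distinct value, no single scalar describes how the reported intensities relate to counts across the whole pattern.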

I have mixed feelings about use of _units.code to indicate counts, cps, arbitrary,… as how does one interpret an intensity of _counts where _units.code is anything other than counts? My feeling is that should be considered as an error.

Brian

(In reply to James Hester's opening comment, sent Sep 28, 2022, 5:22 AM.)

rowlesmr commented 1 year ago

I read this as _pd_meas.counts_* should have a _units.code of counts, and _pd_proc.intensity_* should have a _units.code of intensity, where "intensity" is proportional to the number of photons detected and the scalar is not necessarily the same value for each data point.

Would we need a specific unit for _pd_proc.intensity_norm and _pd_proc.intensity_incident?

Also, I just noticed an inconsistency in the formatting of templ_enum.cif: 'photons per second' "photons registered in one second" with no '_' in the unit.

jamesrhester commented 1 year ago

@briantoby

> I am not sure how to speak to this. Where intensities are scaled counts, inclusion of a scaling factor certainly makes sense, but my guess is that where an instrument takes counts and scales them to report data as intensities, the scaling factor will vary across the pattern to account for differing numbers of detectors, pixel counts, detector efficiency, etc. so the single scalar value in _units.code probably is of little value. If _units.code had existed when pdCIF had been created, perhaps I would not have created the _counts data item and would have used _units.code to distinguish between intensities and counts, but that ship has sailed.
>
> I have mixed feelings about use of _units.code to indicate counts, cps, arbitrary,… as how does one interpret an intensity of _counts where _units.code is anything other than counts? My feeling is that should be considered as an error.

I think I wasn't clear enough here. _units.code is specified for a given data name in the dictionary, so it can't be changed on a per-data-file basis, and I wasn't proposing to provide a particular scaling value. I'm just wondering what the _units.code should be for the *_intensity type data names - but I think I'm happy for it to be none (see below).

@rowlesmr:

> Also, I just noticed an inconsistency in the formatting of templ_enum.cif: 'photons per second' "photons registered in one second" with no '_' in the unit.

If you like you can raise an issue in cif core, where I will comment separately (there are reasons).

> Would we need a specific unit for _pd_proc.intensity_norm and _pd_proc.intensity_incident?

No. The idea is that units of "intensity" indicate only that the quantity is supposed to be proportional to true counts per second, with constant of proportionality unspecified and constant only for the same column in a loop (indeed I was wrong to suggest constant for the whole loop).
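To illustrate that reading of "intensity", here is a rough sketch in which two columns of the same loop each carry their own unknown constant. The column contents and the constants `K_A` and `K_B` are hypothetical; in a real data file they would not be known.

```python
# Hypothetical true count rates (counts per second) at three points.
true_rates = [10.0, 20.0, 40.0]

# Each column has its own proportionality constant (assumed values).
K_A, K_B = 2.0, 0.5
col_a = [K_A * r for r in true_rates]
col_b = [K_B * r for r in true_rates]

def ratios_to_first(column):
    """Ratios of each value to the first value in the same column."""
    return [v / column[0] for v in column]

# Within a column the unknown constant cancels, so the ratios match
# the ratios of the true rates.
within_a = ratios_to_first(col_a)
within_b = ratios_to_first(col_b)

# Across columns only K_A / K_B survives, which carries no information
# about the rates themselves.
cross = [a / b for a, b in zip(col_a, col_b)]

print(within_a, within_b, cross)
```

Ratios taken within one column are meaningful; comparisons between columns are not unless the two constants happen to be known.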

I've just checked cif_core (perhaps should have done that earlier). This issue may be moot as the core dictionary simply uses "none" for the units of intensity-type data names. I'm happy to go with that if nobody has any particular objections.