Closed phyy-nx closed 8 months ago
I purport this doesn't need a vote as it doesn't change functionality, and the change to NXmx only clarifies functionality that was already there. Feedback welcome!
@phyy-nx
The elements in data are usually float values really. For efficiency reasons these are usually stored as integers after scaling with a scale factor. This value is the scale factor. It is required to get the actual physical value, when necessary.
The original description for the gain sounds like stored_integer_value = physical_value * the_scale_factor
, making the definition of the gain the same as in MX (i.e. value per photon). But your new description is the other way round. Did you check the interpretation at the NIAC?
I know you confirmed the order of the gain and the offset, but I don't know if the definition of the gain was confirmed. This post is just to double check it.
Ack, I think you might be right and I misinterpreted this field!
The elements in data are usually float values really. For efficiency reasons these are usually stored as integers after scaling with a scale factor. This value is the scale factor. It is required to get the actual physical value, when necessary.
So I think I now agree this is saying the stored value has already had the scaling factor applied. But that reads differently from offset:
An optional offset to apply to the values in data.
That implies the offset hasn't been implied yet!
So, @nexusformat/developers, how do you use scaling_factor and offset in NXdata, if at all?
Bump! @nexusformat/developers :)
How do you use scaling_factor and offset in NXdata, if at all?
I don't use these fields but it seems that if you want to document the stored values then stored_value = (physical_value - offset) * scaling_factor
may be best for the pedestal use but other people want to use offset
as a bias: stored_value = physical_value * scaling_factor + offset
. So two more interpretations to choose from!
I have always interpreted these fields as things that need to be done to the stored value
in order to get to the physical value
. I would do this as:
physical value = stored value * scaling_factor + offset
I second https://github.com/nexusformat/definitions/issues/1333#issuecomment-1864730138 by @benajamin with the exception that I always think about stored value vs. plotted value (meaning the coordinate in the plotting coordinate system) since NXdata is meant to represent "data to be plotted"
plotted value = stored value * scaling_factor + offset
Ok I like switching the discussion to "stored" vs. "plotted". Based on that, just to recap without math, there are two general ways these can be interpreted:
scaling_factor
and offset
before you plot the datascaling_factor
and offset
for you, and I'm noting the values I usedI hope the answer is 1. to match @biochem-fan's use case, but it seems to line up with some of the comments above.
I do note that both @benajamin and @woutdenolf switched the order of operations to match the "bias" version that @PeterC-DLS suggested, compared to the pedestal version that I originally suggested:
a. plotted_value = physical_value * scaling_factor + offset
b. plotted_value = (physical_value + offset) * scaling_factor
a. makes more intuitive sense to me as it more closely matches the equation of a line y=mx+b
. But the consequence is that for our MX data that we need to store "gain-corrected" pedestals... But that's not relevant for @biochem-fan's use case (no pedestal I believe), so maybe a. is fine?
I've made the change to use the "plotted" nomenclature for NXdata, but I've kept the pedestal formula for now, until we get a bit more discussion here (plotted_value = (physical_value + offset) * scaling_factor
). Mainly because it's I'm coming from an MX background where pedestal is applied first in the use cases I know about. But I can be convinced otherwise! Especially if folks already have code with the other convention (plotted_value = physical_value * scaling_factor + offset
)
LGTM. For clarity's sake in your comments, I think stored_value
is better than physical_value
as you may be plotting the "physical" value.
Feedback from Telco addressed. If there's no more comments we'll send this to a vote.
Hello, please vote by providing an emoji on this comment. Thanks.
Not happy with the term pedestal
in this context. Is it the same as offset
?
@prjemian pedestal
, used only for NXmx, does mean offset, and I assumed it was a known term in x-ray diffraction land. I could be convinced that gain and pedestal should be defined more clearly, but that could be done in a separate, not-voted on PR, since it would only be a documentation clarification.
Since offset
is the term that has been used with NXdata, let's not change it in this PR.
Wait it's not being changed in the PR. For NXdata it's FIELDNAME_scaling_factor
and FIELDNAME_offset
, and for NXmx is data_scaling_factor
and data_offset
.
Pedestal is just a documentation term. Does that help?
Depending on the field, various terms are used: "offset", "pedestal" and "bias". For example, ThermoFisher's electron detectors use "bias".
Perhaps keeping "offset" for NXdata is a good idea but I would prefer to have other term mentioned as well in the NXmx documentation for better searchability.
Should reserved suffixes be updated too?
Yes
On Wed, Jan 31, 2024, 3:35 AM Peter Chang @.***> wrote:
Should reserved suffixes https://manual.nexusformat.org/datarules.html#reserved-suffixes be updated too?
— Reply to this email directly, view it on GitHub https://github.com/nexusformat/definitions/pull/1333#issuecomment-1918726950, or unsubscribe https://github.com/notifications/unsubscribe-auth/AARMUMAQXZF42W4M7HS25JLYRIF6LAVCNFSM6AAAAAA7VKHPNOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMJYG4ZDMOJVGA . You are receiving this because you were mentioned.Message ID: @.***>
Happy to modify reserved suffixes in a different PR after this vote.
LGTM :smile:
Vote did not pass (got 12 votes, needed 13 for quorum), which is fine given all the discussion. For the sake of clarity I'm closing this PR, removing it and the associated issue from the milestone, and I will make a new PR that addresses the feedback here. We'll try again then.
Closes #1332
Also adds clarification in NXmx that these fields can be used as pedestal and gain correction fields, as well as elaborates on the possible rank options. These rank options were implied (in my opinion) in the original wording, but in NXmx I made it more explicit.
Tagging @biochem-fan for reference