`reader2D_converter` error calculations are proper sketchy

SasView / sasdata

Package for loading and handling SAS data

BSD 3-Clause "New" or "Revised" License

1 stars 2 forks source link

`reader2D_converter` error calculations are proper sketchy #45

Open lucas-wilkins opened 1 year ago

lucas-wilkins commented 1 year ago

@ehewins was asking about this code, which appears to assume that errors are derived from shot noise.

    if data2d.err_data is None or np.any(data2d.err_data <= 0):
        new_err_data = np.sqrt(np.abs(new_data))

Looking at the use cases, such as loading .tiff files, shot noise seems like a pretty unsound assumption. Is there anything wrong with just not specifying errors when they are not known?

butlerpd commented 1 year ago

Humm... Don't know this bit of code but in principle I am a very strong proponent for not inventing something that is not in the data. That just seems wrong?

However, if this is about the image converter (you mention TIFF?) that might be because it is meant to deal with raw data from lab SAXS? There was some discussion years ago about that and I don't remember them fully, though @smk78 may remember more? The gist as I recall was that we were getting a lot of pleas from lab SAXS users who had tiff files and no way to easily process their data (don't ask me why) to provide a way to get their image into SasView. They also then like to use the data operations to "reduce" their data I believe?

At the very least I would think a popup option in the converter asking the user if that is what is expected?

lucas-wilkins commented 1 year ago

There's no way that the chosen errors even remotely represent the noise, is there?

lucas-wilkins commented 1 year ago

@timsnow might also have some thoughts.

butlerpd commented 1 year ago

? If the images is of raw counts as collected from the detector then would that not in fact be the uncertainty? I believe that is what all current reduction programs use.

Though you are starting to open a can of worms because as far as I know, almost all reduced data (processed) only contains those (counting statistics) errors appropriately propagated but ignore all of the many other systematics that contribute to the processed data -- but guessing that is not the question here?

lucas-wilkins commented 1 year ago

If they were raw counts, then that would indeed be the right formula for shot noise.

I thought SAXS was dominated by other uncertainties though.

butlerpd commented 1 year ago

🤐

lucas-wilkins commented 1 year ago

In any case, I don't think it's right to assume that all images are of raw counts, and even if it was, the thing that makes that assumption should probably live in the image reader, not a "data converter" in a general location.

butlerpd commented 1 year ago

I agree - we should not be making random assumptions about data we are ingesting. That is why I suggested it be an option that the user selects somehow.

However, I think there should be input from people with more knowledge about this. There may be reasons it was done that way (maybe it is clear somewhere that the feature is ONLY for raw counts?) Unfortunately @krzywon is out through next week. I think @gonzalezma may have been involved in writing part of that code? and, as I say, I think @smk78 was at least partially responsible for asking for the convert features and worked with those who were implementing?