DiamondLightSource / httomo

High-throughput tomography pipeline
https://diamondlightsource.github.io/httomo/
BSD 3-Clause "New" or "Revised" License
4 stars 3 forks source link

Investigate how to tackle cropping timestamps in raw data #369

Open yousefmoazzam opened 2 weeks ago

yousefmoazzam commented 2 weeks ago

Some beamlines collect data such that there is a timestamp in the top left corner of projections. Leaving in this information can cause the processing of methods to behave unexpectedly (for example, incorrect normalisation performed), so it is best to crop out this region by cropping the detector_y dimension in some way.

This issue serves as a discussion place for how to go about doing this in a way that users would find easy.

yousefmoazzam commented 2 weeks ago

One simple idea mentioned in a Slack thread by myself:

I'd lean towards keeping it simple for now: have a loader template for data that can be copy+pasted for cropping out such a timestamp, or have the users enter it themselves. Anything being done "behind the scenes" feels like it's going to require some knowledge of which data is given to httomo (ie, if the data has the timestamp or not), and that sounds not very nice to be checking stuff like "is this I12 data -> it has timestamp, so crop automatically"

dkazanc commented 2 weeks ago

The timestamps contain either zeros/ones or the values close to the limit of the data type (65535). In normalisation we seem to avoid division by zero, but then remove_nans should be always enabled to remove the consequences of log(0) or log(-value). Currently this is set to false, may be we should change it to true by default?

We might also want to alert the user that the data contains "too many" zeroes of values close to the precision limit. May be counting those elements in raw data won't be a big burden?

dkazanc commented 2 weeks ago

quick follow-up. May be we shouldn't terminate a job but warn the user of the percentage of those values in the raw data. I'd say the levels higher than 5% should be alarming.

jessicavers commented 1 week ago

You may have already discussed this further, or I may be unaware of other details, so you may not need to reply. Relating to your comment about a loader template:

have a loader template for data that can be copy+pasted for cropping out such a timestamp, or have the users enter it themselves

Would you mean using the standard loader with an alternative preview value? Or standard loader with an additional parameter to specify the exclusion on time stamp? or a slightly different loader?

If it was the extra parameter suggestion, would it be something like this:

'crop_timestamp' This parameter will crop the 10 data points collected across the whole of one axis. By default, the axis chosen is [.,*,.]

Relating to Daniil's comment:

We might also want to alert the user that the data contains "too many" zeroes of values close to the precision limit.

It does sound like a piece of feedback the user would want to be aware of if they weren't already.

yousefmoazzam commented 1 week ago

You may have already discussed this further, or I may be unaware of other details, so you may not need to reply. Relating to your comment about a loader template:

have a loader template for data that can be copy+pasted for cropping out such a timestamp, or have the users enter it themselves

Would you mean using the standard loader with an alternative preview value? Or standard loader with an additional parameter to specify the exclusion on time stamp? or a slightly different loader?

If it was the extra parameter suggestion, would it be something like this:

'crop_timestamp' This parameter will crop the 10 data points collected across the whole of one axis. By default, the axis chosen is [.,*,.]

@jessicavers I was meaning using the standard loader with a value for the preview parameter set to the desired value for a specific beamline. For example, for I12 data that needs the timestamp cropped out, have a loader template with the standard loader that has a preview value filled in which we know crops the timestamp out, like

preview:
detector_y:
start: 30

The suggestion was trying to provide the "simplest" solution in that it doesn't do anything clever, nor does it require a new feature or code changes (like an additional parameter or a different loader would). Simply a pre-written loader config for a beamline that can be copy+pasted.

dkazanc commented 1 week ago

I think we shouldn't really do anything in addition from the loader point of view. We just need to make sure that the pipeline won't be exited based on some nan's / inf's in some pixels. I think previously it was due to rescale_to_int raising an error, which is fixed now. The warning to the users on some issues like that is useful though but requires a different implementation. I'll create a different issue. @yousefmoazzam you created a similar issue somewhere, sorry cannot find it.

yousefmoazzam commented 1 week ago

The warning to the users on some issues like that is useful though but requires a different implementation. I'll create a different issue. @yousefmoazzam you created a similar issue somewhere, sorry cannot find it.

Possibly #267, though I'm not 100% sure if that's the one you mean?

jessicavers commented 1 week ago

Thank you for clarifying A loader template, using the standard loader, with a specific preview value.

A further suggestion, maybe including a message/recommendation within the verbose comment for the preview parameter? Drawbacks would be that it adds extra text. I'm also not sure that many users would know that -vv exists or would notice.

Or the possibility of a warning message, "I12 users should be aware that there may be a timestamp present. If so, we recommend preview value: " - perhaps slightly long.

jessicavers commented 1 week ago

Okay, yes, that sounds good