mantidproject / mslice

Source code for Mantid MSlice
http://mantidproject.github.io/mslice
1 stars 2 forks source link

Cut algorithm "Integration" not correct where there is no detector coverage #835

Open mducle opened 1 year ago

mducle commented 1 year ago

Describe the bug

Originally when computing a 1D cut, for each bin in the cut axis, it would average the counts in the integration axis (the "Rebin" algorithm). Recently an addition algorithm ("Integration") which summed the counts in the integration axis was introduced.

In general where there are no detector gaps, the two algorithms will produce the same curves except with a scaling factor depending on the bin size. However, where there is a detector gap (e.g. where there is no data for some range of the integration axis for a given cut-axis-bin) then the two curves will be qualitatively different, with the average-counts ("Rebin") algorithm effectively extrapolating data from cut-axes-bins where there is data, whilst the sum-counts ("Integration") algorithm not accounting for the missing data at all, only summing the integration axis where there is data.

However, the current approach with the "Integration" algorithm means that the counts in cut-axis bins which span regions in the integration axes bins that have no data effectively under-count neutrons compared to other cut-axis bins:

image

The horizontal axis is the cut-axis; the vertical axis is the integration axis. The bottom row is the output 1D-cut, which sums the counts in each column (along the integration axis). Blue pixels are where there is data, white where there is no data which could be due to physical restrictions or detector gaps.

Pixels coloured green are where data exists for every integration-axis bin for a particular cut-axis bin. Pixels coloured yellow are where there are some gaps (denoted by white). In principle the yellow pixels are under-counted compared to neighbouring blue pixels.

At the moment, MSlice returns the full row, including both green and yellow pixels, but the yellow pixels (which are not particularly highlighted to users) could give a misleading picture to users and comparing green and yellow pixels is not strictly valid.

Possible solutions

Some possible solutions with advantages and drawbacks are:

  1. Change the default behaviour to exclude yellow pixels - they will be marked as NaN leaving gaps in the output 1D cut but ensuring that all bins within the cut is strictly valid to be compared with other bins. a. Pro: the output is transparent to the user - they can trust that they are receiving valid cuts. b. Con: breaks backwards compatibility (cuts using this new default will not agree with previous versions), but without raising this - users would need to inspect the result and compare to previous versions (or read the changelog) to know about the changes

  2. Change the default behaviour such that if an integration range includes detector gaps to return an error, with a suggestion for new ranges. a. Pro: Notifies the user of the changes, and the reason for it and allows them to change their behaviour. If there is an option to override the error, the previous default behaviour can be recovered b. Con: Previously working scripts might now error out.

  3. Add a new option to either exclude yellow pixels or error but leave the current default behaviour as is: a. Pro: Does not break existing scripts / is backwards compatible b. Con: Can give users incorrect results, because includes bins which strictly are not comparable.

  4. Change the default behaviour as per 1) or 2) but add an option to override to get the previous default - and to note this in an error or warning message. a. Pro: as per 1 or 2 but with benefit of allowing old scripts to still give consistent result. b. Con: Users can still get incorrect results if they insist.

Discussion point

Notes

Original reporter: @mincentatties Need comments from: @mincentatties @davidvoneshen

mincentatties commented 1 year ago

Thanks for this Duc Happy to discuss this further

R

davidvoneshen commented 1 year ago

This should be closed. The integration routine is integrating the available data. Here we have a problem with the skew in detector trajectories introduced when we convert to Q from 2theta. The solution is to work in 2theta space where you have not introduced the skewing. This is already supported.

Fundamentally, you need to have some idea of the object you are interacting with and the routines being used. It was (and perhaps remains) an entirely valid complaint that it wasn't clear what the routine in mslice was. I think that has been addressed but if it is felt to be unclear then yes we should discuss how we can tidy it up. I am also in favour of keeping track of units so that routines return the most helpful information however that is not an mslice issue, rather a whole workflow one.

Regarding the specific suggestions for improvement, if we are not to close this (as I really believe we should) only no3 should be considered. It avoids breaking backward compatibility. If discarding bins with Nans were chosen then for some instruments a very significant quantity of data would be discarded. It should not be the default. Furthermore, I am strongly against adding additional bits onto the mslice GUI and would advocate for it to be accessible only through the CLI. The mslice GUI is becoming more cluttered and we need to try and keep it as clean as we can for the benefit of new users.

Regarding the discussion points

mincentatties commented 1 year ago

My preferred option is number 1.

Integration and rebinning should be available to the user as options, and I agree that rebinning should be the default. However, it should not be possible for mslice to return incorrect results with either option.

  1. If the backwards compatibility is only broken because a bug has been removed – then that, for me, is a valid reason for breaking backwards compatibility.
  2. Cutting in 2 theta is different from cutting in Q. And – largely – it is meaningless, since 2 theta is arbitrary and contains no sample information.
  3. We should – of course – discard data which integrates over NaNs. This becomes increasingly important where there are a lot of gaps – like on MAPS.
  4. For the discussion points: happy to discuss these further. But interpolation is generally the enemy of accurate experimental uncertainties.
mducle commented 1 year ago

@mincentatties It would be helpful to ground the discussion if you could maybe provide some data files and example cuts and what features of the cuts are incorrect to you - is it entirely the edges of the cuts due to kinematic constraints or "dips" or other things in the middle of the cuts due to detector gaps (I know I used gaps for both in the original issue, sorry).

robertapplin commented 1 year ago

The decision from the meeting is that we should add an option to be able to "exclude the yellow pixels". This option will be turned off by default for the Rebin algorithm, and will be turned on by default for the Integration algorithm.

The current results output from the Integration algorithm with this above problem are incorrect, and so any concerns breaking backwards compatibility are negated.

robertapplin commented 1 year ago

I believe the problem is seen in this screenshot:

image