mantidproject / mantid

Main repository for Mantid code
https://www.mantidproject.org
GNU General Public License v3.0

Incorrect log filtering when using Plus on ISIS event data #36194

Open mducle opened 11 months ago

mducle commented 11 months ago

Original reporter: Ross and Christian / ISIS

Describe the bug

When two event files/workspaces are added together using the Plus algorithm, the filter for the first workspace is used for the results/output workspace. This means that the sample log data for all subsequent workspaces in the sum is "masked" out.

It also means that ws.getRun().getLogData(blockname).filtered_value returns only the log data of the first workspace in the sums list.
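To illustrate the symptom, here is a toy model of time-series log filtering (invented data and a deliberately simplified filter representation, not Mantid's actual TimeSeriesProperty internals): if the summed workspace keeps only the first workspace's filter interval, every log entry recorded during the second run is masked out of the filtered values.

```python
# Toy model of time-series log filtering (illustrative only, not Mantid internals).
# Each run contributes (time, value) log entries and a "beam on" filter interval.
run1_log = [(0, 60.0), (10, 61.0), (20, 62.0)]   # (time_s, temperature_K)
run2_log = [(30, 70.0), (40, 71.0), (50, 72.0)]
run1_filter = (0, 25)    # beam-on interval for run 1
run2_filter = (30, 55)   # beam-on interval for run 2

def filtered_values(log, intervals):
    """Keep values whose timestamps fall inside any (start, stop) interval."""
    return [v for t, v in log if any(a <= t <= b for a, b in intervals)]

combined_log = run1_log + run2_log

# Correct behaviour: the summed workspace keeps both runs' filters.
good = filtered_values(combined_log, [run1_filter, run2_filter])
# Buggy behaviour: only the first workspace's filter survives the Plus,
# so all of run 2's log entries are dropped.
bad = filtered_values(combined_log, [run1_filter])

print(good)  # [60.0, 61.0, 62.0, 70.0, 71.0, 72.0]
print(bad)   # [60.0, 61.0, 62.0]
```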

To Reproduce

Expected behavior

Plus should add the filtered sample logs for all workspaces, keeping the filtering for each.

Additional context

LET would like to follow spectroscopic features as a function of a log parameter (e.g. sample temperature); they tend to record multiple runs as this log parameter is swept (e.g. with increasing temperature), then combine all the run files and re-partition the summed data into smaller steps. They can use getLogData().value instead of getLogData().filtered_value, but this would also include values recorded when the beam is not on.

robertapplin commented 11 months ago

@mducle Thank you for the bug report. Just want to check: is this bug in Mantid v6.7, or just in the nightly?

mducle commented 11 months ago

@robertapplin The log data for the RHSWorkspace of Plus is still filtered, but in the nightly (version 6.7.20230929.1615) the Filtered data check box is no longer grayed out when you plot a particular block.

E.g. if you run the following script in nightly:

Load(Filename='LET00092786.nxs', OutputWorkspace='LET00092786')
Load(Filename='LET00092787.nxs', OutputWorkspace='LET00092787')
Plus(LHSWorkspace='LET00092786', RHSWorkspace='LET00092787', OutputWorkspace='added')

Then right-click added and select "Show Sample Logs". If you pick any of the blocks which change (e.g. one of the chopper speeds or phases, or T_Float, T_Stick or T_Head) and toggle the Filtered data check box, you should see the data change significantly.

Edit. The LET data is in /archive/NDXLET/instrument/data/cycle_23_1/ or \\isis\inst$\NDXLET\Instrument\data\cycle_23_1.

robertapplin commented 11 months ago

Thanks for clarifying. The same behaviour is in Mantid v6.7 as well as the nightly (correct me if I'm wrong), so it is not a regression that needs urgent fixing for v6.8. I'll add it to the v6.9 milestone.

mducle commented 6 months ago

@GuiMacielPereira

The use case for this is to track the neutron signal as a function of some sample variable (such as sample temperature). The "traditional" way would be to set a temperature, wait for the sample to stabilise at that temperature, measure for a fixed amount of time (e.g. 2 min), then move to another temperature. A more flexible mode is to instead measure over a longer period and ramp the temperature (or some other variable such as magnetic field or sample rotation angle). For example, we could measure for 1 h while ramping the temperature from 60 to 90 K. Then, depending on how the neutron signal changes, we could rebin the data in blocks of 1 K, 2 K, 5 K or whatever using the sample logs.

If we want to measure a larger temperature range or use a slower ramp speed, we would then measure for a longer time; however, this means a larger file. The issue is that the neutron counts (events) are stored in the data acquisition electronics (DAE) whilst the measurement is in progress and only saved to a network drive after the measurement ends. Larger files take much longer to save than smaller ones, so we tend to keep the runs short. This means that for longer ramps we need to sum multiple files together before splitting them up by log value, which is where this bug breaks the script.

Ideally we would like to do:

var_name = 'Sample_Temp'
var_bin_size = 2

ws_sum = None
for run in run_list:
    ws = Load(run)
    if ws_sum is None:
        ws_sum = CloneWorkspace(ws)
    else:
        ws_sum = Plus(ws_sum, ws)

var_val = ws_sum.getRun().getLogData(var_name).filtered_value
var_range = max(var_val) - min(var_val)
var_nbins = int(var_range / var_bin_size)

for block in [x*var_bin_size + min(var_val) for x in range(var_nbins)]:
    val = round(block + var_bin_size/2, 1)
    ws = FilterByLogValue(ws_sum, var_name, block, block + var_bin_size)
    # Additional processing of ws
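
The bin-edge arithmetic in the loop above can be checked in isolation with plain NumPy, no Mantid needed (the log values here are invented):

```python
import numpy as np

# Invented filtered log values spanning a 60-90 K ramp.
var_val = np.array([60.3, 65.1, 71.8, 78.2, 84.6, 89.7])
var_bin_size = 2

var_range = max(var_val) - min(var_val)    # about 29.4 K
var_nbins = int(var_range / var_bin_size)  # 14 whole bins
# Each `block` is the lower edge of a [block, block + var_bin_size)
# slice that the loop passes to FilterByLogValue.
edges = [x * var_bin_size + min(var_val) for x in range(var_nbins)]
print(var_nbins)  # 14
```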

The issue is that with the above bug, the deduced var_range is wrong (it only spans the first file). If we use:

var_val = ws_sum.getRun().getLogData(var_name).value

(i.e. the unfiltered value), then we could over-estimate the real var_range, because the DAE actually records the log value from the end of the previous run to the end of the current run. This can be a problem: say we cooled the sample to 5K and did a long measurement at 5K, then warmed to 50K and measured a ramp from 50 to 100K. Because the DAE records the temperature from the end of the previous run (5K) rather than the start of the current run (50K), var_range would span 5 to 100K instead of the intended 50 to 100K. The filtered logs, by contrast, would correctly exclude everything between the end of the previous run and the start of the current run.
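A numeric sketch of that failure mode (temperatures invented): the unfiltered log still contains the 5K tail held over from the previous run, so the deduced bin count balloons, while the filtered log covers only the intended 50-100K ramp.

```python
import numpy as np

# Unfiltered log: includes values recorded between the end of the previous
# run (held at 5 K) and the start of this run's 50-100 K ramp.
unfiltered = np.array([5.0, 5.1, 50.0, 60.0, 80.0, 100.0])
# Filtered log: beam-on period only, i.e. the ramp itself.
filtered = np.array([50.0, 60.0, 80.0, 100.0])

var_bin_size = 2
# Using .value over-estimates the range and hence the bin count...
nbins_unfiltered = int((unfiltered.max() - unfiltered.min()) / var_bin_size)  # 47
# ...while .filtered_value gives the intended 50-100 K span.
nbins_filtered = int((filtered.max() - filtered.min()) / var_bin_size)        # 25
```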

The work-around I've been using is:

import numpy as np

var_name = 'Sample_Temp'
var_bin_size = 2

ws_sum = None
for run in run_list:
    ws = Load(run)
    if ws_sum is None:
        ws_sum = CloneWorkspace(ws)
        var_val = ws.getRun().getLogData(var_name).filtered_value
    else:
        ws_sum = Plus(ws_sum, ws)
        tmp_val = ws.getRun().getLogData(var_name).filtered_value
        var_val = np.concatenate((var_val, tmp_val))

var_range = max(var_val) - min(var_val)
var_nbins = int(var_range / var_bin_size)
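
With made-up per-run arrays, the concatenation step in the work-around can be seen to recover the full swept range that the buggy ws_sum log would miss (it would only see the first run's span):

```python
import numpy as np

# Invented filtered log values from three consecutive runs of a 60-90 K ramp.
per_run = [np.array([60.0, 65.0, 70.0]),
           np.array([70.0, 75.0, 80.0]),
           np.array([80.0, 85.0, 90.0])]

# Same concatenation as in the work-around loop above.
var_val = per_run[0]
for tmp_val in per_run[1:]:
    var_val = np.concatenate((var_val, tmp_val))

# 30 K across all runs, versus only 10 K if we had used the first run alone.
var_range = var_val.max() - var_val.min()
print(var_range)  # 30.0
```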