Test of parts of esspolarization workflow on polarized ZOOM data, with basis of esssans workflow

astellhorn commented 2 months ago

Main goal: I am going step by step through the functions defined in esspolarization base.py and try to use them on the polarized test data inside the esssans workflow. The goal is to find out where to insert the polarized workflow into the unpolarized workflow, and how to treat Direct Beam (through He-cells) and Sample data, and also if the esspolarization definitions have to be changed.

What works

pol.RunSectionLog(results[MaskedData[SampleRun]],
pol.RunSectionLog(iofq['++']),

What does not work

compute_direct_beam, e.g. used in pol.DirectBeamNoCell, see below for the whole code

Questions

what do I do wrong when functions of the base.py from esspolarization are not found? Error message: module 'ess.polarization' has no attribute 'ReducedDirectBeamDataNoCell

Code for not working DirectBeamNoCell computation

def compute_direct_beam(
    data: sc.DataArray,
    q_range: sc.Variable,
    background_q_range: sc.Variable,
) -> sc.DataArray:
    """Compute background-subtracted direct beam function."""
    start_db = q_range[0]
    stop_db = q_range[-1]
    start_bg = background_q_range[0]
    stop_bg = background_q_range[-1]
    # The input is binned in time and wavelength, we simply histogram without changes.
    direct_beam = data.bins['Q', start_db:stop_db].hist()
    background = data.bins['Q', start_bg:stop_bg].hist()
    return direct_beam - background

q_range = sc.array(dims=['Q'], values = [0,0.1], unit='1/Angstrom')
start_db = q_range[0]
stop_db = q_range[-1]
print(start_db)
print(stop_db)
background_q_range = sc.array(dims=['Q'], values = [0.1,0.3], unit='1/Angstrom')
start_bg = background_q_range[0]
stop_bg = background_q_range[-1]
print(start_bg)
print(stop_bg)

data = iofq['++']

pol.DirectBeamNoCell(
        compute_direct_beam(
            data=data,
            #data = iqxqy,
            q_range=q_range,
            background_q_range=background_q_range,
      )
    )

Error Message

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[117], line 7
      1 data = iofq['++']
      2 #data = MaskedData[SampleRun]['++']
      3 #q_range = sc.linspace(dim='Q', start=0, stop=0.1, num=101, unit='1/angstrom')
      6 pol.DirectBeamNoCell(
----> 7         compute_direct_beam(
      8             data=data,
      9             #data = iqxqy,
     10             q_range=q_range,
     11             background_q_range=background_q_range,
     12       )
     13     )  

Cell In[110], line 19, in compute_direct_beam(data, q_range, background_q_range)
     17 stop_bg = background_q_range[-1]
     18 # The input is binned in time and wavelength, we simply histogram without changes.
---> 19 direct_beam = data.bins['Q', start_db:stop_db].hist()
     20 background = data.bins['Q', start_bg:stop_bg].hist()
     21 return direct_beam - background

File /opt/mamba/envs/mantid_env/lib/python3.10/site-packages/scipp/core/bins.py:208, in Bins.__getitem__(self, key)
    205         elif index.stop is None:
    206             stop = start
--> 208     return self._obj.bin({dim: concat([start, stop], dim)}).squeeze(dim)
    209 raise ValueError(
    210     f"Unsupported key '{key}'. Expected a dimension label and "
    211     "a 0-D variable or a dimension label and a slice object with start "
    212     "and stop given by a 0-D variable."
    213 )

File /opt/mamba/envs/mantid_env/lib/python3.10/site-packages/scipp/core/data_group.py:623, in data_group_overload.<locals>.impl(data, *args, **kwargs)
    621 if isinstance(data, DataGroup):
    622     return data.apply(impl, *args, **kwargs)
--> 623 return func(data, *args, **kwargs)

File /opt/mamba/envs/mantid_env/lib/python3.10/site-packages/scipp/core/binning.py:593, in bin(x, arg_dict, **kwargs)
    591 edges = _make_edges(x, arg_dict, kwargs)
    592 erase = _find_replaced_dims(x, edges)
--> 593 return make_binned(x, edges=list(edges.values()), erase=erase)

File /opt/mamba/envs/mantid_env/lib/python3.10/site-packages/scipp/core/binning.py:151, in make_binned(x, edges, groups, erase)
    149 if _can_operate_on_bins(x, edges, groups, erase):
    150     return combine_bins(x, edges=edges, groups=groups, dim=erase)
--> 151 return _cpp.bin(x, edges, groups, erase)

KeyError: "Expected 'Q' in <scipp.Dict.keys {wavelength, Qy, Qx, gravity, incident_beam, L1}>."

SimonHeybrock commented 2 months ago

As the errors indicates, the function expected Q, but there is only Qx and Qy. So the question is: Do we need to compute Q, or should the function be rewritten so it uses Qx and Qy?

astellhorn commented 2 months ago

Yes actually Qx and Qy is even better here. We thought it does in principle not matter for the direct beam as it should be symmetric but now I also remember cases where e.g. the slits are not symmetric such that also direct beam is not symmetric. Also, we will handle the other data rather in Qx Qy instead of Q, so its better to be constant there I guess.

astellhorn commented 2 months ago

Can I simply redefine the q_range of def_compute_direct_beam from from currently 1D to 2D via changing

def compute_direct_beam(
    data: sc.DataArray,
    q_range: sc.Variable,
    background_q_range: sc.Variable,
) -> sc.DataArray:
    """Compute background-subtracted direct beam function."""
    start_db = q_range[0]
    stop_db = q_range[-1]
    start_bg = background_q_range[0]
    stop_bg = background_q_range[-1]
    # The input is binned in time and wavelength, we simply histogram without changes.
    direct_beam = data.bins['Q', start_db:stop_db].hist()
    background = data.bins['Q', start_bg:stop_bg].hist()
    return direct_beam - background

to

def compute_direct_beam(
    data: sc.DataArray,
    qx_range: sc.Variable,
    qy_range: sc.Variable,
    background_qx_range: sc.Variable,
    background_qy_range: sc.Variable,
) -> sc.DataArray:
    """Compute background-subtracted direct beam function."""
    start_db_qx = qx_range[0]
    start_db_qy = qy_range[0]
    stop_db_qx = qx_range[-1]
    stop_db_qy = qy_range[-1]
    start_bg_qx = background_q_range[0]
    start_bg_qy = background_q_range[0]
    stop_bg_qx = background_q_range[-1]
    stop_bg_qy = background_q_range[-1]
    # The input is binned in time and wavelength, we simply histogram without changes.
    direct_beam = data.bins['Qx', 'Qy', start_db_qx:stop_db_qx, start_db_qy:stop_db_qy].hist()
    background = data.bins['Qx', 'Qy', start_bg_qx:stop_bg_qx, start_bg_qy:stop_bg_qy].hist()
    return direct_beam - background

?

SimonHeybrock commented 2 months ago

We cannot select 2 dims at the same time, so it should be something like

def compute_direct_beam(
    data: sc.DataArray,
    q_range: sc.Variable,
    background_q_range: sc.Variable,
) -> sc.DataArray:
    """Compute background-subtracted direct beam function."""
    start_db = q_range[0]
    stop_db = q_range[-1]
    start_bg = background_q_range[0]
    stop_bg = background_q_range[-1]
    # The input is binned in time and wavelength, we simply histogram without changes.
    direct_beam = data.bins['Qx', start_db:stop_db]['Qy', start_db:stop_db].hist()
    background = data.bins['Qx', start_bg:stop_bg]['Qy', start_db:stop_db].hist()
    return direct_beam - background

assuming we want the same range in both X and Y.

astellhorn commented 2 months ago

Ok ! but then it still has to be:

q_range = sc.array(dims=['Q'], values = [0,0.1], unit='1/Angstrom')
start_db = q_range[0]
stop_db = q_range[-1]

?

astellhorn commented 2 months ago

It looks correct and is working, but if then I try computing the direct beam from the iofq of the example data, it seems to miss one dimension, it only yields Qy, but not Qx:

pl[PolarizationSetting] = '++'
iofq = pl.compute(IofQ[SampleRun])
iofq

Result: Screenshot 2024-04-26 at 10 09 22

pol.DirectBeamNoCell(
        compute_direct_beam(
            data=iofq,
            #data = iqxqy,
            q_range=q_range,
            background_q_range=background_q_range,
      )
    )

Result: Screenshot 2024-04-26 at 10 10 40

SimonHeybrock commented 2 months ago

Sorry about that, I forgot a second bins, try this:

def compute_direct_beam(
    data: sc.DataArray,
    q_range: sc.Variable,
    background_q_range: sc.Variable,
) -> sc.DataArray:
    """Compute background-subtracted direct beam function."""
    start_db = q_range[0]
    stop_db = q_range[-1]
    start_bg = background_q_range[0]
    stop_bg = background_q_range[-1]
    # The input is binned in time and wavelength, we simply histogram without changes.
    direct_beam = data.bins['Qx', start_db:stop_db].bins['Qy', start_db:stop_db].hist()
    background = data.bins['Qx', start_bg:stop_bg].bins['Qy', start_db:stop_db].hist()
    return direct_beam - background

You will see that the output has no binning now, since the event filtering selected sub-bins. If you need new Qx and Qy bins you have to bin again.

astellhorn commented 2 months ago

In the current esspolarization workflow, we assume one big dataset with one nxs file that contains the different measurements and from which we can extract "SampleInBeamLog", "CellInBeamLog", etc.
However, at ESS it will be coming in seperate .nxs files per measurement and Simon had already the idea to merge it like this: see #19
The ISIS data has again a different structure, but can probably be divided into its single parts that then can be read in exactly the same as what Simon mentioned in #19.

(1) Can the ideas from #19 be included into esspolarization? Probably via exchanging the RunSectionLog by the idea in #19? (2) For the ISIS data, Johannes has inserted the readin of the different spin-channels, so probably I can use these as a start to test the workflow?

I am collecting ideas and testing at the same time, I hope you dont mind me posting my thoughts here, and if you have any suggestions to comment on it.

astellhorn commented 2 months ago

@SimonHeybrock in #19, what do you define as "background" and what as "SampleRun"? Is for you the SampleRun everything with sample in and Background Run everything else, e.g. the directbeam measurements through the He cells [depolarized, polarized]? Or is Background for you here the Background of the detector during each run, i.e., in a predefined Q-range far away from the direct beam and any peaks in I(Q)?

SimonHeybrock commented 2 months ago

@astellhorn We have actually been working on a solution for the problem discussed in #19, but using a different, more general approach. It is nearly ready, but still needs some work. Is this something you need urgently right now?

SimonHeybrock commented 2 months ago

@SimonHeybrock in #19, what do you define as "background" and what as "SampleRun"? Is for you the SampleRun everything with sample in and Background Run everything else, e.g. the directbeam measurements through the He cells [depolarized, polarized]? Or is Background for you here the Background of the detector during each run, i.e., in a predefined Q-range far away from the direct beam and any peaks in I(Q)?

I used terminology from regular unpolarized SANS measurements. Background runs are those that are subtracted from the final I(Q). I am not sure background-subtraction is performed in polarized SANS? And if it is, I don't know which components are in the beam for background runs.

astellhorn commented 2 months ago

@astellhorn We have actually been working on a solution for the problem discussed in #19, but using a different, more general approach. It is nearly ready, but still needs some work. Is this something you need urgently right now?

I was thinking of how to go on testing the workflow functions, for example the opacity. In the workflow we use the data from direct beam with and without cell, for which in principle the workflow uses the RunSectionLog to determine which run is what, so it would be nice to test it together with the correctly sorted data.

I can also go ahead first to use an easier example without having used the RunSectionLog before and just tell manually which data to use for the calculation of opacity from beamdata, but then I test only if the fitting algorithm itself works, not if it works together with the datastructuring and readout in which spinstate/in which positions etc. are analyzer + polarizer.

astellhorn commented 2 months ago

@SimonHeybrock in #19, what do you define as "background" and what as "SampleRun"? Is for you the SampleRun everything with sample in and Background Run everything else, e.g. the directbeam measurements through the He cells [depolarized, polarized]? Or is Background for you here the Background of the detector during each run, i.e., in a predefined Q-range far away from the direct beam and any peaks in I(Q)?

I used terminology from regular unpolarized SANS measurements. Background runs are those that are subtracted from the final I(Q). I am not sure background-subtraction is performed in polarized SANS? And if it is, I don't know which components are in the beam for background runs.

Ok, then I know! Thank you.

SimonHeybrock commented 2 months ago

@astellhorn We have actually been working on a solution for the problem discussed in #19, but using a different, more general approach. It is nearly ready, but still needs some work. Is this something you need urgently right now?

I was thinking of how to go on testing the workflow functions, for example the opacity. In the workflow we use the data from direct beam with and without cell, for which in principle the workflow uses the RunSectionLog to determine which run is what, so it would be nice to test it together with the correctly sorted data.

I can also go ahead first to use an easier example without having used the RunSectionLog before and just tell manually which data to use for the calculation of opacity from beamdata, but then I test only if the fitting algorithm itself works, not if it works together with the datastructuring and readout in which spinstate/in which positions etc. are analyzer + polarizer.

The entire bit with RunSectionLog in the workflow draft was written when I/we thought that everything will be in a single big event-data file. As the data you have now is not in this format (and maybe it is not even clear if it will be at ESS?), we should probably consider ditching that approach, for now at least?

astellhorn commented 2 months ago

@astellhorn We have actually been working on a solution for the problem discussed in #19, but using a different, more general approach. It is nearly ready, but still needs some work. Is this something you need urgently right now?

I was thinking of how to go on testing the workflow functions, for example the opacity. In the workflow we use the data from direct beam with and without cell, for which in principle the workflow uses the RunSectionLog to determine which run is what, so it would be nice to test it together with the correctly sorted data. I can also go ahead first to use an easier example without having used the RunSectionLog before and just tell manually which data to use for the calculation of opacity from beamdata, but then I test only if the fitting algorithm itself works, not if it works together with the datastructuring and readout in which spinstate/in which positions etc. are analyzer + polarizer.

The entire bit with RunSectionLog in the workflow draft was written when I/we thought that everything will be in a single big event-data file. As the data you have now is not in this format (and maybe it is not even clear if it will be at ESS?), we should probably consider ditching that approach, for now at least?

All people here assume (at the moment) that it will be done via single nxs files for each measurement. Thats why I thought it would be nice to integrate what you had proposed in #19.

astellhorn commented 2 months ago

@astellhorn We have actually been working on a solution for the problem discussed in #19, but using a different, more general approach. It is nearly ready, but still needs some work. Is this something you need urgently right now?

I was thinking of how to go on testing the workflow functions, for example the opacity. In the workflow we use the data from direct beam with and without cell, for which in principle the workflow uses the RunSectionLog to determine which run is what, so it would be nice to test it together with the correctly sorted data. I can also go ahead first to use an easier example without having used the RunSectionLog before and just tell manually which data to use for the calculation of opacity from beamdata, but then I test only if the fitting algorithm itself works, not if it works together with the datastructuring and readout in which spinstate/in which positions etc. are analyzer + polarizer.

The entire bit with RunSectionLog in the workflow draft was written when I/we thought that everything will be in a single big event-data file. As the data you have now is not in this format (and maybe it is not even clear if it will be at ESS?), we should probably consider ditching that approach, for now at least?

All people here assume (at the moment) that it will be done via single nxs files for each measurement. Thats why I thought it would be nice to integrate what you had proposed in #19.

In preperation for the next steps some more detailed information in the new issue #43

scipp / esspolarization

Test of parts of esspolarization workflow on polarized ZOOM data, with basis of esssans workflow #39