lbl-anp / becquerel

Becquerel is a Python package for analyzing nuclear spectroscopic measurements.
Other
43 stars 16 forks source link

Rebinning uses first and last bin as overflow bins #334

Closed markbandstra closed 2 years ago

markbandstra commented 2 years ago

When using bq.rebin or bq.Spectrum.rebin, the default behavior is to silently include all data outside the new binning scheme in the leftmost bin (if less than the leftmost bin edge) or in the rightmost bin (if greater than the rightmost bin edge). This behavior is unexpected and not documented. I think it should be controlled by a kwarg that defaults to disabling the behavior.

This behavior has been noted in #148 and #209, and #209 in particular notes the needed workaround: to add an extra bin at each end and then later remove them.

I just ran into a situation where I rebinned a spectrum but it still included the problematic data that I was trying to exclude, which was unhelpful and confusing.

Here is a MWE demonstrating the behavior:

import numpy as np
import becquerel as bq

counts = 1000 * np.ones(10)
spec = bq.Spectrum(counts=counts)
cal = bq.Calibration("p[0] * x", [1.0])
spec.apply_calibration(cal)
print("Original spectrum:")
print("edges: ", spec.bin_edges_kev)
print("values:", spec.counts_vals)

for edges in [
    np.linspace(2, 9, num=8),
    np.linspace(2.5, 9.5, num=8),
]:
    for method in ["interpolation", "listmode"]:
        spec2 = spec.rebin(edges, method=method)
        print("")
        print(f"Rebinned spectrum ({method}):")
        print("edges: ", spec2.bin_edges_kev)
        print("values:", spec2.counts_vals)

The output is:

Original spectrum:
edges:  [ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
values: [1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000.]

Rebinned spectrum (interpolation):
edges:  [2. 3. 4. 5. 6. 7. 8. 9.]
values: [3000. 1000. 1000. 1000. 1000. 1000. 2000.]

Rebinned spectrum (listmode):
edges:  [2. 3. 4. 5. 6. 7. 8. 9.]
values: [3000. 1000. 1000. 1000. 1000. 1000. 2000.]

Rebinned spectrum (interpolation):
edges:  [2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5]
values: [3500. 1000. 1000. 1000. 1000. 1000. 1500.]

Rebinned spectrum (listmode):
edges:  [2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5]
values: [3495. 1021. 1007.  961. 1005. 1005. 1506.]

Ideally all of the outputs of the rebinned spectra would be

[1000. 1000. 1000. 1000. 1000. 1000. 1000.]

or approximately those values. (Randomness will affect the final line.)