Negative total yields of up and down systematics causes NaN with NormPlusShape

rmnmllr commented 1 year ago

Dear @alexander-held, Unfortunately we have another problem with (most likely) negative yields.

In our analysis we have systematics with up and down variation histograms. We want to add one of these systematics, labelled JET_Pileup_RhoTopology, with a NormPlusShape modifier. I define it in the config with:

  - Name: "JET_Pileup_RhoTopology"
    Up:
      VariationPath: "JET_Pileup_RhoTopology__1up"
    Down:
      VariationPath: "JET_Pileup_RhoTopology__1down"
    Type: "NormPlusShape"
    Samples: ["multiboson", "singletop", "Zll", "Ztautau", "Wjets", "ttbar"]

Now reading in the config and loading all the histogram data works fine, but when cabinetry reaches the plotting with cabinetry.visualize.data_mc it aborts with:

/terabig/muellerr/.local/lib/python3.9/site-packages/matplotlib/axes/_axes.py:1185: RuntimeWarning: All-NaN axis encountered
  miny = np.nanmin(masked_verts[..., 1])
/terabig/muellerr/.local/lib/python3.9/site-packages/matplotlib/axes/_axes.py:1186: RuntimeWarning: All-NaN axis encountered
  maxy = np.nanmax(masked_verts[..., 1])
/terabig/muellerr/.local/lib/python3.9/site-packages/matplotlib/axes/_base.py:2503: UserWarning: Warning: converting a masked element to nan.
  xys = np.asarray(xys)
Traceback (most recent call last):
  File "/terabig/muellerr/Leptoquarks/fitting/run_cabinetry.py", line 532, in <module>
    run_cabinetry(args["inputs"],
  File "/terabig/muellerr/Leptoquarks/fitting/run_cabinetry.py", line 210, in run_cabinetry
    cabinetry.visualize.data_mc(prediction_prefit,
  File "/terabig/muellerr/.conda/envs/fit-dev-latest/lib/python3.9/site-packages/cabinetry/visualize/__init__.py", line 277, in data_mc
    fig = plot_model.data_mc(
  File "/terabig/muellerr/.conda/envs/fit-dev-latest/lib/python3.9/site-packages/cabinetry/visualize/plot_model.py", line 203, in data_mc
    ax1.set_ylim([y_min / 10, y_max * 10])
  File "/terabig/muellerr/.local/lib/python3.9/site-packages/matplotlib/_api/deprecation.py", line 454, in wrapper
    return func(*args, **kwargs)
  File "/terabig/muellerr/.local/lib/python3.9/site-packages/matplotlib/axes/_base.py", line 3884, in set_ylim
    return self.yaxis._set_lim(bottom, top, emit=emit, auto=auto)
  File "/terabig/muellerr/.local/lib/python3.9/site-packages/matplotlib/axis.py", line 1179, in _set_lim
    v0 = self.axes._validate_converted_limits(v0, self.convert_units)
  File "/terabig/muellerr/.local/lib/python3.9/site-packages/matplotlib/axes/_base.py", line 3572, in _validate_converted_limits
    raise ValueError("Axis limits cannot be NaN or Inf")
ValueError: Axis limits cannot be NaN or Inf

Tracing back the NaN I see already on the yield tables that the Zll samples have NaN for all bins. I investigated further and found that the NaN are most likely coming from the workspace.normplusshape_modifiers method giving back negative norm_effect_up and norm_effect_down. There must be a logarithm function somewhere then where the NaNs are produced.

They are negative because the sum(histogram_up.yields) and sum(histogram_down.yields) are negative due to a single bin (250 - 300 GeV) having a large negative value and dragging down the sum (see the debug print below).

Now If I set norm_effect_up = abs(norm_effect_up) and norm_effect_down = abs(norm_effect_down) I get around the NaN and have no problems. But how could we solve this problem? Is it a problem of our inputs or can cabinetry handle something to get around this issue?

Unfortunately I can't provide a quick minimal example to reproduce the problem, but from these debug prints it should be clear of what's going in and where the problem appears:

DEBUG - cabinetry.workspace - adding OverallSys and HistoSys JET_Pileup_RhoTopology to sample Zll in region HighMassOSs2thh_1bjets
INFO - cabinetry.workspace - histogram_nominal =                      ┌───────────────────────────────────────────────────────┐
[100,  150) 0.02517  │█████████████████████████████▉                         │
[150,  200) 1e-09    │                                                       │
[200,  250) 0.04546  │██████████████████████████████████████████████████████ │
[250,  300) 1e-09    │                                                       │
[300,  350) 0.0383   │█████████████████████████████████████████████▌         │
[350,  400) 0.001453 │█▊                                                     │
[400,  500) 0.003387 │████                                                   │
[500,  600) 1e-09    │                                                       │
[600, 1000) 1e-09    │                                                       │
                     └───────────────────────────────────────────────────────┘
INFO - cabinetry.workspace - histogram_up =                      ┌───────────────────────────────────────────────────────┐
[100,  150) 0.02517  │                                        ████████       │
[150,  200) 1e-09    │                                                       │
[200,  250) 0.04545  │                                        ██████████████▍│
[250,  300) -0.1249  │████████████████████████████████████████               │
[300,  350) 0.03832  │                                        ████████████▏  │
[350,  400) 0.001452 │                                        ▌              │
[400,  500) 0.003385 │                                        █▏             │
[500,  600) 1e-09    │                                                       │
[600, 1000) 1e-09    │                                                       │
                     └───────────────────────────────────────────────────────┘
INFO - cabinetry.workspace - histogram_down =                      ┌───────────────────────────────────────────────────────┐
[100,  150) 0.02517  │                                        ████████       │
[150,  200) 1e-09    │                                                       │
[200,  250) 0.04548  │                                        ██████████████▌│
[250,  300) -0.1246  │████████████████████████████████████████               │
[300,  350) 0.03871  │                                        ████████████▎  │
[350,  400) 0.001453 │                                        ▌              │
[400,  500) 0.003546 │                                        █▏             │
[500,  600) 1e-09    │                                                       │
[600, 1000) 1e-09    │                                                       │
                     └───────────────────────────────────────────────────────┘
INFO - cabinetry.workspace - sum(histogram_up.yields) = -0.011141039923554863
INFO - cabinetry.workspace - sum(histogram_down.yields) = -0.010211770333139802
INFO - cabinetry.workspace - sum(histogram_nominal.yields) = 0.11376828064207656
INFO - cabinetry.workspace - norm_effect_up = -0.09792747029908451, norm_effect_down = -0.08975937999157066
INFO - cabinetry.workspace - histo_yield_up = [-0.25704124736030715, -1.02116389677373e-08, -0.4640891997785347, -1.2755540125120497, -0.3912601185025125, -0.014824719301859525, -0.03457041629184243, -1.02116389677373e-08, -1.02116389677373e-08]
INFO - cabinetry.workspace - histo_yield_down = [-0.28036559944701234, -1.1140896603920158e-08, -0.5067228320944018, -1.387761567470008, -0.43121255209073955, -0.016189808801901934, -0.0395024609711858, -1.1140896603920158e-08, -1.1140896603920158e-08]

Please let me know if you need anything more.

Many thanks in advance! Roman

alexander-held commented 1 year ago

Hi, the short answer is that yes, this ultimately is a problem with the input. cabinetry doesn't provide any functionality to automatically do something to work around this, because there is not really a good solution here. There just are not enough events to properly estimate the distribution of this template.

Things you could try out:

exclude the sample from being affected by this systematic (arguing that this is an effect on a negligible background that will not matter),
manually override that specific histogram with something else (e.g. the same but setting empty bin to zero, could do that via https://cabinetry.readthedocs.io/en/latest/advanced.html#overrides-for-template-building),
merge the sample together with other samples such that all bins are filled and all systematic variations on the stack of samples also have all bins filled.

Not applicable in this specific case where the total sum is negative, but other things that can help if you have a negative bin:

apply smoothing to the systematic variation via "Smoothing": {"Algorithm": "353QH, twice"},
re-bin the distribution to avoid such bins.

rmnmllr commented 1 year ago

Thank you for you quick answer and suggestions, we will discuss internally on how to continue!

For the smoothing, can this be applied for all systematics with one setting in the config or do I have to do:

  - Name: "EG_RESOLUTION_ALL"
    Up:
      VariationPath: "EG_RESOLUTION_ALL__1up"
    Down:
      VariationPath: "EG_RESOLUTION_ALL__1down"
    Type: "NormPlusShape"
    Samples: ["multiboson", "singletop", "Zll", "Ztautau", "Wjets", "ttbar"]
    Smoothing:
      Algorithm: "353QH, twice"

for each systematics in the YAML? I did a quick test with/without smoothing but I don't see any difference in the saved workspace. How does this come into play?

alexander-held commented 1 year ago

Currently this is a per-systematic setting, so no easy way to apply this to everything at once. You could also split such a systematic that acts on multiple samples into two pieces, allowing you to only apply smoothing where needed:

  - Name: "EG_RESOLUTION_ALL"
    Up:
      VariationPath: "EG_RESOLUTION_ALL__1up"
    Down:
      VariationPath: "EG_RESOLUTION_ALL__1down"
    Type: "NormPlusShape"
    Samples: ["multiboson", "singletop", "Ztautau", "Wjets", "ttbar"]

  - Name: "EG_RESOLUTION_ALL_with_smoothing"
    ModifierName: "EG_RESOLUTION_ALL"
    Up:
      VariationPath: "EG_RESOLUTION_ALL__1up"
    Down:
      VariationPath: "EG_RESOLUTION_ALL__1down"
    Type: "NormPlusShape"
    Samples: ["Zll"]
    Smoothing:
      Algorithm: "353QH, twice"

Note the use of ModifierName here to ensure the nuisance parameters have the same name in the workspace and will end up being correlated.

The smoothing happens during cabinetry.templates.postprocess, are you perhaps skipping this step?

scikit-hep / cabinetry

Negative total yields of up and down systematics causes NaN with NormPlusShape #404