Fit crashes when there is negative entry

Tomoya-Iizawa commented 1 year ago

Hello,

I think a fit run with cabinetry crashes if the input histograms contain a bin with a negative entry. Is there any way to avoid the crash, for example by changing the bounds of the fit?

Best Regards, Tomoya

alexander-held commented 1 year ago

This depends on the source of the negative predictions. If you have a sample with an attached normalization factor that is allowed to be negative, and that is what causes negative predictions, you can just change the bounds of that normalization factor (via the cabinetry configuration or via the par_bounds keyword argument that the cabinetry.fit functions support).

There is also functionality provided by pyhf to prevent negative bin yields by clipping the output. See https://github.com/scikit-hep/pyhf/pull/1845 and also https://github.com/scikit-hep/pyhf/issues/2093. Note that this is not a good solution and the problem really should be fixed at its source: negative bin predictions are unphysical. If the issue is related to low MC statistics (insufficient number of MC events), then the solution can be simulating more events or merging bins together.
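As a rough illustration of the bin-merging idea, here is a minimal sketch using plain Python lists rather than any pyhf or cabinetry API. The helper names `negative_bins` and `merge_adjacent_bins` and the example yields are hypothetical and only meant to show the principle:

```python
from typing import List


def negative_bins(yields: List[float]) -> List[int]:
    """Return the indices of bins with a negative predicted yield."""
    return [i for i, y in enumerate(yields) if y < 0]


def merge_adjacent_bins(yields: List[float], factor: int) -> List[float]:
    """Merge every `factor` neighboring bins by summing their yields.

    A trailing group with fewer than `factor` bins is summed as well.
    """
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return [sum(yields[i : i + factor]) for i in range(0, len(yields), factor)]


# hypothetical yields of a low-statistics sample with one negative bin
sample_yields = [4.0, -0.5, 2.5, 1.0]
print(negative_bins(sample_yields))  # indices of the problematic bins

# merge pairs of neighboring bins and check again
merged = merge_adjacent_bins(sample_yields, 2)
print(negative_bins(merged))  # empty list if merging resolved the issue
```

In a real analysis the same rebinning has to be applied consistently to all samples, systematic variations and data in the affected region, which is usually easiest to do when the histograms are produced.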

Tomoya-Iizawa commented 12 months ago

Thank you for your reply. We will discuss whether it can be avoided by some modification of the histograms. In the meantime, could you tell me how to set par_bounds, either as an argument or in the configuration? The negative bin comes from a systematics sample, which is implemented in the configuration file as e.g.

Name: "JET_Flavor_Response"
Up:
  VariationPath: "JET_Flavor_Response1up"
Down:
  VariationPath: "JET_Flavor_Response1down"
Type: "NormPlusShape"
Samples: ["multiboson", "singletop", "Zll", "Ztautau", "ttbar"]

Best, Tomoya

alexander-held commented 12 months ago

You can use the par_bounds argument to provide your custom parameter bounds:

par_bounds = model.config.suggested_bounds()  # default bounds
# get index of a specific parameter by name (in this case "Modeling")
idx = cabinetry.model_utils._parameter_index("Modeling", model.config.par_names)
par_bounds[idx] = (-3, 3)  # update bounds of this parameter
fit_results = cabinetry.fit.fit(model, data, par_bounds=par_bounds)

Note that model_utils._parameter_index belongs to the internal API, as indicated by the leading _ in its name, which signals that this function may change without notice. It is quite a simple function though:

par_index = next((i for i, label in enumerate(labels) if label == par_name), None)

so you could implement it externally as well (and I do not have any concrete plans to change it).
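A standalone version of that lookup, combined with the bounds update, could look roughly like the sketch below. The names `parameter_index` and `with_bounds` are hypothetical helpers, not part of cabinetry, and the `labels` and `bounds` lists are made-up stand-ins for `model.config.par_names` and `model.config.suggested_bounds()`:

```python
from typing import List, Optional, Sequence, Tuple

Bounds = Tuple[float, float]


def parameter_index(par_name: str, labels: Sequence[str]) -> Optional[int]:
    """Return the position of par_name in labels, or None if it is absent."""
    return next((i for i, label in enumerate(labels) if label == par_name), None)


def with_bounds(
    bounds: Sequence[Bounds],
    labels: Sequence[str],
    par_name: str,
    new_bounds: Bounds,
) -> List[Bounds]:
    """Return a copy of bounds with the entry for par_name replaced."""
    idx = parameter_index(par_name, labels)
    if idx is None:
        raise ValueError(f"parameter {par_name!r} not found")
    updated = list(bounds)  # copy, so the default bounds stay untouched
    updated[idx] = new_bounds
    return updated


# made-up stand-ins for the parameter names and default bounds of a model
labels = ["mu", "JET_Flavor_Response", "Modeling"]
bounds = [(0.0, 10.0), (-5.0, 5.0), (-5.0, 5.0)]

par_bounds = with_bounds(bounds, labels, "Modeling", (-3.0, 3.0))
print(par_bounds)
```

The resulting list can then be passed on via the par_bounds keyword argument. Raising an error for an unknown name, instead of silently returning None, is a design choice of this sketch so that a typo in the parameter name does not go unnoticed.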

There is no way to set these bounds from the config. Adding support for it there is also not easy, as a given parameter can be implicitly defined by various correlated systematics blocks, so I don't think it fits in well there.

Closing this as I think there is no action required from the side of cabinetry but feel free to re-open if needed.

scikit-hep / cabinetry

Fit crashes when there is negative entry #424