scikit-hep / cabinetry

design and steer profile likelihood fits
https://cabinetry.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Overflow bin #374

Open nkang9 opened 1 year ago

nkang9 commented 1 year ago

I was wondering whether there is an easy way to enable overflow bins when specifying the binning for a region in the configuration. My current workaround is to modify this line in histogram_creator.py to enable the overflow, although this also requires a few modifications in histo.py to get the overflow working there as well. One could always define the binning so that the last bin effectively acts as the overflow bin, but I was thinking there could be something more convenient, such as an additional property in the configuration that enables this. For my specific case I am only interested in the overflow bin, but the same would apply to the underflow bin.

alexander-held commented 1 year ago

Hi @nkang9! There is currently no convenient way to enable this. The easiest approach is probably the one you mention: picking the bin edges such that no events end up in the under-/overflow bins. Thanks for bringing up this topic! Having a new config option to support a more convenient workflow sounds like a good idea to me.

Which behavior would you expect? Would you like the overflow bin to be a separate (new) bin that enters the workspace, or instead have all events from the under-/overflow be added to the outermost bins (which means that the number of bins in the workspace stays the same)?
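
To make the two options concrete, here is a minimal standard-library sketch. It is not cabinetry code: the function `fill`, its `flow` argument, and the option names `"ignore"`, `"separate"` and `"merge"` are hypothetical and only illustrate how the number of bins differs between the two behaviors. Bins are assumed to be half-open, so a value equal to the last edge counts as overflow.

```python
from bisect import bisect_right


def fill(values, edges, flow="ignore"):
    """Count values in bins defined by ascending edges (hypothetical helper).

    flow="ignore":   drop values outside [edges[0], edges[-1])
    flow="separate": return extra underflow and overflow bins
    flow="merge":    add out-of-range values to the outermost bins
    """
    n_bins = len(edges) - 1
    # slot 0 is underflow, slots 1..n_bins are regular bins, the last is overflow
    counts = [0] * (n_bins + 2)
    for value in values:
        counts[bisect_right(edges, value)] += 1

    under, regular, over = counts[0], counts[1:-1], counts[-1]
    if flow == "separate":
        return [under, *regular, over]
    if flow == "merge":
        regular[0] += under
        regular[-1] += over
        return regular
    if flow == "ignore":
        return regular
    raise ValueError(f"unknown flow option: {flow!r}")


values = [-0.5, 0.2, 0.7, 1.5, 3.0, 4.0]
edges = [0.0, 1.0, 2.0]
print(fill(values, edges, "ignore"))    # [2, 1]
print(fill(values, edges, "separate"))  # [1, 2, 1, 2]
print(fill(values, edges, "merge"))     # [3, 3]
```

With `"separate"` the result has two more entries than the configured binning, while `"merge"` keeps the bin count unchanged.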

I am not quite sure how to properly handle these bins in axis labels for visualizations (like data / MC plots) where the definition of what it means to be a flow bin might not be easily obtained in an automatic way.
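
One possible labeling, purely as an illustration: if flow bins were added as separate bins, labels could be derived from the bin edges alone. The helper `bin_labels` and its `underflow` / `overflow` flags are hypothetical and not part of cabinetry. The sketch does not address the harder case where flow events are merged into the outermost bins.

```python
def bin_labels(edges, underflow=False, overflow=False):
    """Build one text label per bin from ascending edges (hypothetical helper)."""
    # regular bins are labeled as half-open intervals
    labels = [f"[{lo:g}, {hi:g})" for lo, hi in zip(edges[:-1], edges[1:])]
    if underflow:
        labels.insert(0, f"< {edges[0]:g}")
    if overflow:
        labels.append(f">= {edges[-1]:g}")
    return labels


print(bin_labels([0, 1, 2], overflow=True))
# ['[0, 1)', '[1, 2)', '>= 2']
```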

nkang9 commented 1 year ago

Hi @alexander-held, thanks for the prompt response. For now I think I will stick with the temporary workaround I described. In terms of how this might be implemented, I don't have a strong opinion, although adding the events to the outermost bins is what I have done in my workaround, so I have a slight preference for that behavior. I guess single-bin regions would perhaps need to be handled in a special manner as well.
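
As a rough sketch of the "add to the outermost bins" behavior (not the actual workaround and not cabinetry code), the bin index of each event can be clamped to the valid range. The function name `fill_merged` is hypothetical. In this sketch a single-bin region needs no special handling, since underflow and overflow both clamp to the same bin; whether that carries over to cabinetry's histogram handling is an open question.

```python
from bisect import bisect_right


def fill_merged(values, edges):
    """Count values per bin, adding under-/overflow to the outermost bins."""
    n_bins = len(edges) - 1
    counts = [0] * n_bins
    for value in values:
        # raw index is -1 for underflow and n_bins for overflow
        index = bisect_right(edges, value) - 1
        # clamp into the valid range 0 .. n_bins - 1
        index = min(max(index, 0), n_bins - 1)
        counts[index] += 1
    return counts


print(fill_merged([-1.0, 0.5, 1.5, 7.0], [0.0, 1.0, 2.0]))  # [2, 2]
print(fill_merged([-1.0, 0.5, 7.0], [0.0, 1.0]))            # [3]
```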

For the visualizations, I don't have any good ideas to share either on how one would denote flow bins (usually I just mention it in text outside the plot).

nkang9 commented 1 year ago

Sorry, I accidentally closed the issue with my last comment.