Error when changing bins in VR

ekourlit commented 2 years ago

I'm opening this issue for an error I face when I change the bins in one of my regions.

I have a multiple region fit setup, where all the regions are single binned, except one which is multi binned. When I change the binning of the multi bin region, I get an AssertionError from this pyhf line. This is not happening with every binning, I have other bin configurations that work fine (even with empty bins): example1, example2.

I am using cabinetry version 0.4.1 and pyhf version 0.6.3. I note that the exact above erroneous line do not exist in current master branch of pyhf.

The fit configuration can be found here. The steering notebook here. And the error text here.

Any insight will be helpful! Thanks in advance!

alexander-held commented 2 years ago

Hi @ekourlit, thanks for reporting this! My guess is that some of the bins are empty after changing the binning, and they will have zero staterror. This causes the crash during model building that you observe because zero-rate Poissons were not supported by pyhf in v0.6.3. https://github.com/scikit-hep/pyhf/pull/1657 changed this, which would explain why you do not see the crash anymore with master.

Here is an example workspace for this:

{
    "channels": [
        {
            "name": "SR",
            "samples": [
                {
                    "data": [10.0, 0.0],
                    "modifiers": [
                        {
                            "data": [1.0, 0.0],
                            "name": "staterror_SR",
                            "type": "staterror"
                        },
                        {"data": null, "name": "mu", "type": "normfactor"}
                    ],
                    "name": "Signal"
                }
            ]
        }
    ],
    "measurements": [
        {
            "config": {"parameters": [], "poi": "mu"},
            "name": "minimal_example"
        }
    ],
    "observations": [{"data": [15, 0], "name": "SR"}],
    "version": "1.0.0"
}

which will fail at the assertion you point out:

assert len(sigmas[sigmas > 0]) == pdfconfig.param_set(mod).n_parameters

If this is indeed the explanation for what you observe, then the next release of pyhf should fix the behavior. Note that mixing the master of pyhf with cabinetry version 0.4.1 can cause problems due to the paramset.suggested_fixed_as_bool API change, so in that case it may be best to use the fix/pyhf-070 branch of cabinetry with the master of pyhf.

Two other things that can help:

setting both the staterror and the nominal sample yield in the empty bins to some small number (e.g. 10e-6), which could be done in cabinetry (or manually as a post-processing step after building a workspace),
selective pruning of staterror terms, which would need to be supported by pyhf (https://github.com/scikit-hep/pyhf/issues/662).

A post-processing step may be a decent stop-gap solution until the next pyhf release. The pruning approach will only work if there is at least one other sample with non-zero nominal yield, since otherwise the total model yield per bin will still be zero and minimization will fail (in pyhf version 0.6.3, also fixed in master).

ekourlit commented 2 years ago

Thanks @alexander-held for the quick reply! That was exactly the problem! I previously checked for zero yields in each sample/bin but the problem was actually the staterror of the sum of the samples/bin.

As a temporary workaround I modified the workspace with a post-processing step as you mentioned. I looped over the bins (nominal yields and staterror modifier) and substituted zeros with 1e-5. Below is the utility function for reference. Although I would probably need to add another loop if I include systematic variations.

def zeroless_ws(workspace):
    '''
    Utility function to look all the bins of the workspace and substitute zeros (nominal and statterror) with 1e-5
    this is a temporary fix unlti the next release of cabinetry+pyhf that could handle zero statterror
    '''
    # loop over regions
    for region_idx in range( len(workspace['channels']) ):
        region = workspace['channels'][region_idx] #dict
        # print("region name %s" % region['name'])

        # loop over samples
        for sample_idx in range( len(region['samples']) ):
            sample = region['samples'][sample_idx] #dict
            # print(sample) #dict

            # loop over sample data bins and replace zeros
            for bin_idx in range( len(sample['data']) ):
                counts = sample['data'][bin_idx]
                # print(counts)
                if counts == 0.0:
                    print("found an empty bin! \t substituting with 1e-5...")
                    sample['data'][bin_idx] = 1e-5

            staterror_modifier = sample['modifiers'][0] #dict
            # double check this is a statterror type
            if not staterror_modifier['type'] == 'staterror':
                print("another type of modifier found! \t breaking!")
                break
            # loop over sample modifiers staterror bins and replace zeros
            for bin_idx in range( len(staterror_modifier['data']) ):
                counts = staterror_modifier['data'][bin_idx]
                # print(counts)
                if counts == 0.0:
                    print("found an empty staterror bin! \t substituting with 1e-5...")
                    staterror_modifier['data'][bin_idx] = 1e-5

# fix workspace, i.e. replace zero bin yields and staterrors with 1e-5
zeroless_ws(ws)

alexander-held commented 2 years ago

Thank you for confirming that this was the issue in your setup. Given that the next release of pyhf will resolve this, I'll link this issue to be resolved by https://github.com/scikit-hep/cabinetry/pull/301 (no changes needed in that PR to address this specifically, but it introduces compatibility with the next pyhf release and thereby will solve this issue too).

We could also consider adding this workaround to cabinetry directly. I am personally in favor of avoiding these kind of "magic" fixes where possible, since they may be unintuitive to users. I am happy to do so though if there is interest.

scikit-hep / cabinetry

Error when changing bins in VR #340