scikit-hep / pyhf

pure-Python HistFactory implementation with tensors and autodiff
https://pyhf.readthedocs.io/
Apache License 2.0
281 stars 83 forks source link

Constraint pdf for models without constraint terms #1390

Open alexander-held opened 3 years ago

alexander-held commented 3 years ago

Question

Slightly related to #1371, it is not possible to use Model.constraint_logpdf for models without constraint terms. This is an edge case, and one might argue that this should not work if no constraint term is present. I would suggest either catching this with an error, or returning 0, unless a different part of the API could be used to get the likelihood contribution per term?

Example:

import pyhf

spec = {
    "channels": [
        {
            "name": "Signal_region",
            "samples": [
                {
                    "data": [25.0],
                    "modifiers": [
                        {"data": None, "name": "mu", "type": "normfactor"},
                    ],
                    "name": "Signal",
                },
                {"data": [100.0], "modifiers": [], "name": "Background"},
            ],
        }
    ],
    "measurements": [
        {"config": {"parameters": [], "poi": "mu"}, "name": "minimal_example"}
    ],
    "observations": [{"data": [112.0], "name": "Signal_region"}],
    "version": "1.0.0",
}

workspace = pyhf.Workspace(spec)
model = workspace.model()
data = workspace.data(model)

# Poisson term
main_data = pyhf.tensorlib.astensor(data)
print(pyhf.tensorlib.poisson_dist(main_data).log_prob(main_data))

# constraint term
aux_data = pyhf.tensorlib.astensor([])  # model has no auxdata
pars = pyhf.tensorlib.astensor(model.config.suggested_init())
print(model.constraint_logpdf(aux_data, pars))

returns

[-3.27893201]
Traceback (most recent call last):
  File "test.py", line 37, in <module>
    print(model.constraint_logpdf(aux_data, pars))
  File "[...]/pyhf/src/pyhf/pdf.py", line 658, in constraint_logpdf
    return self.make_pdf(pars)[1].log_prob(auxdata)
  File "[...]/pyhf/src/pyhf/probability.py", line 244, in __getitem__
    return self._pdfobjs[index]
IndexError: list index out of range

since the pdf built in https://github.com/scikit-hep/pyhf/blob/98bb222d9c45ad13335ab5423d866e3a1508c329/src/pyhf/pdf.py#L658 has only one component.

Relevant Issues and Pull Requests

1371

lukasheinrich commented 3 years ago

I agree we should catch this, but in some ways the user already knows that this call will fail. Since you need to somehow construct aux_data from scratch so you know the model is constraint-less and calling model.constraint_logpdf(aux_data,pars) does not make sense.

what would be your preferred behavior?

alexander-held commented 3 years ago

I am not sure what behavior I would prefer. I think one way a user can construct aux_data without knowing it will be empty is via slicing

main_data = pyhf.tensorlib.astensor(data[:model.config.nmaindata])
aux_data = pyhf.tensorlib.astensor(data[model.config.nmaindata:])

The least disruptive solution may be to return 0.0 as evaluated constraint term if no term is actually present. I think raising an error is also fine, and maybe more informative. Users who evaluate the terms separately likely already know enough about the model so that neither behavior would be an issue.