scikit-hep / pyhf

pure-Python HistFactory implementation with tensors and autodiff
https://pyhf.readthedocs.io/
Apache License 2.0
284 stars 84 forks source link

Making pyhf differentiable #882

Open phinate opened 4 years ago

phinate commented 4 years ago

Description

It's become clear to me through the discussions that have taken place in IRIS-HEP and gradhep that it would be awesome to have pyhf as part of the modelling step in a differentiable analysis workflow.

Having backends that support autograd is most of the work, but me and @lukasheinrich have found through experiments in the development of neos that there are still operations that don't play nice with the differentiability of pyhf.Model construction, for instance. This is also just in the situation of using pyhf.tensor.jax_backend() -- I haven't tried this with other backends.

Pull requests/issues

There's already been an attempt to make part of this differentiable in #742, which I made small modifications to in my fork of this branch. (I make an explicit call to override the backend at one point, which isn't pretty...)

Scope

So far, this idea has only been used for the model construction and likelihood evaluation steps, but I wonder if this could also extend to inference in pyhf.infer? This could influence design decisions made for a library to provide differentiable HEP operations, which could be imported into pyhf.infer down the line if the actual underlying functionality doesn't change, e.g. wrapping a scipy optimizer with the two-phase method in the fax library.

All this seems big enough a task that I felt like it warranted an issue as a documented way forward :)

kratsg commented 4 years ago

Rather than overriding defeault_backend to the current tensorlib -- we should provide a way for users to specify an explicit **kwargs = {backend=default_backend} and pass things through as needed. This should maintain backwards-compatibility on some level -- to first attempt.

We generally want to support users switching backends transparently without a huge amount of re-computation.

phinate commented 4 years ago

cc'ing @kratsg about this again (and the relevant PR in #742)

kratsg commented 4 years ago

@phinate I just went ahead and did the rebase. Try it now. To be honest, there's not a ton of changes now...

lukasheinrich commented 3 years ago

I'm reactivating this based on the recent #1625 PR. The goal is to make this fucnttion diffable (and removiing the .tolist() calls.

def makeit(
    pars,
    nominal,
    uncorr_data,
    corrup_data,
    corrdn_data,
    stater_data,
    normsys_up,
    normsys_dn,
    lumi_sigma,
  ):
  spec = {
      "channels": [
          { "name": "secondchannel",
            "samples": [
              { "name": "background",
                "data": nominal,
                "modifiers": [ 
                  { "name": "mu", "type": "normfactor", "data": None},
                  { "name": "lumi", "type": "lumi", "data": None },
                  { "name": "mod_name", "type": "shapefactor", "data": None },
                  { "name": "uncorr_bkguncrt2", "type": "shapesys", "data": uncorr_data} ,
                  { "name": "corr_bkguncrt2", "type": "histosys", "data": {'hi_data': corrup_data, 'lo_data': corrdn_data}},
                  { "name": "staterror", "type": "staterror", "data": stater_data},
                  { "name": "norm", "type": "normsys", "data": {'hi': normsys_up, 'lo': normsys_dn}},
                ]
              }
            ]
          }
      ],
  }
  model = pyhf.Model({'channels': spec['channels'], 'parameters': [{'name':'lumi', 'auxdata': [1.0], 'bounds': [[0.5,1.5]], 'inits': [1.0], "sigmas": [lumi_sigma]}]})

  exp_data = model.expected_data(pars)

  return model.logpdf(pars,exp_data)

makeit(
  pars = tb.astensor([0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]).tolist(),
  nominal = tb.astensor([60.0, 62.0]).tolist(),
  uncorr_data = tb.astensor([5.0, 9.0]).tolist(),
  corrup_data = tb.astensor([65,65]).tolist(),
  corrdn_data = tb.astensor([55,55]).tolist(),
  stater_data = tb.astensor([10.,10]).tolist(),
  normsys_up = tb.astensor(1.05).tolist(),
  normsys_dn = tb.astensor(0.95).tolist(),
  lumi_sigma= tb.astensor(0.017).tolist(),
)
kratsg commented 3 years ago

On my to do list. I know how to make it happen.

phinate commented 3 years ago

Great to see efforts on this again, super exciting for me in particular :)

~Just as an FYI, if you're able to commit to your user forks of pyhf for now @kratsg @lukasheinrich until functionality is confirmed, that would be preferred, since I currently have a few examples working that checkout the current state of this PR.~

Edit: ah, this is the issue! I saw a message about a rebase and assumed otherwise...

Of course, this wouldn't interfere with #742, my mistake :P