scikit-hep / pyhf

pure-Python HistFactory implementation with tensors and autodiff
https://pyhf.readthedocs.io/
Apache License 2.0

Keras API (Tensorflow backend) Neural Network Interpolator #835

Open sjiggins opened 4 years ago

sjiggins commented 4 years ago

Question

Looking through the source code, the interpolator package contains the interpolation algorithms outlined in HistFactory.

With this in mind, I'd like to create an interpolator that utilises a Keras API (TensorFlow backend) neural network. How should I proceed with this?

As a follow-up, is it possible to set the interpolation method for a modifier from the spec = {} pdf model syntax?
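
To make the second question concrete, here is roughly the handle I am imagining. This is only a sketch: the modifier_settings / interpcode names below are my guess at where such a switch would live, and the spec numbers are toy values, so please correct me if this is the wrong hook.

import pyhf

spec = {
    "channels": [
        {
            "name": "singlechannel",
            "samples": [
                {
                    "name": "background",
                    "data": [50.0, 60.0],
                    "modifiers": [
                        {
                            "name": "bkg_shape",
                            "type": "histosys",
                            "data": {"lo_data": [45.0, 55.0], "hi_data": [55.0, 65.0]},
                        }
                    ],
                }
            ],
        }
    ]
}

# my guess: pick the interpolation per modifier type at Model construction,
# with a hypothetical NN-based code one day sitting alongside the polynomial ones
model = pyhf.Model(
    spec,
    poi_name=None,
    modifier_settings={"histosys": {"interpcode": "code4p"}},
)

If such a switch exists (or could exist), a NN-based interpolator would presumably just need to register a new code name for it to point to.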

Relevant Issues and Pull Requests

No other issues or relevant pull requests exist.

lukasheinrich commented 4 years ago

Hi @sjiggins,

we have a similar project here

https://github.com/lukasheinrich/pyhf-gpsys

that uses sklearn Gaussian Processes as an interpolator. This is definitely in scope but might require a bit of thought on the API side.

sjiggins commented 4 years ago

Hi @lukasheinrich,

Thanks for the link; it is informative on how to add an interpolator. I was reading through several of the demonstrations, specifically this one:

https://github.com/pyhf/pyhf-gpsys/blob/master/Demo.ipynb

From this I am unsure, for this part of the code:

'modifiers': [
    {
        'name': 'mygp',
        'type': 'gpsys',
        ..............

how the type gpsys becomes visible to the infer.mle.fit() method that I'd eventually like to use. Sorry if this is already answered in the code/demo.

Kind Regards

lukasheinrich commented 4 years ago

Hi @sjiggins, can you detail your use case a bit? I'm curious in what context this is needed. Generally the Model object will have all the interpolation fixed, so infer.mle.fit will just use whatever is defined in the model.
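
For reference, a minimal end-to-end sketch with a built-in histosys modifier (toy numbers throughout): whatever interpolation the modifiers were constructed with is baked into the Model, and the fit only sees the resulting likelihood.

import pyhf

spec = {
    "channels": [
        {
            "name": "ch",
            "samples": [
                {
                    "name": "signal",
                    "data": [5.0],
                    "modifiers": [{"name": "mu", "type": "normfactor", "data": None}],
                },
                {
                    "name": "background",
                    "data": [50.0],
                    "modifiers": [
                        {
                            "name": "shape_np",
                            "type": "histosys",
                            "data": {"lo_data": [45.0], "hi_data": [56.0]},
                        }
                    ],
                },
            ],
        }
    ]
}
model = pyhf.Model(spec)

# whichever interpolation the histosys modifier was built with already lives
# inside `model`; the fit just evaluates the resulting likelihood
data = [53.0] + model.config.auxdata
print(pyhf.infer.mle.fit(data, model))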

sjiggins commented 4 years ago

Hi @lukasheinrich,

First, a declaration of my understanding.

I believe the interpolator package contains all the interpolation algorithms used to interpolate between nom_data and the lo_data/hi_data points defined when constructing a 'type': 'histosys' modifier.

The idea here is that, instead of using a polynomial interpolation for each bin in a given histogram, I'd like to use a neural network as the interpolation function. This NN is trained to interpolate between the nom_data and lo/hi_data points as a function of the nuisance parameter \alpha.

Therefore I essentially need to construct a Python object that is called when performing the interpolation for a specified pdf model.
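
Very roughly, something like the following is the kind of object I mean. The class and its call convention are entirely my own sketch (nothing that exists in pyhf), and the NN is assumed to map \alpha to an additive per-bin shift on nom_data:

import numpy as np
import tensorflow as tf

class NNHistoInterpolator:
    """Toy sketch: return the expected histogram at a given alpha."""

    def __init__(self, keras_model, nom_data):
        self.model = keras_model  # trained elsewhere to map alpha -> per-bin delta
        self.nom_data = np.asarray(nom_data, dtype="float32")

    def __call__(self, alpha):
        delta = self.model.predict(np.array([[alpha]], dtype="float32"), verbose=0)[0]
        return self.nom_data + delta  # additive, histosys-style

# untrained toy network just to show the shapes involved
nbins = 2
net = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(1,)),
        tf.keras.layers.Dense(8, activation="relu"),
        tf.keras.layers.Dense(nbins),
    ]
)
interp = NNHistoInterpolator(net, nom_data=[50.0, 60.0])
print(interp(alpha=1.0))  # once trained, this should reproduce hi_data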

Hope this is clear.

Kind Regards

sjiggins commented 4 years ago

Hi @lukasheinrich & @matthewfeickert,

Looking through the code, and after some offline discussion, I believe what I wish to achieve amounts to two new features:

1) Create an interpolation class that takes a Keras-built/trained NN and stores it internally, plus stores all input data so as to re-sample and generate a new expectation for each histogram.

2) Build a sampler that takes the unbinned data and populates a user-defined sampling grid of points, so as to replace lo/hi_data with many points, each for a specific value of the nuisance parameter.

For the former I believe I need to build something similar to one of the existing interpolation classes, but for the latter is there any recommendation on how one would then use the interpolation classes with more than 3 points (i.e. points at specific values of the nuisance parameter)? A rough sketch of what I have in mind is below.
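
To make (2) a bit more concrete, here is a very rough sketch of the sampler plus training step I have in mind. Everything here is illustrative: vary_events just stands in for however the analysis would actually produce a varied histogram at a given \alpha (reweighting, re-running the shift, etc.), and the binning and grid are toy choices.

import numpy as np
import tensorflow as tf

def sample_grid(unbinned, bin_edges, alphas, vary_events):
    """Histogram the varied unbinned data at each alpha on the grid."""
    targets = []
    for alpha in alphas:
        values, weights = vary_events(unbinned, alpha)
        hist, _ = np.histogram(values, bins=bin_edges, weights=weights)
        targets.append(hist)
    return np.asarray(targets, dtype="float32")

# toy inputs
rng = np.random.default_rng(0)
unbinned = rng.normal(0.0, 1.0, size=10_000)
bin_edges = np.linspace(-3.0, 3.0, 3)   # 2 bins, just for illustration
alphas = np.linspace(-2.0, 2.0, 41)     # user-defined sampling grid

def vary_events(events, alpha):
    # stand-in systematic: shift the whole distribution by 0.1 * alpha
    return events + 0.1 * alpha, np.ones_like(events)

targets = sample_grid(unbinned, bin_edges, alphas, vary_events)
nominal = sample_grid(unbinned, bin_edges, [0.0], vary_events)[0]
deltas = targets - nominal              # per-bin shifts the NN should learn

net = tf.keras.Sequential(
    [
        tf.keras.Input(shape=(1,)),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(deltas.shape[1]),
    ]
)
net.compile(optimizer="adam", loss="mse")
net.fit(alphas.reshape(-1, 1), deltas, epochs=50, verbose=0)

The trained net could then be handed to something like the interpolator object sketched in my previous comment.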

Thank you @lukasheinrich for a fruitful discussion a couple of weeks ago on this matter.

Kind Regards, Stephen Jiggins