scikit-hep / pyhf

pure-Python HistFactory implementation with tensors and autodiff
https://pyhf.readthedocs.io/
Apache License 2.0

JSON schema validation failure when loading Model with dummy spec #327

Closed dantrim closed 6 years ago

dantrim commented 6 years ago

Description

I am not entirely sure what is incorrect with a dummy spec that I have created for loading into a Model. I first tried a JSON spec created by loading in my HistFactory XMLs, then whittled it down to a dummy spec with as few things as possible, and finally went back to creating a Python dictionary on the fly within some test code.

Expected Behavior

The Model loads successfully from the input Python dictionary.

Actual Behavior

A lot of terrible-looking JSON schema validation output, which makes it difficult to pinpoint where in the JSON file the error is located.

Steps to Reproduce

git clone -b master https://github.com/diana-hep/pyhf.git
cd pyhf/
export PYTHONPATH=${PWD}:${PYTHONPATH}
cd ..
python test_schema.py

where test_schema.py is attached. The output I see is also attached as "failure.log".

failure.log.zip test_schema.py.zip

kratsg commented 6 years ago

Hi @dantrim, in the future, just copy-paste the code into GitHub (or use gist.github.com) to make it easier to debug. The error occurs because, at the top level, the schema is looking for {'channels': [...]}, so you need to change

pdf = Model(source['channels'])

to

pdf = Model(source)

and it should work.
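To illustrate the difference between the two calls, here is a minimal sketch that does not use pyhf at all. The `source` dictionary and the helper `looks_like_model_spec` are hypothetical and only mimic the top-level check described above; the real schema validates much more than this.

```python
# Hypothetical minimal spec; channel, sample, and modifier values are illustrative only.
source = {
    "channels": [
        {
            "name": "singlechannel",
            "samples": [
                {
                    "name": "signal",
                    "data": [5.0, 10.0],
                    "modifiers": [
                        {"name": "mu", "type": "normfactor", "data": None}
                    ],
                },
            ],
        }
    ]
}


def looks_like_model_spec(spec):
    """Return True if `spec` is a dict with a top-level 'channels' list.

    This is an assumed stand-in for the top-level part of the schema check,
    not pyhf's actual validation.
    """
    return isinstance(spec, dict) and isinstance(spec.get("channels"), list)


print(looks_like_model_spec(source))              # the whole dict: True
print(looks_like_model_spec(source["channels"]))  # only the list: False
```

Passing `source['channels']` hands over a bare list, so there is no top-level 'channels' key left for the schema to find.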

kratsg commented 6 years ago

Additionally, you can see this from the spec we define here: https://github.com/diana-hep/pyhf/blob/master/pyhf/data/spec.json#L3-L8.

dantrim commented 6 years ago

Thanks @kratsg, that worked for my example there (I had made a copy-and-paste error based on the example notebook, where it passes source['channels'], but source in that case is not a model spec). However, when I now use the JSON created by readxml from a full HistFitter workspace, I get a similar problem. There are 30 nuisance parameters (NPs) and ~6 samples with a couple of control regions, so understanding the output is a bit more challenging...

The JSON model spec created by readxml is attached (wwbb_config_full.json.zip).

wwbb_config_full.json.zip

kratsg commented 6 years ago

Well, the JSON created by readxml contains a toplvl entry, so you need to do something like what our command-line interface does (https://github.com/diana-hep/pyhf/blob/master/pyhf/commandline.py#L51-L91). Simplified, your procedure is something like

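    import pyhf
    import pyhf.readxml
    # Parse the HistFactory XML; the result holds more than just the channels.
    parsed_xml = pyhf.readxml.parse('validation/xmlimport_input2/config/example.xml',
                                    'validation/xmlimport_input2')
    # Model expects a spec with 'channels' at the top level, so keep only that entry.
    spec = {'channels': parsed_xml['channels']}
    pdf = pyhf.Model(spec, poiname='SigXsecOverSM')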

as taken from https://github.com/diana-hep/pyhf/blob/master/tests/test_import.py#L54-L69.
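For a JSON file that was already written out by readxml, the same reshaping can be sketched with the standard library alone. The JSON string below is a hypothetical stand-in for the file contents (the real output has more inside it), and `channels_only` is an illustrative helper, not part of pyhf.

```python
import json

# Hypothetical stand-in for a readxml-produced JSON file; contents are placeholders.
raw = '{"toplvl": {}, "channels": [{"name": "channel1", "samples": []}]}'
parsed = json.loads(raw)


def channels_only(parsed):
    """Build a dict holding only the 'channels' entry of `parsed`.

    Raises ValueError if `parsed` is not a dict with a 'channels' key, so the
    failure shows up here instead of deep inside schema validation.
    """
    if not isinstance(parsed, dict) or "channels" not in parsed:
        raise ValueError("expected a dict with a 'channels' key")
    return {"channels": parsed["channels"]}


spec = channels_only(parsed)
print(sorted(spec))  # ['channels']
```

The resulting `spec` would then be what gets passed to the Model constructor, together with the name of the parameter of interest.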

kratsg commented 6 years ago

This is something we do need to improve on, as the readxml output and the spec that Model expects are slightly separated.

dantrim commented 6 years ago

Ah! Ok I see, thanks @kratsg. I will give that a try a bit later and let you know how it goes. Thanks for the super quick replies.