Open guitargeek opened 1 year ago
One of the project goals is to support setting up the workspace for likelihood fits purely from Python dictionaries, without using RooFit objects or JSON string literals.
One good target for this is the creation of HistFactory models, which can be done by importing a full HS3 JSON, as described in this tutorial: https://root.cern/doc/master/rf515__hfJSON_8py.html
With the PRs that have been merged so far, creating HistFactory pdfs from dictionaries already works. However, the dataset must still be specified via a JSON string literal, as shown in this simplified version of the tutorial:
# Simplified version of the HistFactory JSON IO tutorial:
# https://root.cern/doc/master/rf515__hfJSON_8py.html
# You can also find it in the tutorials/roofit folder of the ROOT repo.
import ROOT
# Python dictionary specifying the model pdf
model_channel1 = {
    "axes": [{"name": "obs_x_channel1", "max": 2.0, "min": 1.0, "nbins": 2}],
    "samples": [
        {
            "data": {"contents": [20, 10]},
            "modifiers": [
                {"data": {"hi": 1.05, "lo": 0.95}, "name": "syst1", "type": "normsys"},
                {"name": "mu", "type": "normfactor"},
            ],
            "name": "signal",
        },
        {
            "data": {"contents": [100, 0], "errors": [5, 0]},
            "modifiers": [
                {"data": {"hi": 1.05, "lo": 0.95}, "name": "syst2", "type": "normsys"},
                {"name": "mcstat", "type": "staterror"},
            ],
            "name": "background1",
        },
        {
            "data": {"contents": [0, 100], "errors": [0, 10]},
            "modifiers": [
                {"data": {"hi": 1.05, "lo": 0.95}, "name": "syst3", "type": "normsys"},
                {"name": "mcstat", "type": "staterror"},
            ],
            "name": "background2",
        },
    ],
    "type": "histfactory_dist",
}
# Python dictionary specifying the binned dataset
observed_channel1 = {
    "axes": [{"name": "obs_x_channel1", "nbins": 2, "min": 1, "max": 2}],
    "contents": [122, 112],
    "type": "binned",
}
# Creating an empty workspace
ws = ROOT.RooWorkspace("workspace")
# Importing the HistFactory pdf from a dictionary specification already works!
ws["model_channel1"] = model_channel1
# It would be nice if the user could also specify datasets like this, so that
# no string literals are needed anywhere in the setup of the likelihood
# analysis (note that this doesn't work yet):
#
# ws["observed_channel1"] = observed_channel1
# Right now, the only way to import a dataset via the JSON IO is to read a full
# HS3 JSON:
ROOT.RooJSONFactoryWSTool(ws).importJSONfromString(
    """
{
    "distributions": [
    ],
    "data": [
        {
            "name": "observed_channel1",
            "axes": [
                {
                    "name": "obs_x_channel1",
                    "nbins": 2,
                    "min": 1,
                    "max": 2
                }
            ],
            "contents": [122, 112],
            "type": "binned"
        }
    ]
}
"""
)
# Both the model_channel1 and the observed_channel1 should be in the workspace now.
ws.Print()
pdf = ws["model_channel1"]
data = ws["observed_channel1"]
# Fit the model pdf to the data to see if things work
result = pdf.fitTo(data, Save=True, PrintLevel=-1)
result.Print()
This workflow should be supported without string literals, meaning it would be good to also support the creation of binned datasets from dictionaries.
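Until that is supported, one possible stopgap is to build the HS3 document as a Python dictionary and serialize it with the standard `json` module, so that no JSON has to be typed by hand. The following is only a sketch: the helper name `dataset_to_hs3_json` is hypothetical and not part of RooFit, and it assumes that a minimal document with an empty `"distributions"` list and a single `"data"` entry, mirroring the string literal in the example above, is accepted by `importJSONfromString`.

```python
import json


def dataset_to_hs3_json(name, spec):
    """Wrap a dataset dictionary in a minimal HS3-style document and serialize it.

    Hypothetical helper for illustration; it mirrors the layout of the
    string literal used in the example above.
    """
    entry = {"name": name, **spec}
    document = {"distributions": [], "data": [entry]}
    return json.dumps(document)


# Same binned dataset specification as in the example above
observed_channel1 = {
    "axes": [{"name": "obs_x_channel1", "nbins": 2, "min": 1, "max": 2}],
    "contents": [122, 112],
    "type": "binned",
}

hs3_json = dataset_to_hs3_json("observed_channel1", observed_channel1)

# With ROOT available, the string can be passed to the same call as above:
# ROOT.RooJSONFactoryWSTool(ws).importJSONfromString(hs3_json)
```

This still goes through a JSON string internally, so it only hides the string literal from the user; it does not replace the requested `ws["observed_channel1"] = observed_channel1` support.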
Pythonic interaction with the RooWorkspace
This issue tracks the progress on the GSoC project on the Pythonic interaction with the RooWorkspace: https://hepsoftwarefoundation.org/gsoc/2023/proposal_RooFit-RooWorkspacePythonization.html
This project was assigned to @yashnator.
Milestones and TODOs
__setitem__ on the workspace (#12911)
RooWorkspace.__setitem__ (#12994)
dict to string conversion to the C++ side
JSONInterface instead
Merged PRs