pyhf / pyhf-validation

Validation utilities for HistFactory workspaces
Apache License 2.0
2 stars 1 forks source link

ci: Add tests in pyhf-validation-root-base Docker image #9

Open matthewfeickert opened 4 years ago

matthewfeickert commented 4 years ago

Add tests that run inside of the pyhf/pyhf-validation-root-base Docker image (where ROOT 6.22.02 is installed). This is done by using the container job option, which allow for specification of an arbitrary Docker image to launch into.

A more in depth example of container based workflow can be found here.

Checklist Before Requesting Reviewer

Before Merging

For the PR Assignees:

* Add testing to the CI using the 'pyhf/pyhf-validation-root-base' Docker image
* Uses the 'container' option for GitHub Actions
* Fix replacement of lumi nuisance parameter in nuisance parameter comparison scripts
codecov[bot] commented 4 years ago

Codecov Report

Merging #9 (2ae3c60) into master (e7e3b3d) will not change coverage. The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master       #9   +/-   ##
=======================================
  Coverage   90.90%   90.90%           
=======================================
  Files           3        3           
  Lines          11       11           
=======================================
  Hits           10       10           
  Misses          1        1           
Flag Coverage Δ
unittests 90.90% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.


Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update e7e3b3d...2ae3c60. Read the comment docs.

lukasheinrich commented 4 years ago

we could create ROOT workspaces using json2xml + xml2workspace from the public sbottom likelihood

matthewfeickert commented 4 years ago

we could create ROOT workspaces using json2xml + xml2workspace from the public sbottom likelihood

Good idea. I should have thought of this given that you and @kratsg had already suggested this when we originally posted some TODOs in the Issues.

matthewfeickert commented 4 years ago

scripts/JSON_to_workspace.sh: line 45: hist2workspace: command not found

To my embarrassment, somehow the current version of the pyhf/pyhf-validation-root-base Docker image knows about PyROOT but doesn't have root or hist2workspace in PATH. :? This is very strange.

matthewfeickert commented 4 years ago

Self note: Things to follow up on later (edit: after fixing lumi string conversion)

Nuisance params unique to root:

- `compare_fitted_nuisance.py` gives

########### Printing nuisance parameter comparisons to screen #############

param pyhf val root val abs diff % diff

lumi 9.999041e-01 9.999047e-01 -5.789377e-07 -0.000058
EG_RESOLUTION_ALL 4.612360e-03 4.704046e-03 -9.168577e-05 -1.987828
EG_SCALE_ALL -7.499355e-03 -7.619664e-03 1.203096e-04 -1.604266
EL_EFF_ChargeIDSel_TOTAL_1NPCOR_PLUS_UNCOR0.000000e+00 0.000000e+00 0.000000e+00 0.000000
EL_EFF_ID_TOTAL_1NPCOR_PLUS_UNCOR -2.050934e-03 -2.061238e-03 1.030395e-05 -0.502403
EL_EFF_Iso_TOTAL_1NPCOR_PLUS_UNCOR -3.020117e-02 -3.018264e-02 -1.853162e-05 0.061361
EL_EFF_Reco_TOTAL_1NPCOR_PLUS_UNCOR 3.148846e-04 3.296803e-04 -1.479566e-05 -4.698755
EL_EFF_TriggerEff_TOTAL_1NPCOR_PLUS_UNCOR 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
EL_EFF_Trigger_TOTAL_1NPCOR_PLUS_UNCOR 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
FT_EFF_B_systematics -8.226148e-03 -7.876455e-03 -3.496934e-04 4.250998
FT_EFF_C_systematics -2.304282e-02 -2.247763e-02 -5.651925e-04 2.452793
FT_EFF_Light_systematics -1.442350e-01 -1.435668e-01 -6.681248e-04 0.463220
FT_EFF_extrapolation -1.017101e-01 -1.015515e-01 -1.585868e-04 0.155920
FT_EFF_extrapolation_from_charm -7.146362e-02 -7.149188e-02 2.826320e-05 -0.039549
JET_EtaIntercalibration_NonClosure_highE 5.297120e-05 6.585029e-05 -1.287909e-05 -24.313382
JET_EtaIntercalibration_NonClosure_negEta 1.053397e-03 1.075789e-03 -2.239210e-05 -2.125704
JET_EtaIntercalibration_NonClosure_posEta -6.598275e-04 -6.669702e-04 7.142699e-06 -1.082510
JET_Flavor_Response -1.211315e-01 -1.229163e-01 1.784819e-03 -1.473456
JET_GroupedNP_1 3.895620e-02 3.980564e-02 -8.494396e-04 -2.180499
JET_GroupedNP_2 8.902315e-02 9.045789e-02 -1.434742e-03 -1.611651
JET_GroupedNP_3 -8.649500e-02 -8.762272e-02 1.127723e-03 -1.303802
JET_JER_DataVsMC 3.874304e-03 3.914144e-03 -3.983995e-05 -1.028312
JET_JER_EffectiveNP_1 -1.367855e-02 -1.353319e-02 -1.453564e-04 1.062659
JET_JER_EffectiveNP_2 -9.716172e-03 -9.529539e-03 -1.866322e-04 1.920841
JET_JER_EffectiveNP_3 -1.004252e-02 -9.914179e-03 -1.283391e-04 1.277957
JET_JER_EffectiveNP_4 -1.157892e-02 -1.146194e-02 -1.169860e-04 1.010336
JET_JER_EffectiveNP_5 -1.125749e-02 -1.106993e-02 -1.875624e-04 1.666111
JET_JER_EffectiveNP_6 -7.609317e-03 -7.515067e-03 -9.425056e-05 1.238621
JET_JER_EffectiveNP_7restTerm -8.836154e-03 -8.466758e-03 -3.693960e-04 4.180507
JET_JvtEfficiency 6.653292e-03 6.788719e-03 -1.354272e-04 -2.035491
MET_SoftTrk_ResoPara -1.122048e-02 -1.120073e-02 -1.974629e-05 0.175984
MET_SoftTrk_ResoPerp -5.237057e-03 -5.170010e-03 -6.704759e-05 1.280253
MET_SoftTrk_Scale 2.668171e-02 2.720399e-02 -5.222818e-04 -1.957452
MUON_EFF_BADMUON_STAT 1.150481e-08 0.000000e+00 1.150481e-08 100.000000
MUON_EFF_BADMUON_SYS 4.984433e-07 0.000000e+00 4.984433e-07 100.000000
MUON_EFF_ISO_STAT -2.213125e-04 -2.165684e-04 -4.744056e-06 2.143600
MUON_EFF_ISO_SYS 2.798518e-04 2.900454e-04 -1.019354e-05 -3.642476
MUON_EFF_RECO_STAT 1.182098e-04 1.312382e-04 -1.302839e-05 -11.021413
MUON_EFF_RECO_SYS 5.625014e-04 5.741601e-04 -1.165876e-05 -2.072663
MUON_EFF_TTVA_STAT 5.297959e-07 1.151321e-05 -1.098341e-05 -2073.140503
MUON_EFF_TTVA_SYS -5.713945e-06 1.160404e-05 -1.731798e-05 303.082742
MUON_EFF_TrigStatUncertainty 1.094673e-07 0.000000e+00 1.094673e-07 100.000000
MUON_EFF_TrigSystUncertainty 1.872494e-07 0.000000e+00 1.872494e-07 100.000000
MUON_ID -5.514388e-03 -5.604341e-03 8.995379e-05 -1.631256
MUON_MS -7.825303e-03 -8.072155e-03 2.468518e-04 -3.154534
MUON_SAGITTA_RESBIAS 1.558436e-02 1.520452e-02 3.798361e-04 2.437291
MUON_SAGITTA_RHO 0.000000e+00 0.000000e+00 0.000000e+00 0.000000
MUON_SCALE -2.634326e-03 -2.654062e-03 1.973623e-05 -0.749195
SigRad 4.952022e-05 -2.134732e-04 2.629935e-04 531.082941
ttH_theory 2.189601e-02 2.325871e-02 -1.362701e-03 -6.223512
ttZ_theory 8.329884e-03 8.688973e-03 -3.590890e-04 -4.310852
ttbar_FSR -3.424808e-01 -3.421057e-01 -3.750301e-04 0.109504
ttbar_Gen 5.526187e-01 5.601142e-01 -7.495502e-03 -1.356361
ttbar_ISR_Down 6.604958e-01 6.682535e-01 -7.757710e-03 -1.174528
ttbar_ISR_Up 1.528636e-01 1.462661e-01 6.597444e-03 4.315903
ttbar_PS -1.729353e-01 -1.746909e-01 1.755532e-03 -1.015138
staterror_SR_meff_0 1.015946e+00 1.016074e+00 -1.282399e-04 -0.012623
staterror_SR_meff_1 9.810488e-01 9.810100e-01 3.874903e-05 0.003950
staterror_SR_meff_2 9.923182e-01 9.921207e-01 1.974996e-04 0.019903
mu_ttbar 9.580629e-01 9.584793e-01 -4.163798e-04 -0.043461
mu_SIG 3.662478e-14 3.293900e-07 -3.293900e-07 -899363724.041528



Things that aren't clear to me:
- Why ROOT doesn't have `CRtt` nuisance parameters
- Why some nuisance parameters are exactly zero for both ROOT and pyhf
- Why some nuisance parameters are exactly zero for only ROOT
- Why some nuisance parameters differ between ROOT and pyhf by as much as `O(e-03)`
kratsg commented 4 years ago
  • Why ROOT doesn't have CRtt nuisance parameters

because they're zero and automatically pruned. We don't do that in pyhf yet.

  • Why some nuisance parameters are exactly zero for both ROOT and pyhf

depends on the workspace/analysis. not concerning as long as both ROOT and pyhf agree

  • Why some nuisance parameters are exactly zero for only ROOT

likely ROOT auto-prunes below a certain value to 0 having it be negligible, and we just can't tell.

  • Why some nuisance parameters differ between ROOT and pyhf by as much as O(e-03)

all of these seem to be at the 1% difference level, so I'm not generally concerned.

matthewfeickert commented 4 years ago

JSON_to_workspace seems smarter to me to have a Makefile variant of that instead of a shell script.

I don't normally think of Makefiles as taking command line arguments outside of the target. What is the advantage here to hardcoding something in the Makefile? Or am I misunderstanding what you're suggesting?

kratsg commented 4 years ago

I don't normally think of Makefiles as taking command line arguments outside of the target. What is the advantage here to hardcoding something in the Makefile? Or am I misunderstanding what you're suggesting?

Makefiles don't take arguments really. You can do like make root-workspace and that will make files from input files -> output files. Think of it like a compilation, right? There's not a huge reason why a makefile wouldn't work instead?

matthewfeickert commented 4 years ago

Makefiles don't take arguments really

This is kinda my point. If we were to remove the Bash script and move all of that logic into a Make target then we'd have to hardcode everything.

There's not a huge reason why a makefile wouldn't work instead?

scripts/JSON_to_workspace.sh works for generic arguments. In the CI we can do

bash scripts/JSON_to_workspace.sh \
  sbottom \
  https://doi.org/10.17182/hepdata.89408.v1/r2 \
  RegionA/BkgOnly.json \
  RegionA/patch.sbottom_1300_205_60.json \
  RegionA/sbottom_1300_205_60.json

for the specific test there, but we can use this for analyses in the future. Why do we want to hardcode this one particular example case in a Makefile?

kratsg commented 4 years ago

This is kinda my point. If we were to remove the Bash script and move all of that logic into a Make target then we'd have to hardcode everything.

Makefiles don't hardcode names. They hardcode patterns. You can define a pattern target, e.g. patch.%.json would be an example, and that would automatically extract out sbottom_1300_205_60 and so on.

matthewfeickert commented 4 years ago

You can define a pattern target, e.g. patch.%.json would be an example, and that would automatically extract out sbottom_1300_205_60 and so on.

Doesn't this still hardcode a Region and a URL or require a user to download everything? This seems like a loss of functionality. Maybe I should be asking a different question(s): What do you see us gaining by making this all a Make target? What do you see as the disadvantages of using a Bash script that requires user arguments?

kratsg commented 4 years ago

No advantages or disadvantages. The thing with the bash script is you'll regenerate when you re-run (versus a Makefile which only re-runs / re-makes things that have their source changing).