scikit-hep / cabinetry

design and steer profile likelihood fits

https://cabinetry.readthedocs.io/

BSD 3-Clause "New" or "Revised" License

26 stars 21 forks source link

perf: cache yield_stdev using spec and interpcodes of model #322

Closed lhenkelm closed 2 years ago

lhenkelm commented 2 years ago

This implements the changes to hashing discussed in #315.

It was necessary to modify one test that was directly looking up cached results in the yield_stdev cache.

* caching for yield uncertainty calculation now based on model specification and interpolation codes
* caching now works when re-creating pyhf.pdf.Model objects

codecov[bot] commented 2 years ago

Codecov Report

Merging #322 (766573b) into master (40120c2) will not change coverage. The diff coverage is 100.00%.

@@            Coverage Diff            @@
##            master      #322   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           23        23           
  Lines         1881      1889    +8     
  Branches       305       306    +1     
=========================================
+ Hits          1881      1889    +8

Impacted Files	Coverage Δ
src/cabinetry/model_utils.py	`100.00% <100.00%> (ø)`

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 40120c2...766573b. Read the comment docs.

alexander-held commented 2 years ago

Thank you very much for the PR, this all looks good to me. I'd like to see whether we can get a resolution of https://github.com/scikit-hep/pyhf/issues/1762#issuecomment-1030356153, but as far as I can tell this has no impact for the equality comparison done here (where we care about equal expected_data output). Maybe worth adding another sentence to the docstring to prevent possible future confusion: the key ignores measurement config information, but as far as I can tell that matters for neither logpdf nor expected_data, and only for configuration information like config.suggested_bounds().

lhenkelm commented 2 years ago

Thanks for reviewing! I added the warning to the docstring. In the specific case of yield_stdev I agree this makes no difference, since no fits will be done and a full fit result is passed in alongside the model. In general the best thing would be to have pyhf objects that are deliberate about how they hash (in so far as they model values), since capturing everything would involve diving deep into the tree of pyhf objects, many of them internal to pyhf.