Closed nucleosynthesis closed 6 months ago
FYI this will noticeably slow down text2workspace for large models, as the errline
dictionary is very large. This was also seen in #791 as we tried to reduce the memory footprint.
Thanks Nick - I suspect for many of these, we can revert to the standard dict as most cases will have a benign effect - the main one seems to be the ordering of the constraint terms - If someone wants to try removing the other instances of using OrderedDict, that would be helpful to find out which ones are really needed to maintain reproducibility
On Thu, Apr 11, 2024 at 2:20 PM Nicholas Smith @.***> wrote:
Also relevant is #759 (review) https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/pull/759#pullrequestreview-975858053
— Reply to this email directly, view it on GitHub https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/pull/936#issuecomment-2049681264, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAMEVW7QJX2GQORKN726TL3Y42E3PAVCNFSM6AAAAABGCACQGGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDANBZGY4DCMRWGQ . You are receiving this because you authored the thread.Message ID: @.***>
I think that's a good plan, but perhaps more holistic solution is to refactor the parser along the lines of #818
Thanks for checking! Yes I suspected the others were somewhat overkill - it might be useful to have a PR open with those other ones reverted (if you happen to have that locally from having done the tests) and then once we are confident, we can push that in - but I also second Nick's suggestion to re-think the way we Parse the inputs more generally
On Fri, Apr 12, 2024 at 2:18 PM kcormi @.***> wrote:
@.**** commented on this pull request.
In python/ModelTools.py https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/pull/936#discussion_r1562539971 :
@@ -808,7 +810,7 @@ def doFillNuisPdfsAndSets(self): if p != "constr": nuisVars.add(self.out.var(c_param_name)) setNuisPdf.append(c_param_name)
- setNuisPdf = set(setNuisPdf)
- setNuisPdf = list(dict.fromkeys((setNuisPdf)))
I did a few tests, and it seems that this line (and line 826 below) changing the sets to an ordered type are enough for exact reproducibility of the fit values on the datacards where we saw the issue when recreating the workspace. Though it may be worth understanding this more clearly.
-- On the broader point I agree with your suggestion, @nsmith- https://github.com/nsmith- it would be nice to make those kinds of changes.
— Reply to this email directly, view it on GitHub https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/pull/936#pullrequestreview-1996985409, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAMEVW6AWVBN2S2JWV4QSVDY47NI5AVCNFSM6AAAAABGCACQGGVHI2DSMVQWIX3LMV43YUDVNRWFEZLROVSXG5CSMV3GSZLXHMYTSOJWHE4DKNBQHE . You are receiving this because you authored the thread.Message ID: <cms-analysis/HiggsAnalysis-CombinedLimit/pull/936/review/1996985409@ github.com>
Main fix needed is in ModelTools
setNuisPdf = set(setNuisPdf)
->setNuisPdf = list(dict.fromkeys((setNuisPdf)))