cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0
1.07k stars 4.29k forks source link

Deployment of likelihood fits in DQMGenericClient etc #43722

Open makortel opened 8 months ago

makortel commented 8 months ago

This issue is to track the deployment of likelihood fits and Minuit2 as a followup to the discussion at the end of https://github.com/cms-sw/cmssw/pull/43106 .

Short history: ROOT 6.30 switched the default minimizer from Minuit to Minuit2. With the chi2 fits in e.g. DQMGenericClient this change resulted in VariableMetricBuilder Initial matrix not pos.def. exceptions that were reported in https://github.com/cms-sw/cmssw/issues/42979). The suggested fix was to change the fits to likelihood instead of chi2. This change was done in https://github.com/cms-sw/cmssw/pull/43106 for DQMGenericClient, that resulted later physics validation failures as reported in https://github.com/cms-sw/cmssw/pull/43106#issuecomment-1883349796 .

The suggestion is that in parallel of the workarounds to continue using ROOT 6.30 (discussed in https://github.com/cms-sw/cmssw/pull/43106#issuecomment-1894196153 onwards)

I suppose each step would require some level of validation.

makortel commented 8 months ago

assign pdmv, dqm

cmsbuild commented 8 months ago

New categories assigned: pdmv,dqm

@AdrianoDee,@sunilUIET,@miquork,@rvenditti,@syuvivida,@tjavaid,@nothingface0,@antoniovagnerini you have been requested to review this Pull request/Issue and eventually sign? Thanks

cmsbuild commented 8 months ago

cms-bot internal usage

cmsbuild commented 8 months ago

A new Issue was created by @makortel Matti Kortelainen.

@makortel, @Dr15Jones, @antoniovilela, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

guitargeek commented 8 months ago

Note that the error with fit about the Hesse Matrix not being positive definite was already a problem with the old Minuit (as I figures out by re-fitting the DQM histograms standalone). It's just that they got not printed via the ROOT Error logging, which is converted to exceptions by CMSSW.

In my opinion, the appropriate hotfix would therefore be what @Dr15Jones suggested here, ignoring such exceptions: https://github.com/cms-sw/cmssw/issues/42979#issuecomment-1757905502

makortel commented 8 months ago

If printing out the error should be the only difference between Minuit and Minuit2, then I agree that turning the ROOT error message to a printout instead of an exception would be better than going back to Minuit. I believe we should still revert the fit method from likelihood to chi2 until the fit ranges have been adjusted.

Another question is then if we should leave the VariableMetricBuilder Initial matrix not pos.def. error message as non-fatal in the long term as well? If the cause of the message is "just" bad input, is that good-enough reason to terminate the job?

makortel commented 8 months ago

In my opinion, the appropriate hotfix would therefore be what @Dr15Jones suggested here, ignoring such exceptions: #42979 (comment)

Anyway, this is now done in https://github.com/cms-sw/cmssw/pull/43726

smuzaffar commented 8 months ago

After https://github.com/cms-sw/cmssw/pull/43726, I guess we should revert

makortel commented 8 months ago

I guess we should revert

@smuzaffar Are you planning to revert these PRs? Or what is the plan? (sorry I missed ORP)

smuzaffar commented 8 months ago

@makortel , there was no discussion about these during ORP but I am in favor of reverting these. I am opening a revert PR now

smuzaffar commented 8 months ago

I just have opened https://github.com/cms-sw/cmssw/pull/43781 for alca and https://github.com/cms-sw/cmssw/pull/43782 for dqm

makortel commented 8 months ago

there was no discussion about these during ORP

Ok... I tried to make a comment of these with the google form. @cms-sw/orp-l2 please take note, these reverts are needed to restore the DQMGenericClient (et al) fits to behave as before 13_3_0_pre5 (where validation failures were reported), and in my opinion should be included in the pre3.

but I am in favoid of reverting these. I am opening a revert PR now

Thanks!

makortel commented 8 months ago

assign alca

Because of https://github.com/cms-sw/cmssw/pull/43781 / https://github.com/cms-sw/cmssw/pull/43588

cmsbuild commented 8 months ago

New categories assigned: alca

@saumyaphor4252,@perrotta,@consuegs you have been requested to review this Pull request/Issue and eventually sign? Thanks