Open perrotta opened 1 year ago
A new Issue was created by @perrotta Andrea Perrotta.
@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
assign reconstruction
New categories assigned: reconstruction
@mandrenguyen,@clacaputo you have been requested to review this Pull request/Issue and eventually sign? Thanks
Hi all, sorry for the slow progress on this; I was otherwise engaged with my qualifying exams. I've now put together what I hope is a fix to make the model deterministic, and pushed new TorchScript files to my private data branch here. How can I reproduce the plot at the top of this issue to check whether the fix was successful?
@ssrothman thank you for jumping in. For the tests, maybe you can ask @kpedro88 how they were set up, see https://github.com/cms-sw/cmssw/issues/41060#issue-1625593219. Otherwise you can run the same wf 10805.31 that was mentioned in the issue description, i.e.
runTheMatrix.py -l 10805.31 >& out &
Resolved by #42950
While reviewing https://github.com/cms-data/RecoEgamma-EgammaPhotonProducers/pull/3 it was realized that some of the userFloats of the patPhotons produced in wf 10805.31, SingleGammaPt35+2018_photonDRN, are not reproducible.
This was confirmed by another comparison of the same wf 10805.31 made for PR #40666, which did not touch the photon weights and therefore should have produced identical output in two different runs. Even there, the "randomness" is in some cases rather significant, i.e. somewhat larger than a simple numerical fluctuation in the last digit:
![image](https://user-images.githubusercontent.com/4069749/225331223-91e85c55-0df4-4810-8b3c-06d863d988aa.png)
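To make "larger than a fluctuation in the last digit" quantitative, one option is to count how many representable float32 values lie between the two runs' results. The following is only a minimal sketch under stated assumptions: the function names, the `max_ulps` threshold, and the sample values are hypothetical, and it assumes the userFloats have already been extracted from the two runs into plain Python lists of finite numbers in the same photon order.

```python
import struct


def float32_ulp_distance(a: float, b: float) -> int:
    """Count the representable float32 steps between a and b (finite inputs assumed)."""
    def ordered(x: float) -> int:
        # Reinterpret the float32 bit pattern as a signed integer, then map
        # negative floats so that the integer ordering matches the float ordering.
        (bits,) = struct.unpack("<i", struct.pack("<f", x))
        return bits if bits >= 0 else -(bits & 0x7FFFFFFF)
    return abs(ordered(a) - ordered(b))


def compare_runs(run_a, run_b, max_ulps=4):
    """Return (index, a, b, ulps) for each pair differing by more than max_ulps.

    max_ulps is a hypothetical tolerance for "last digit" noise, not a value
    taken from the validation tools.
    """
    if len(run_a) != len(run_b):
        raise ValueError("the two runs contain a different number of values")
    flagged = []
    for i, (a, b) in enumerate(zip(run_a, run_b)):
        ulps = float32_ulp_distance(a, b)
        if ulps > max_ulps:
            flagged.append((i, a, b, ulps))
    return flagged


# Made-up illustrative values, not real workflow output.
run_a = [1.0, 35.2, 0.5]
run_b = [1.0000001, 35.9, 0.5]
for index, a, b, ulps in compare_runs(run_a, run_b):
    print(f"value {index}: {a} vs {b} differs by {ulps} float32 ULPs")
```

With these sample values only the second pair is reported: the first pair differs by a single float32 step and the third is identical. A deterministic network would be expected to give an empty list for two runs of the same workflow.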
@kpedro88 commented in https://github.com/cms-data/RecoEgamma-EgammaPhotonProducers/pull/3#issuecomment-1458662905: "We also saw this behavior in our private tests last week and realized that there is some randomness inherent to the network itself. The random behavior has been there all along, but https://github.com/cms-sw/cmssw/pull/40814 may actually have been the first time that comparison tests were run for 10805.31 (since it is not part of the short matrix), so it wasn't noticed before. This PR does correctly restore the original weights, but we need to make some more changes to make the network deterministic (this is a work in progress right now)."
This GitHub issue is intended to keep track of that "work in progress" to get rid of the non-deterministic behavior of the network.
@ssrothman