Open makortel opened 3 years ago
Note that #43439 removes 11634.911 from the short matrix, after which we would no longer see these instabilities in PR tests.
Let me know if you think it is preferable to keep it just to have this "constant reminder" of the issue, or whether this is something we can leave to IB tests.
Good question. PR tests (including the short matrix) should be about ensuring the PRs behave as expected, and therefore I think using PR tests to stress-test reproducibility is likely not the best way.
If there is no other use for 11634.911 in the short matrix (@cms-sw/geometry-l2 could you comment?), I'd be in favor of dropping 11634.911 from the short matrix. Unfortunately, IBs themselves don't provide any facilities for inspecting workflow results. @smuzaffar Maybe we should think about something here, at least for select workflows? (Not really optimal, but maybe better than (mis)using PR tests?)
Just to note that in the end https://github.com/cms-sw/cmssw/pull/43439 kept 11634.911
Hi @makortel I think this issue is solved, should we close it? Thx.
Do we know how the issue got resolved? Or is it just not occurring anymore?
The workflow in question is Run-3, right? Since DD4hep is run by default in Run-3 workflows (.911 = .0 for Run-3), I think we don't see any instabilities anymore. Am I missing a reason why we should keep investigating the Run-3 DD4hep workflow?
From the history the frequency seems to have been one occurrence every 1-4 months (although I suspect not all L2s report those).
Earlier comments suggest that .911 and .0 differ: .911 reads the geometry from XML and .0 from the DB.
Ah, you are right. .911 is the XML version, and .912 (which is now the .0 default) is the DB version. Do we need to monitor XML when we use the DB? I mean, we don't do Run-1, Run-2 XML (DDD) anymore. So, we never know if there is an issue there or not.
We've observed differences in the DD4hep workflow 11634.911 comparisons in tests of a few PRs that should not affect the results of the DD4hep workflow. This issue is to collect pointers to those comparisons.