cms-sw / cmssw

CMS Offline Software
http://cms-sw.github.io/
Apache License 2.0

Problem in dumping Fireworks Geometry #43097

Open waredjeb opened 1 year ago

waredjeb commented 1 year ago

I am having some problems in dumping a new Fireworks Geometry.

Release = CMSSW_13_3_0_pre3

Command: cmsRun dumpRecoGeometry_cfg.py tag=2026 version=D98

Error:

[wredjeb@patatrack02 python]$ cmsRun dumpRecoGeometry_cfg.py tag=2026 version=D98
Loading configuration for tag  2026 ...

Dumping geometry in  cmsRecoGeom-2026.root 

Begin processing the 1st record. Run 1, Event 1, LumiSection 1 on stream 0 at 24-Oct-2023 10:23:22.093 CEST
----- Begin Fatal Exception 24-Oct-2023 10:24:23 CEST-----------------------
An exception of category 'GeometryMismatch' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 0
   [1] Running path 'p'
   [2] Prefetching for module DumpFWRecoGeometry/'dump'
   [3] Prefetching for EventSetup module FWRecoGeometryESProducer/''
   [4] Prefetching for EventSetup module GlobalTrackingGeometryESProducer/''
   [5] Calling method for EventSetup module TrackerDigiGeometryESModule/'trackerGeometry'
Exception Message:
Size mismatch between geometry (size=43492) and alignments (size=19876)
----- End Fatal Exception -------------------------------------------------

I am seeing the same thing in CMSSW_13_2_0_pre3
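
For readers unfamiliar with this exception, the check that fires can be pictured roughly as follows. This is a simplified Python sketch, not the actual CMSSW C++ code: `check_alignment_size` and the `GeometryMismatch` class are hypothetical stand-ins, and only the two sizes are taken from the log above.

```python
class GeometryMismatch(Exception):
    """Hypothetical stand-in for the CMSSW exception category of the same name."""


def check_alignment_size(n_geometry: int, n_alignments: int) -> None:
    """Raise if the alignment payload does not match the geometry in size."""
    if n_geometry != n_alignments:
        raise GeometryMismatch(
            f"Size mismatch between geometry (size={n_geometry}) "
            f"and alignments (size={n_alignments})"
        )


# Sizes reported in the log above: the geometry that was built has more
# elements than the alignment payload delivered through the GlobalTag.
try:
    check_alignment_size(43492, 19876)
    message = ""
except GeometryMismatch as err:
    message = str(err)

print(message)
```

The point of the sketch is only that the two numbers come from different sources (the geometry description and the conditions database), so they disagree whenever the GlobalTag was prepared for a different tracker layout.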

cmsbuild commented 1 year ago

A new Issue was created by @waredjeb Wahid Redjeb.

@Dr15Jones, @rappoccio, @sextonkennedy, @smuzaffar, @makortel, @antoniovilela can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

felicepantaleo commented 1 year ago

@bsunanda fyi

mmusich commented 1 year ago

Certainly the GlobalTag here

https://github.com/cms-sw/cmssw/blob/bca672bc5e1a954fbd1aaa8ffdabafb64a51463a/Fireworks/Geometry/python/dumpRecoGeometry_cfg.py#L78-L83

does not look right.

makortel commented 1 year ago

assign geometry, visualization

cmsbuild commented 1 year ago

New categories assigned: geometry,visualization

@Dr15Jones,@Dr15Jones,@alja,@civanch,@bsunanda,@makortel,@makortel,@mdhildreth you have been requested to review this Pull request/Issue and eventually sign? Thanks

antoniovilela commented 1 year ago

Marco's comment, to be followed up: https://github.com/cms-sw/cmssw/pull/43098#issuecomment-1777091606

waredjeb commented 5 months ago

Hello! I am reviving this issue because I am still having problems dumping the geometry.

Release = CMSSW_14_1_0_pre4

Command: cmsRun dumpRecoGeometry_cfg.py tag=2026 version=D110

[wredjeb@olivb-27 python]$ cmsRun dumpRecoGeometry_cfg.py tag=2026 version=D110                                                                              
Loading configuration for tag  2026 ...

Dumping geometry in  cmsRecoGeom-2026.root 

Begin processing the 1st record. Run 1, Event 1, LumiSection 1 on stream 0 at 20-Jun-2024 09:45:40.738 CEST
----- Begin Fatal Exception 20-Jun-2024 09:46:08 CEST-----------------------
An exception of category 'GeometryMismatch' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 0
   [1] Running path 'p'
   [2] Prefetching for module DumpFWRecoGeometry/'dump'
   [3] Prefetching for EventSetup module FWRecoGeometryESProducer/''
   [4] Prefetching for EventSetup module GlobalTrackingGeometryESProducer/''
   [5] Calling method for EventSetup module TrackerDigiGeometryESModule/'trackerGeometry'
Exception Message:
Size mismatch between geometry (size=43708) and alignments (size=43492)
----- End Fatal Exception -------------------------------------------------

This time, the GlobalTag looks good to me! The problem seems to come from the Tracker; nevertheless, disabling the Tracker (and also MTD) doesn't help.

@rovere FYI

mmusich commented 5 months ago

@waredjeb Hi Wahid,

This time, the GlobalTag looks good to me!

unfortunately it's not.

D110 = T35+C18+M11+I17+O9+F8

so you need auto:phase2_realistic_T33 (don't ask ... T35 ~ T33) because the alignment in auto:phase2_realistic is still the one for T25:

$ getAutoCond -k phase2_realistic
theRelease: CMSSW_14_1_0_pre4
phase2_realistic : 140X_mcRun4_realistic_v3

and

$ conddb list 140X_mcRun4_realistic_v3 | grep TrackerAlignment
[2024-06-20 10:20:55,856] INFO: Connecting to pro [frontier://PromptProd/cms_conditions]
TrackerAlignmentErrorExtendedRcd                        -                                                                  TrackerAlignmentErrorsExtended_Upgrade2026_T25_design_v0                    
TrackerAlignmentRcd                                     -                                                                  TrackerAlignment_Upgrade2026_T25_design_v0           

@cms-sw/alca-l2 @cms-sw/upgrade-l2 FYI.
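
A minimal sketch of how the mismatch above could be spotted programmatically, assuming the tracker version is always embedded in the tag name as `_T<number>_` (an assumption based only on the two tag names listed here). The helper names `parse_composition` and `tracker_version_in_tag` are hypothetical and not part of CMSSW.

```python
import re


def parse_composition(spec: str) -> dict:
    """Split a geometry composition such as 'T35+C18' into {'T': 35, 'C': 18}."""
    parts = {}
    for token in spec.split("+"):
        match = re.fullmatch(r"([A-Z])(\d+)", token.strip())
        if match is None:
            raise ValueError(f"Unexpected component: {token!r}")
        parts[match.group(1)] = int(match.group(2))
    return parts


def tracker_version_in_tag(tag: str):
    """Return the tracker version number embedded in a tag name, or None."""
    match = re.search(r"_T(\d+)_", tag)
    return int(match.group(1)) if match else None


d110 = parse_composition("T35+C18+M11+I17+O9+F8")
alignment_tag = "TrackerAlignment_Upgrade2026_T25_design_v0"

# Tracker version in the geometry versus the one the alignment was made for.
print(d110["T"], tracker_version_in_tag(alignment_tag))
```

Note that a plain equality test would not be the right criterion: as stated above, T35 is served by the T33 conditions, so any real check would need a table of compatible versions.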

waredjeb commented 5 months ago

Hi Marco,

I see, thank you very much! Replacing https://github.com/cms-sw/cmssw/blob/master/Fireworks/Geometry/python/dumpRecoGeometry_cfg.py#L81-L82 with

from Configuration.AlCa.GlobalTag import GlobalTag
process.GlobalTag = GlobalTag(process.GlobalTag, 'auto:phase2_realistic_T33', '')

seems to work, though I am not sure whether this exception message is concerning:

[wredjeb@olivb-27 python]$ cmsRun dumpRecoGeometry_cfg.py tag=2026 version=D110                                                                                                                                                                             
Loading configuration for tag  2026 ...

Dumping geometry in  cmsRecoGeom-2026.root 

Begin processing the 1st record. Run 1, Event 1, LumiSection 1 on stream 0 at 20-Jun-2024 10:59:50.808 CEST
%MSG-e FWRecoGeometry:   FWRecoGeometryESProducer:FWRecoGeometryESProducer@callESModule  20-Jun-2024 11:00:18 CEST Run: 1 Event: 1
 ME0 geometry not found An exception of category 'NoGeometry' occurred.
Exception Message:
No Tracking Geometry is available for DetId 704643072

%MSG

mmusich commented 5 months ago

Seems to work, not sure if this Exception Message is concerning:

I don't know either, but it seems to concern ME0, i.e. the Muon system (not sure who needs to be tagged about it).

waredjeb commented 5 months ago

Ok, including the Tracker raises another problem:

[wredjeb@olivb-27 python]$ cmsRun dumpRecoGeometry_cfg.py tag=2026 version=D110  tracker=True                                       
Loading configuration for tag  2026 ...

Dumping geometry in  cmsRecoGeom-2026.root 

Begin processing the 1st record. Run 1, Event 1, LumiSection 1 on stream 0 at 20-Jun-2024 11:13:22.083 CEST
----- Begin Fatal Exception 20-Jun-2024 11:13:49 CEST-----------------------
An exception of category 'WrongTrackerSubDet' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 0
   [1] Running path 'p'
   [2] Prefetching for module DumpFWRecoGeometry/'dump'
   [3] Calling method for EventSetup module FWRecoGeometryESProducer/''
Exception Message:
Invalid DetID: no GeomDetUnit associated with raw ID 303042564 of subdet ID 1
----- End Fatal Exception -------------------------------------------------

mmusich commented 5 months ago

Invalid DetID: no GeomDetUnit associated with raw ID 303042564 of subdet ID 1

this looks more concerning and might have to do with the sensor splitting in TBPix layer 1 (subdet ID 1). @emiglior @cms-sw/trk-dpg-l2
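
As a cross-check of the two raw IDs quoted in this thread, here is a small sketch that decodes them, assuming the usual DetId packing (detector in bits 28-31, subdetector in bits 25-27). The lookup tables are deliberately partial and only cover the two cases discussed here; `decode_det_id` is a hypothetical helper, not a CMSSW function.

```python
def decode_det_id(raw_id: int) -> tuple:
    """Return (detector, subdetector) numbers from a raw 32-bit DetId."""
    det = (raw_id >> 28) & 0xF
    subdet = (raw_id >> 25) & 0x7
    return det, subdet


# Partial name tables, limited to the cases that appear in this thread.
DETECTORS = {1: "Tracker", 2: "Muon"}
SUBDETECTORS = {("Tracker", 1): "PixelBarrel", ("Muon", 5): "ME0"}

decoded = {}
for raw_id in (303042564, 704643072):
    det, subdet = decode_det_id(raw_id)
    det_name = DETECTORS.get(det, f"det {det}")
    decoded[raw_id] = (det_name, SUBDETECTORS.get((det_name, subdet), subdet))
    print(raw_id, *decoded[raw_id])
```

Under that assumption the first ID lands in the pixel barrel, consistent with "subdet ID 1" in the exception, and the second one in ME0, consistent with the earlier "ME0 geometry not found" message.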

perrotta commented 5 months ago

Curious, the GT for the T33 tracker geometry has been ready for some time, and I was convinced it had already been ported to autoCond. Evidently, it had not: thank you @mmusich for noticing it.

Now PR https://github.com/cms-sw/cmssw/pull/45274 updates it in autoCond

mmusich commented 5 months ago

so this:

diff --git a/Fireworks/Geometry/python/dumpRecoGeometry_cfg.py b/Fireworks/Geometry/python/dumpRecoGeometry_cfg.py
index 73c80aa320d..f975d372dd5 100644
--- a/Fireworks/Geometry/python/dumpRecoGeometry_cfg.py
+++ b/Fireworks/Geometry/python/dumpRecoGeometry_cfg.py
@@ -43,7 +43,7 @@ def versionCheck(ver):
       print("")
       help()

-def recoGeoLoad(score):
+def recoGeoLoad(score, properties):
     print("Loading configuration for tag ", options.tag ,"...\n")

     if score == "Run1":
@@ -78,8 +78,29 @@ def recoGeoLoad(score):
     elif "2026" in score:
        versionCheck(options.version)
        process.load("Configuration.StandardSequences.FrontierConditions_GlobalTag_cff")
-       from Configuration.AlCa.autoCond import autoCond
-       process.GlobalTag.globaltag = autoCond['phase2_realistic']
+
+       # Import the required configuration from the CMSSW module
+       from Configuration.AlCa.autoCond import autoCond  # Ensure autoCond is imported
+
+       # Ensure options.version is defined and set correctly
+       version_key = '2026' + options.version  # This constructs the key for accessing the properties dictionary
+       print(f"Constructed version key: {version_key}")
+
+       # Check if the key exists in properties for 2026
+       if version_key in properties[2026]:
+          # Get the specific global tag for this version
+          global_tag_key = properties[2026][version_key]['GT']
+          print(f"Global tag key from properties: {global_tag_key}")
+
+          # Check if this key exists in autoCond
+          if global_tag_key.replace("auto:", "") in autoCond:
+             # Set the global tag
+             from Configuration.AlCa.GlobalTag import GlobalTag
+             process.GlobalTag = GlobalTag(process.GlobalTag, global_tag_key, '')
+          else:
+             raise KeyError(f"Global tag key '{global_tag_key}' not found in autoCond.")
+       else:
+          raise KeyError(f"Version key '{version_key}' not found in properties[2026].")
        process.load('Configuration.Geometry.GeometryExtended2026'+options.version+'Reco_cff')

     elif score == "MaPSA":
@@ -114,11 +135,8 @@ def recoGeoLoad(score):
       help()

-
-
 options = VarParsing.VarParsing ()

-
 defaultOutputFileName="cmsRecoGeom.root"

 options.register ('tag',
@@ -171,16 +189,33 @@ options.register ('out',

 options.parseArguments()

-
-
-
-process = cms.Process("DUMP")
+from Configuration.PyReleaseValidation.upgradeWorkflowComponents import upgradeProperties as properties
+# Determine version_key based on the value of options.tag
+if options.tag == "2026" or options.tag == "MaPSA":
+   prop_key = 2026
+   version_key = options.tag + options.version
+elif options.tag == "2017" or options.tag == "2021":
+   prop_key = 2017
+   version_key = options.tag
+else:
+   prop_key = None
+   version_key = None
+
+if(prop_key and version_key):
+   print(f"Constructed version key: {version_key}")
+   era_key = properties[prop_key][str(version_key)]['Era']
+   print(f"Constructed era key: {era_key}")
+   from Configuration.StandardSequences.Eras import eras
+   era = getattr(eras, era_key)
+   process = cms.Process("DUMP",era)
+else:
+   process = cms.Process("DUMP")
 process.add_(cms.Service("InitRootHandlers", ResetRootErrHandler = cms.untracked.bool(False)))
 process.source = cms.Source("EmptySource")
 process.maxEvents = cms.untracked.PSet(input = cms.untracked.int32(1))

-recoGeoLoad(options.tag)
+recoGeoLoad(options.tag,properties)

 if ( options.tgeo == True):
     if (options.out == defaultOutputFileName )

allows for a slightly better automatic treatment of GlobalTags and eras in the configuration. Unfortunately it solves neither the warnings nor the crash for Tracker in D110.
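
The lookup chain in the diff (version key, then GlobalTag key, then a check against autoCond) can be tried outside CMSSW with stand-in dictionaries. This is only a sketch: `resolve_global_tag_key` is a hypothetical helper, the two dictionaries mimic the shape of `upgradeProperties` and `autoCond` but are not the real tables, and the value for `phase2_realistic_T33` is a placeholder.

```python
def resolve_global_tag_key(prop_key, version_key, properties, auto_cond):
    """Return the 'auto:...' key for a geometry version, validating both tables."""
    per_year = properties.get(prop_key, {})
    if version_key not in per_year:
        raise KeyError(f"Version key '{version_key}' not found in properties[{prop_key}].")
    gt_key = per_year[version_key]["GT"]
    # autoCond is keyed without the 'auto:' prefix, as in the diff above.
    if gt_key.replace("auto:", "") not in auto_cond:
        raise KeyError(f"Global tag key '{gt_key}' not found in autoCond.")
    return gt_key


# Stand-in tables (illustrative only; not copied from CMSSW).
properties = {2026: {"2026D110": {"GT": "auto:phase2_realistic_T33"}}}
auto_cond = {
    "phase2_realistic": "140X_mcRun4_realistic_v3",
    "phase2_realistic_T33": "PLACEHOLDER_T33_GLOBALTAG",
}

gt_key = resolve_global_tag_key(2026, "2026D110", properties, auto_cond)
print(gt_key)
```

Failing early with a `KeyError` when either table lacks the entry avoids silently falling back to a GlobalTag prepared for a different tracker layout.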

mmusich commented 5 months ago

Unfortunately it solves neither the warnings nor the crash for Tracker in D110.

concerning the crash in the Tracker part, it looks like it's due to this line:

https://github.com/cms-sw/cmssw/blob/9030bf6e5bc33e617ca03ac40e8586ea8b86abc2/Fireworks/Geometry/src/FWRecoGeometryESProducer.cc#L411

in particular at the very beginning of the DetId loop in BPix (presumably Layer1 but I didn't explicitly cross-check), this call

m_trackerGeom->idToDetUnit(detid)

fails on detid 303042564 while it works on the "other" companion detid 303042565, so it looks like the detUnit mapping is not working correctly in this setup.

felicepantaleo commented 5 months ago

hi, is there any way this dumper can be run as a test automatically for every PR? Or at least for any PR that introduces a new version of the geometry?

mmusich commented 5 months ago

sure there is, this is what I was suggesting: https://github.com/cms-sw/cmssw/pull/43098#issuecomment-1777091606

alja commented 5 months ago

@makortel Do I need to open an issue to add unit tests?

( PS: I'm on vacation starting tomorrow until Tuesday)

makortel commented 5 months ago

I'm not sure I understand your question. Unit tests can be added in just a PR. Having a unit test run in the PR tests of an unrelated package (*) is generally an unresolved problem (for which we might already have an issue open). In any case @smuzaffar should be able to help further.

(*) I didn't check if the package defining the geometries, and the straightforward place for the unit test are related in any way or not (being on vacation myself)

smuzaffar commented 5 months ago

@alja, as @makortel wrote, you can open a PR to add the unit test in one of the existing packages. Running the new unit test for all PRs might be overkill. Is there any way to find out whether a PR is changing geometry? If yes, then we can teach the bot to check out the package which contains this unit test. Or we can also teach the bot to check out the package containing the new unit tests if the PR requires geometry approval (this will allow the new unit test to run for such PRs).

alja commented 5 months ago

@smuzaffar I don't know how a PR in some other package can trigger unit tests in Fireworks/Geometry. Now I'm not even sure whether the reco geometry and sim geometry dumpers are supposed to be in the Fireworks module. I also don't know whether other cmssw modules can cause the Fireworks reco geometry dumper to crash. Should I post the question on the CMS Mattermost?

felicepantaleo commented 5 months ago

As a workaround, we are removing the exception here: https://github.com/cms-sw/cmssw/blob/64d4f3b087ce5832f186bc811757c17e16fb9ddd/Geometry/TrackerGeometryBuilder/src/TrackerGeometry.cc#L183-L191

and skipping the tracker detIds that give nullptr. We will keep this workaround private until @cms-sw/trk-dpg-l2 fixes the map.

mmusich commented 5 months ago

sure there is, this is what I was suggesting: https://github.com/cms-sw/cmssw/pull/43098#issuecomment-1777091606

so I packaged two (would-be) improvements in https://github.com/cms-sw/cmssw/pull/45328:

mmusich commented 3 months ago

type trk

waredjeb commented 3 months ago

Hello! Any news on this?

felicepantaleo commented 3 months ago

@cms-sw/trk-dpg-l2 @alja @bsunanda @cms-sw/upgrade-l2 The workaround we had planned to use does not work. We managed to dump the reco geometry in pre1 for D99 and use it in fireworks in pre6 (the only change wrt D110 is the tracker geometry version). We are observing big problems in calogeometry and we would urgently need fireworks to work again for D110 and other geometries to debug, develop and validate.

Dr15Jones commented 3 months ago

@felicepantaleo can you post exactly how to reproduce the problem? I.e., what command did you issue?

waredjeb commented 3 months ago

Hi @Dr15Jones, with CMSSW_14_1_0_pre6:

[wredjeb@olhsw-21 python]$ cmsRun dumpRecoGeometry_cfg.py tag=2026 version=D110
Loading configuration for tag  2026 ...

Dumping geometry in  cmsRecoGeom-2026.root 

Begin processing the 1st record. Run 1, Event 1, LumiSection 1 on stream 0 at 23-Aug-2024 15:35:52.597 CEST
----- Begin Fatal Exception 23-Aug-2024 15:36:10 CEST-----------------------
An exception of category 'WrongTrackerSubDet' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 0
   [1] Running path 'p'
   [2] Prefetching for module DumpFWRecoGeometry/'dump'
   [3] Calling method for EventSetup module FWRecoGeometryESProducer/''
Exception Message:
Invalid DetID: no GeomDetUnit associated with raw ID 303042564 of subdet ID 1
----- End Fatal Exception -------------------------------------------------
[wredjeb@olhsw-21 python]$ 

And by replacing https://github.com/cms-sw/cmssw/blob/master/Fireworks/Geometry/python/dumpRecoGeometry_cfg.py#L81-L82 with

from Configuration.AlCa.GlobalTag import GlobalTag
process.GlobalTag = GlobalTag(process.GlobalTag, 'auto:phase2_realistic_T33', '')

as suggested in https://github.com/cms-sw/cmssw/issues/43097#issuecomment-2180100033, I get

[wredjeb@olhsw-21 python]$ cmsRun dumpRecoGeometry_cfg.py tag=2026 version=D110
Loading configuration for tag  2026 ...

Dumping geometry in  cmsRecoGeom-2026.root 

Begin processing the 1st record. Run 1, Event 1, LumiSection 1 on stream 0 at 23-Aug-2024 15:37:49.844 CEST
----- Begin Fatal Exception 23-Aug-2024 15:38:08 CEST-----------------------
An exception of category 'WrongTrackerSubDet' occurred while
   [0] Processing  Event run: 1 lumi: 1 event: 1 stream: 0
   [1] Running path 'p'
   [2] Prefetching for module DumpFWRecoGeometry/'dump'
   [3] Calling method for EventSetup module FWRecoGeometryESProducer/''
Exception Message:
Invalid DetID: no GeomDetUnit associated with raw ID 303042564 of subdet ID 1
----- End Fatal Exception -------------------------------------------------

Dr15Jones commented 3 months ago

In principle, I see what caused the problem. The ID 303042564 is added to TrackerGeometry via a call to addDet, not via addDetUnit. The code in FWRecoGeometryESProducer loops over detsPXB

https://github.com/cms-sw/cmssw/blob/dfc9a3cd0e7340aff5dfa94920edd6f738e8c5c5/Fireworks/Geometry/src/FWRecoGeometryESProducer.cc#L399-L400

but then expects those to be available in idToDetUnit

https://github.com/cms-sw/cmssw/blob/dfc9a3cd0e7340aff5dfa94920edd6f738e8c5c5/Fireworks/Geometry/src/FWRecoGeometryESProducer.cc#L411

but the only guarantee is those items are in idToDet.
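
The situation described here can be reproduced with a toy model. This is a Python sketch with two dictionaries standing in for the idToDet and idToDetUnit maps; `ToyTrackerGeometry` and its methods are hypothetical and only loosely mirror the C++ class. The two raw IDs are the ones quoted earlier in this thread, and registering the sensor in both maps is an assumption of the toy.

```python
class ToyTrackerGeometry:
    """Toy stand-in with separate maps for Dets and DetUnits."""

    def __init__(self):
        self.dets = {}       # every detector element, composed or not
        self.det_units = {}  # only the individual sensors
        self.dets_pxb = []   # what a loop over the pixel barrel iterates

    def add_det(self, raw_id, label):
        self.dets[raw_id] = label
        self.dets_pxb.append(raw_id)

    def add_det_unit(self, raw_id, label):
        self.det_units[raw_id] = label

    def id_to_det(self, raw_id):
        return self.dets.get(raw_id)

    def id_to_det_unit(self, raw_id):
        unit = self.det_units.get(raw_id)
        if unit is None:
            raise LookupError(
                f"Invalid DetID: no GeomDetUnit associated with raw ID {raw_id}"
            )
        return unit


geom = ToyTrackerGeometry()
geom.add_det(303042564, "composed module")  # registered as a Det only
geom.add_det(303042565, "sensor")           # in this toy, a sensor is a Det ...
geom.add_det_unit(303042565, "sensor")      # ... and also a DetUnit

# Looping over the Det list but looking up DetUnits fails for the composed entry.
failed = []
for raw_id in geom.dets_pxb:
    try:
        geom.id_to_det_unit(raw_id)
    except LookupError:
        failed.append(raw_id)

print(failed)
```

Every ID in the loop is guaranteed to resolve through `id_to_det`, which is the only guarantee the Det list gives.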

So the question I have is what is the difference between a Det and a DetUnit ? The documentation of the code provides no explanation.

Also, is id 303042564 expected to be a DetUnit ?

Dr15Jones commented 3 months ago

So it looks like ID 303042564 refers to a 'composed' detector component that associates two GeomDetUnits, which is why it has no GeomDetUnit of its own. See

https://github.com/cms-sw/cmssw/blob/dfc9a3cd0e7340aff5dfa94920edd6f738e8c5c5/Geometry/TrackerGeometryBuilder/src/TrackerGeomBuilderFromGeometricDet.cc#L301-L322

mmusich commented 3 months ago

So the question I have is what is the difference between a Det and a DetUnit ? The documentation of the code provides no explanation.

I am no geometry expert, but from the Tracker point of view a Det should be the representation of a whole module (so it will matter e.g. for powering / readout), while the DetUnits are the individual (mechanically separate) sensors living within the same module (so it will matter e.g. for spatial alignment). If I understand correctly the problem is that in geometry D110 (Tracker T35, inheriting the geometrical hierarchy from T33) we have a new layout for the 3D modules in TBPX, layer 1, which consist of two separate sensors with a gap between them. This was introduced in https://github.com/cms-sw/cmssw/pull/41880. Thus I would expect that for this type of module we have one Det and two associated DetUnits. Notice that this is the same kind of logic that we have currently in the Phase-1 SiStrip Detector in the outer barrel.

mmusich commented 3 months ago

Let me also note that this issue seems to be linked to another issue I spotted at https://github.com/cms-sw/cmssw/pull/45764#discussion_r1724853973, when trying to migrate the tracker misalignment tools to use this geometry D110. The AlignableTrackerBuilder complains that it finds in the hierarchy a Pixel GeomDet which is not also a GeomDetUnit (and thus an "alignable"). Something looks off in the way in which the list of GeomDets is built for this particular tracker layout.

Dr15Jones commented 3 months ago

See #45784 for my attempt to fix the issue. The code change skips over the composing detector components.
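
The idea of skipping composed components can be illustrated with a toy loop. This is a Python sketch and not the code of #45784: `ToyDet`, `is_leaf` and `leaf_ids` are hypothetical names, and only one of the two sensors of the module is shown because the thread gives just one companion ID.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDet:
    """Toy detector element; a composed element lists its sensors as components."""
    raw_id: int
    components: list = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        # An element without components is an individual sensor.
        return not self.components


def leaf_ids(dets):
    """Keep only the elements that can be looked up as individual sensors."""
    return [det.raw_id for det in dets if det.is_leaf]


sensor = ToyDet(303042565)
module = ToyDet(303042564, components=[sensor])

print(leaf_ids([module, sensor]))
```

With this filter the dumper would never ask for a DetUnit belonging to the composed entry, at the price of not reporting the module-level element at all.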

osschar commented 3 months ago

We used the Fireworks dumper as a base for MkFit geometry dumper ... but there we ended up getting the tracker geometry from the TrackerDigiGeometryRecord -- and this one never seemed to have this problem. I was trying to look into this problem twice over the last two months (as HGCal folks told me this blocks their usage of D110 geometry in Fireworks) but the complexity of the setup and all the dependencies made it too hard for me to come to any meaningful conclusions.

The double/glued modules in Phase1 have separate detids ... as the space-point hits made of the two strips of the glued modules use it. One tests this through TrackerTopology on det-subid, through a sub-det specific if-else-if: https://github.com/cms-sw/cmssw/blob/master/RecoTracker/MkFit/plugins/MkFitGeometryESProducer.cc#L198-L208

In mkFit we drop those, as we always work with strips only.

In Fireworks we might (I'm not sure if we actually do) get products that reference those double-sided space-points.

I still do not quite know what to think about it all ... I hope I'm not just confusing everyone :)

osschar commented 3 months ago

And thank you all for diving into this!

alja commented 3 months ago

@Dr15Jones

Thank you for debugging and correcting!

#45784 has the correction only for the pixel barrel. I don't understand why the 'leaf' restriction applies only to this detector. Is it possible there is a bug in the pixel barrel geometry definition?

mmusich commented 2 months ago

https://github.com/cms-sw/cmssw/pull/45993 provides a better fix.

srimanob commented 2 months ago

Thanks @mmusich for pointing to the PR. Do we still need the protection in https://github.com/cms-sw/cmssw/pull/45784 or we should revert? So that when we have an issue, we will know?

Thx.

mmusich commented 2 months ago

we should revert? So that when we have an issue, we will know

it is already reverted in https://github.com/cms-sw/cmssw/pull/45993

srimanob commented 1 month ago

Can this issue be closed?