Open makortel opened 3 days ago
cms-bot internal usage
A new Issue was created by @makortel.
@Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
assign reconstruction, xpog, dqm
New categories assigned: reconstruction,xpog,dqm
@antoniovagnerini,@ftorrresd,@hqucms,@jfernan2,@mandrenguyen,@nothingface0,@rvenditti,@syuvivida,@tjavaid you have been requested to review this Pull request/Issue and eventually sign? Thanks
are you rediscovering that the first piece of code calling to the boost::spirit
brings most of the cost?
- 103 MB via
ObjectSelectorBase<SingleElementCollectionSelector<edm::View<reco::Muon>, ...>
(link)
I believe this corresponds to MuonSelector
https://github.com/cms-sw/cmssw/blob/4635bff7d6bee3aaee115ae16ec4c540ca76d7b0/CommonTools/RecoAlgos/plugins/MuonSelector.cc#L32-L35
The configuration defines 25 of them, of which 14 are not constructed in the C++ (i.e. these modules are not part of any Task/Sequence/Path/EndPath that is associated with the Schedule)
ALCARECOSiPixelCalSingleMuonTightGoodMuons
ALCARECOSiPixelCalSingleMuonTightRelCombIsoMuons
ALCARECOTkAlDiMuonGoodMuons
ALCARECOTkAlDiMuonRelCombIsoMuons
ALCARECOTkAlJpsiMuMuGoodMuons (not constructed)
ALCARECOTkAlMuonIsolatedGoodMuons
ALCARECOTkAlMuonIsolatedRelCombIsoMuons
ALCARECOTkAlUpsilonMuMuGoodMuons (not constructed)
ALCARECOTkAlUpsilonMuMuPAGoodMuons (not constructed)
ALCARECOTkAlUpsilonMuMuRelCombIsoMuons (not constructed)
ALCARECOTkAlZMuMuGoodMuons
ALCARECOTkAlZMuMuPAGoodMuons (not constructed)
ALCARECOTkAlZMuMuRelCombIsoMuons
ZmmgLeadingMuons (not constructed)
ZmmgTrailingMuons (not constructed)
gemDQMStaMuons
gemDQMTightGlbMuons
muonSelection (not constructed)
muonSelectorForEMu (not constructed)
muonSelectorForZMM (not constructed)
muonSelectorForZMMPA (not constructed)
muonsPt10
selectedMuons (not constructed)
selectedMuonsIso (not constructed)
tightMuonsForPbPbZMuSkim (not constructed)
Looking next the actual cut strings in the 11 modules that are constructed
ALCARECOSiPixelCalSingleMuonTightGoodMuons |
isGlobalMuon &isTrackerMuon &numberOfMatches > 1 &globalTrack.hitPattern.numberOfValidMuonHits > 0 &abs(eta) < 2.5 &globalTrack.normalizedChi2 < 20. |
ALCARECOSiPixelCalSingleMuonTightRelCombIsoMuons |
(isolationR03().sumPt + isolationR03().emEt + isolationR03().hadEt)/pt < 0.15 |
ALCARECOTkAlDiMuonGoodMuons |
isGlobalMuon &isTrackerMuon &numberOfMatches > 1 &globalTrack.hitPattern.numberOfValidMuonHits > 0 &abs(eta) < 2.5 &globalTrack.normalizedChi2 < 20. |
ALCARECOTkAlDiMuonRelCombIsoMuons |
(isolationR03().sumPt + isolationR03().emEt + isolationR03().hadEt)/pt < 0.15 |
ALCARECOTkAlMuonIsolatedGoodMuons |
isGlobalMuon &isTrackerMuon &numberOfMatches > 1 &globalTrack.hitPattern.numberOfValidMuonHits > 0 &abs(eta) < 2.5 &globalTrack.normalizedChi2 < 20. |
ALCARECOTkAlMuonIsolatedRelCombIsoMuons |
(isolationR03().sumPt + isolationR03().emEt + isolationR03().hadEt)/pt < 0.15 |
ALCARECOTkAlZMuMuGoodMuons |
isGlobalMuon &isTrackerMuon &numberOfMatches > 1 &globalTrack.hitPattern.numberOfValidMuonHits > 0 &abs(eta) < 2.5 &globalTrack.normalizedChi2 < 20. |
ALCARECOTkAlZMuMuRelCombIsoMuons |
(isolationR03().sumPt + isolationR03().emEt + isolationR03().hadEt)/pt < 0.15 |
gemDQMStaMuons |
isStandAloneMuon&& outerTrack.isNonnull |
gemDQMTightGlbMuons |
isGlobalMuon&& globalTrack.isNonnull&& passed(\'CutBasedIdTight\') |
muonsPt10 |
isGlobalMuon &isTrackerMuon &numberOfMatches > 1 &globalTrack.hitPattern.numberOfValidMuonHits > 0 &abs(eta) < 2.5 &pt > 10 |
There is a lot of repetition: one group of 4 modules and another group of 4 modules have exactly the same cut string.
Would it be feasible to replace these with modules that express the structure of the selection in C++ code instead?
are you rediscovering that the first piece of code calling to the
boost::spirit
brings most of the cost?
Possibly. I had a vague recollection of such "constant cost" of this infrastructure, but it's source wasn't evident from the profile (maybe I just didn't dig deeply enough into Cling's call stack).
I'd guess the StringObjectFunction<l1t::EGamma, false>
in BXVectorSimpleFlatTableProducer<l1t::EGamma>
would be a good candidate for that "constant cost" as it takes 90 MB, while StringCutObjectSelector<l1t::EGamma, false>
in the same module takes 448 B.
There is nevertheless potential for ~200 MB reduction by addressing the "next largest" consumers.
assign alca
New categories assigned: alca
@atpathak,@consuegs,@perrotta you have been requested to review this Pull request/Issue and eventually sign? Thanks
@cms-sw/alca-l2 By the way, of the 8 ALCARECO*
MuonSelector
s listed above,
ALCARECOSiPixelCalSingleMuonTightGoodMuons
, ALCARECOTkAlDiMuonGoodMuons
, ALCARECOTkAlMuonIsolatedGoodMuons
, ALCARECOTkAlZMuMuGoodMuons
have identical configurationALCARECOSiPixelCalSingleMuonTightRelCombIsoMuons
, ALCARECOTkAlDiMuonRelCombIsoMuons
, ALCARECOTkAlMuonIsolatedRelCombIsoMuons
, ALCARECOTkAlZMuMuRelCombIsoMuons
have otherwise the same configuration, except their src
parameter refer to the aforementioned 4 modules (i.e. if they would replaced with a common module, these 4 could be replaced with a common module as well)This duplication has both memory and runtime cost.
- 93 MB via
ObjectSelectorBase<SingleElementCollectionSelector<std::vector<pat::GenericParticle, ...>
(link)
I believe this one corresponds to PATGenericParticleSelector
https://github.com/cms-sw/cmssw/blob/55fd97727507dd74a4020c9264f5f66118409151/PhysicsTools/PatAlgos/plugins/PATObjectSelector.cc#L249-L250
The configuration contains 2 such modules, of 1 is not constructed because it is not in any Task/Sequence/Path/EndPath associated to the Schedule
looseIsoMuonsForPbPbZMuSkim (not constructed)
looseIsoMuonsForZMuSkim
The cut string is same in both (userIsolation(\'pat::TrackIso\')/pt)<0.4
that looks straightforward to be potentially replaced with a C++-coded selection structure.
Although if this is impacted with the "first call to boost::spirit has a constant cost", it might not be that important (but looks simple nevertheless).
assign DPGAnalysis/Skims
type performance-improvements
New categories assigned: pdmv
@AdrianoDee,@kskovpen,@miquork,@sunilUIET you have been requested to review this Pull request/Issue and eventually sign? Thanks
- 90 MB via
BXVectorSimpleFlatTableProducer<l1t::EGamma>
(link)
This is SimpleTriggerL1EGFlatTableProducer
https://github.com/cms-sw/cmssw/blob/55fd97727507dd74a4020c9264f5f66118409151/PhysicsTools/NanoAOD/plugins/SimpleL1TFlatTableProducerPlugins.cc#L4
that uses
cut = cms.string('pt>=10'),
...
expr = cms.string('eta'),
...
expr = cms.string('hwIso()'),
...
expr = cms.string('phi'),
...
expr = cms.string('pt'),
that look like much simpler mapping from strings to member functions than full expression parser could be feasible.
- 26 MB via
SimpleFlatTableProducerBase<reco::BeamSpot, reco::BeamSpot>
(link)
I believe this is SimpleBeamspotFlatTableProducer
https://github.com/cms-sw/cmssw/blob/55fd97727507dd74a4020c9264f5f66118409151/PhysicsTools/NanoAOD/plugins/SimpleFlatTableProducerPlugins.cc#L49
The configuration has 2 modules of this type, of which only 1 is actually used in the job
simpleBeamspotFlatTableProducer (not constructed)
beamSpotTable
The beamSpotTable
accesses
expr = cms.string('sigmaZ()'),
...
expr = cms.string('sigmaZ0Error()'),
...
expr = cms.string('type()'),
...
expr = cms.string('position().z()'),
...
expr = cms.string('z0Error()'),
that also looks like a much simpler mapping from strings to member functions than the full expression parser could be feasible.
From https://github.com/cms-sw/cmssw/issues/46040#issuecomment-2420368633 the cut/expression parser uses 390 MB (constant cost, i.e. gets amortized across threads). The top 10 modules with the largest contributions are
ObjectSelectorBase<SingleElementCollectionSelector<edm::View<reco::Muon>, ...>
(link)ObjectSelectorBase<SingleElementCollectionSelector<std::vector<pat::GenericParticle, ...>
(link)BXVectorSimpleFlatTableProducer<l1t::EGamma>
(link)SimpleFlatTableProducerBase<reco::BeamSpot, reco::BeamSpot>
(link)TopSingleLeptonDQM
(linkTopMonitor
(link)VersionedIdProducer<edm::Ptr<reco::Photon>
(link)RecoTauPiZeroProducer
(link)SimpleFlatTableProducer<pat::Jet>
(link)SimpleFlatTableProducer<pat::Electron>
(link)Addressing already the top 3 would save about 286 MB.