Closed mmusich closed 2 years ago
assign dqm
New categories assigned: dqm
@jfernan2,@ahmad3213,@micsucmed,@rvenditti,@emanueleusai,@pmandrik you have been requested to review this Pull request/Issue and eventually sign? Thanks
A new Issue was created by @mmusich Marco Musich.
@Dr15Jones, @perrotta, @dpiparo, @rappoccio, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
@cms-sw/alca-l2 @cms-sw/hlt-l2 I am copying you as the more substantive recent changes are on your area.
Actually also https://github.com/cms-sw/cmssw/pull/38917 could be the culprit: @Dr15Jones
The log of 11634.911 step 2 has many
%MSG-w L1GtUtils: L1TRawToDigi:gtStage2Digis@ctor 09-Aug-2022 14:22:40 CEST pre-events
Found preferred tag InputTag: label = gtStage2Digis, instance = , process = RECO
after having set unpreferred tag (InputTag: label = valGtStage2Digis, instance = , process = RECO) for GlobalAlgBlkBxCollection.
Please change configuration to explicitly use the tag given above.
This will avoid unnecessary prefetching of data not used.
%MSG
and
%MSG-e L1GtUtils: AfterSourceConstruction 09-Aug-2022 14:23:13 CEST pre-events
Found multiple preferred input tags for GlobalAlgBlkBxCollection product,
with different instaces or processes.
Tag already found: InputTag: label = gtStage2Digis, instance = , process = RECO
Other tag: InputTag: label = hltGtStage2Digis, instance = , process = HLT
Token set to invalid.
%MSG
(GlobalAlgBlkBxCollection
is a type alias for BXVector<GlobalAlgBlk>
). Unfortunately these messages don't tell anything about the modules that are using L1TGlobalUtilsHelper
(we are thinking ways to improve the message decorations or something for that).
It seems to me that there are many modules that have been configured incorrect for a long time, and now https://github.com/cms-sw/cmssw/pull/38917 (and https://github.com/cms-sw/cmssw/pull/38900) reveal those, BPHMonitor
likely being one of them.
Adding also @cms-sw/l1-l2
The warning about Found multiple preferred input tags
is most likely the origin of the problem. I took another look at the code before my change. Previously, when multiple preferred input tags
were found the code would issue a LogDebug message, unset the edm::InputTag
it has previously set but NOT reset the EDGetTokenT
it had set. So the code would get which ever data it happened to be handed first, which is NOT a good behavior.
So I changed the Error message to an exception and added context about the module and now I see
----- Begin Fatal Exception 09-Aug-2022 14:42:17 CDT-----------------------
An exception of category 'L1GtUtils::TooManyChoices' occurred while
[0] Constructing the EventProcessor
[1] Running 'callWhenNewProductRegistered' for module Dimuon25_Jpsi_tnp
Exception Message:
Found multiple preferred input tags for GlobalAlgBlkBxCollection product,
with different instaces or processes.
Tag already found: InputTag: label = gtStage2Digis, instance = , process = RECO
Other tag: InputTag: label = hltGtStage2Digis, instance = , process = HLT
----- End Fatal Exception -------------------------------------------------
Looking at the configuration I find
>>> print(process.Dimuon25_Jpsi_tnp.dumpPython())
cms.EDProducer("BPHMonitor",
DMSelection_ref = cms.string('Pt>4 & abs(eta)'),
FolderName = cms.string('HLT/BPH/DiMu25_Jpsi_noCorr/'),
Jpsi = cms.int32(0),
L3 = cms.int32(0),
Upsilon = cms.int32(0),
beamSpot = cms.InputTag("offlineBeamSpot"),
denGenericTriggerEventPSet = cms.PSet(
ReadPrescalesFromFile = cms.bool(True),
andOr = cms.bool(False),
andOrDcs = cms.bool(False),
andOrHlt = cms.bool(True),
andOrL1 = cms.bool(False),
dbLabel = cms.string(''),
dcsInputTag = cms.InputTag("scalersRawToDigi"),
dcsPartitions = cms.vint32(),
dcsRecordInputTag = cms.InputTag("onlineMetaDataDigis"),
errorReplyDcs = cms.bool(True),
errorReplyHlt = cms.bool(False),
errorReplyL1 = cms.bool(False),
hltDBKey = cms.string(''),
hltInputTag = cms.InputTag("TriggerResults","","HLT"),
hltPaths = cms.vstring('HLT_Mu7p5_Track2_Jpsi_v*'),
l1Algorithms = cms.vstring(),
l1BeforeMask = cms.bool(False),
l1tAlgBlkInputTag = cms.InputTag("gtStage2Digis"),
l1tExtBlkInputTag = cms.InputTag("gtStage2Digis"),
stage2 = cms.bool(True),
verbosityLevel = cms.uint32(0)
),
displaced = cms.int32(0),
enum = cms.int32(1),
histoPSet = cms.PSet(
BmassPSet = cms.PSet(
nbins = cms.uint32(20),
xmax = cms.double(5.5),
xmin = cms.double(5.1)
),
cosPSet = cms.PSet(
nbins = cms.uint32(10),
xmax = cms.double(1),
xmin = cms.double(0.9)
),
d0PSet = cms.PSet(
nbins = cms.uint32(50),
xmax = cms.double(5),
xmin = cms.double(-5.0)
),
dMuPtBinning = cms.vdouble(
6, 8, 12, 16, 20,
25, 30, 35, 40, 50,
70
),
dRPSet = cms.PSet(
nbins = cms.uint32(26),
xmax = cms.double(1.3),
xmin = cms.double(0)
),
dcaPSet = cms.PSet(
nbins = cms.uint32(10),
xmax = cms.double(0.5),
xmin = cms.double(0)
),
dsPSet = cms.PSet(
nbins = cms.uint32(15),
xmax = cms.double(60),
xmin = cms.double(0)
),
etaPSet = cms.PSet(
nbins = cms.uint32(12),
xmax = cms.double(2.4),
xmin = cms.double(-2.4)
),
massPSet = cms.PSet(
nbins = cms.uint32(140),
xmax = cms.double(7.0),
xmin = cms.double(0)
),
phiPSet = cms.PSet(
nbins = cms.uint32(8),
xmax = cms.double(3.2),
xmin = cms.double(-3.2)
),
probBinning = cms.vdouble(
0.01, 0.02, 0.04, 0.06, 0.08,
0.1, 0.2, 0.3, 0.4, 0.5,
0.6, 0.7, 0.8, 0.9, 1.0
),
ptBinning = cms.vdouble(
-0.5, 0, 2, 4, 8,
10, 12, 16, 20, 25,
30, 35, 40, 50
),
z0PSet = cms.PSet(
nbins = cms.uint32(60),
xmax = cms.double(15),
xmin = cms.double(-15)
)
),
hltTriggerSummaryAOD = cms.InputTag("hltTriggerSummaryAOD","","HLT"),
kaon_mass = cms.double(0.493677),
max_dR = cms.double(1.4),
maxmass = cms.double(3.596),
maxmassJpsi = cms.double(3.2),
maxmassJpsiTk = cms.double(5.46),
maxmassTkTk = cms.double(10),
maxmassUpsilon = cms.double(10),
mightGet = cms.optional.untracked.vstring,
minDS = cms.double(3),
min_dR = cms.double(0.001),
mincos = cms.double(0.95),
minmass = cms.double(2.596),
minmassJpsi = cms.double(3),
minmassJpsiTk = cms.double(5.1),
minmassTkTk = cms.double(0),
minmassUpsilon = cms.double(8.8),
minprob = cms.double(0.005),
mu_mass = cms.double(0.1056583745),
muoSelection = cms.string('abs(eta)<2.4 & isPFMuon & isGlobalMuon & innerTrack.hitPattern.trackerLayersWithMeasurement>5 & innerTrack.hitPattern.numberOfValidPixelHits>0 '),
muoSelection_probe = cms.string('isPFMuon & isGlobalMuon & innerTrack.hitPattern.trackerLayersWithMeasurement>5 & innerTrack.hitPattern.numberOfValidPixelHits> 0'),
muoSelection_ref = cms.string('pt>7.5 && abs(eta)<2.4 & isPFMuon & isGlobalMuon & innerTrack.hitPattern.trackerLayersWithMeasurement>5 & innerTrack.hitPattern.numberOfValidPixelHits>0 '),
muoSelection_tag = cms.string('isGlobalMuon && isPFMuon && isTrackerMuon && abs(eta) < 2.4 && innerTrack.hitPattern.numberOfValidPixelHits > 0 && innerTrack.hitPattern.trackerLayersWithMeasurement > 5 && globalTrack.hitPattern.numberOfValidMuonHits > 0 && globalTrack.normalizedChi2 < 10'),
muons = cms.InputTag("muons"),
nmuons = cms.int32(1),
numGenericTriggerEventPSet = cms.PSet(
ReadPrescalesFromFile = cms.bool(True),
andOr = cms.bool(False),
andOrDcs = cms.bool(False),
andOrHlt = cms.bool(True),
andOrL1 = cms.bool(True),
dbLabel = cms.string(''),
dcsInputTag = cms.InputTag("scalersRawToDigi"),
dcsPartitions = cms.vint32(),
dcsRecordInputTag = cms.InputTag("onlineMetaDataDigis"),
errorReplyDcs = cms.bool(False),
errorReplyHlt = cms.bool(False),
errorReplyL1 = cms.bool(True),
hltDBKey = cms.string(''),
hltInputTag = cms.InputTag("TriggerResults","","HLT"),
hltPaths = cms.vstring('HLT_Dimuon25_Jpsi_noCorrL1_v*'),
l1Algorithms = cms.vstring(),
l1BeforeMask = cms.bool(True),
l1tAlgBlkInputTag = cms.InputTag("gtStage2Digis"),
l1tExtBlkInputTag = cms.InputTag("gtStage2Digis"),
stage2 = cms.bool(True),
verbosityLevel = cms.uint32(0)
),
offlinePVs = cms.InputTag("offlinePrimaryVertices"),
photons = cms.InputTag("photons"),
ptCut = cms.int32(0),
requireValidHLTPaths = cms.bool(True),
seagull = cms.int32(0),
stageL1Trigger = cms.uint32(2),
tnp = cms.bool(True),
trOrMu = cms.int32(0),
trSelection_ref = cms.string(''),
tracks = cms.InputTag("generalTracks")
)
which seems to show the inputs ARE specified. So something is up.
I don't fully follow yet, but I leave a quick guess: I suspect an instance of HLTPrescaleProvider
(created by BPHMonitor
) is creating a L1TGlobalUtil
which in turn makes a L1TGlobalUtilHelper
with empty InputTag
s, and that could lead to the routine searching for the preferred input collection, then the warning/error (links below).
Just for my education: in IB CMSSW_12_5_X_2022-08-09-1100
, I see 11634.911
failing for el8_amd64_gcc1*
, but not for slc7_amd64_gcc10
. Why is that?
https://github.com/cms-sw/cmssw/blob/e52f99a59bb77ceec0102769446b26c321662ae8/DQMOffline/Trigger/plugins/BPHMonitor.cc#L237 https://github.com/cms-sw/cmssw/blob/e52f99a59bb77ceec0102769446b26c321662ae8/HLTrigger/HLTcore/interface/HLTPrescaleProvider.h#L126 https://github.com/cms-sw/cmssw/blob/e52f99a59bb77ceec0102769446b26c321662ae8/L1Trigger/L1TGlobal/interface/L1TGlobalUtil.h#L255 https://github.com/cms-sw/cmssw/blob/e52f99a59bb77ceec0102769446b26c321662ae8/L1Trigger/L1TGlobal/interface/L1TGlobalUtilHelper.h#L95
So there appears to be a 'hidden' use of L1TGlobalUtilHelper
in BPHMonitor
and it is passed the "wrong" configuration.
where the module's full configuration is passed in which goes to https://github.com/cms-sw/cmssw/blob/e52f99a59bb77ceec0102769446b26c321662ae8/HLTrigger/HLTcore/interface/HLTPrescaleProvider.h#L126
and finally to
and the top level configuration for the module (which is what was passed) is of no use here and therefore we get the bad behavior.
So the question is, what is the purpose of the HLTPrescaleProvider in BPHMonitor and how was is SUPPOSED to be configured?
Just for my education: in IB
CMSSW_12_5_X_2022-08-09-1100
, I see11634.911
failing forel8_amd64_gcc1*
, but not forslc7_amd64_gcc10
. Why is that?
No clue, but I'm also wondering why we see the failures only in the .911 (DD4Hep) and .914 (DDD) workflow variations rather than everywhere.
I feel that the underlying failure in the design of this code is the use of existAs
to test if a parameter is available.
So the question is, what is the purpose of the HLTPrescaleProvider in BPHMonitor
It is used in various ways inside BPHMonitor
, for one to get the value of the prescale(s) of certain triggers, and this info is used to fill certain DQM outputs.
and how was is SUPPOSED to be configured?
I'd say using gtStage2Digis
.
Taking wf 11634.911
as an example, hltGtStage2Digis
(L1T digis of step-2) and gtStage2Digis
(L1T digis of step-3) are not literally the same, but I think they are effectively the same [*]. Other parts of the DQM module of type BPHMonitor
(and this likely applies to other DQM modules) use gtStage2Digis
, so that would likely be the one to pass to HLTPrescaleProvider
.
I plan to have a deeper look and, since this involves HLTPrescaleProvider
, I will assign HLT.
[*]
hltGtStage2Digis = cms.EDProducer("L1TRawToDigi",
CTP7 = cms.untracked.bool(False),
DmxFWId = cms.uint32(0),
FWId = cms.uint32(0),
FWOverride = cms.bool(False),
FedIds = cms.vint32(1404),
InputLabel = cms.InputTag("rawDataCollector"),
MTF7 = cms.untracked.bool(False),
MinFeds = cms.uint32(0),
Setup = cms.string('stage2::GTSetup'),
TMTCheck = cms.bool(True),
debug = cms.untracked.bool(False),
lenAMC13Header = cms.untracked.int32(8),
lenAMC13Trailer = cms.untracked.int32(8),
lenAMCHeader = cms.untracked.int32(8),
lenAMCTrailer = cms.untracked.int32(0),
lenSlinkHeader = cms.untracked.int32(8),
lenSlinkTrailer = cms.untracked.int32(8)
)
gtStage2Digis = cms.EDProducer("L1TRawToDigi",
FedIds = cms.vint32(1404),
InputLabel = cms.InputTag("rawDataCollector"),
MinFeds = cms.uint32(1),
Setup = cms.string('stage2::GTSetup')
)
In my understanding, the parameters not specified in gtStage2Digis
use the default values from fillDescriptions
. Comparing to those, the only difference between the two modules is MinFeds
, which is 0
at HLT and 1
in step-3. On the other hand, this parameter only controls the appearance of a LogWarning, so the actual numbers inside hltGtStage2Digis
and gtStage2Digis
should be the same (unless I'm missing something..).
assign hlt
New categories assigned: hlt
@missirol,@Martin-Grunewald you have been requested to review this Pull request/Issue and eventually sign? Thanks
In my understanding, the parameters not specified in gtStage2Digis use the default values from fillDescriptions.
Unfortunately, the BPHMonitor::fillDescriptions
does not contain entries needed to configure the InputTags used by the HLTPrescaleProvider
. I'm presently working on a PR that does add the needed parameters and sets them to gtStage2Digis
by default.
I'd say using
gtStage2Digis
.
I agree.
In general (for example, during production jobs), hltGtStage2Digis
should not be available during step 3.
+hlt
Starting from
CMSSW_12_5_X_2022-08-09-1100
, some workflows are failing with:e.g. 11634.911. First spotted in https://github.com/cms-sw/cmssw/pull/38760#issuecomment-1209651811.