Open Dr15Jones opened 6 years ago
A new Issue was created by @Dr15Jones Chris Jones.
@davidlange6, @Dr15Jones, @smuzaffar, @fabiocos, @kpedro88 can you please review it and eventually sign/assign? Thanks.
cms-bot commands are listed here
Looking deeper at workflow 10224, I see the histograms arew generated in step3 by the module labelled l1tStage2CaloLayer2
. Dumping that module gives
>>> print process.l1tStage2CaloLayer2.stage2CaloLayer2TauSource
cms.InputTag("caloStage2Digis","Tau")
Then looking at caloStage2Digis
we see
>>> print process.caloStage2Digis.dumpPython()
cms.EDProducer("L1TRawToDigi",
FWId = cms.uint32(0),
FWOverride = cms.bool(False),
FedIds = cms.vint32(1360, 1366),
MinFeds = cms.uint32(1),
Setup = cms.string('stage2::CaloSetup'),
TMTCheck = cms.bool(True)
)
This appears to use plugins to do the actual creating of the data to be put into the event. It appears the Taus have their hardware Phi set here:
the value of raw_data
comes from the object passed into the function:
https://github.com/cms-sw/cmssw/blob/78c11d459eefc961686b4b122d2e2850b0cb3ab2/EventFilter/L1TRawToDigi/plugins/implementations_stage2/TauUnpacker.cc#L44
The allowed values for the Phi histogram is 0 10 143: https://github.com/cms-sw/cmssw/blob/8af7aff587d8f0cec517e6606d0ae4153fd20668/DQM/L1TMonitor/src/L1TStage2CaloLayer2.cc#L82
From the data encoding, phi can be from 0 to 255. Therefore if the high bit was someone set on, then we could get an overflow. However, looking at the value of the 'missing' Phi, it appears to be ~134 which is a value that already has the 8th bit set.
assign l1
New categories assigned: l1
@rekovic,@nsmith-,@thomreis you have been requested to review this Pull request/Issue and eventually sign? Thanks
From the best I can tell, here is where the packing is done https://github.com/cms-sw/cmssw/blob/09c3fce6626f70fd04223e7dacebf0b485f73f54/EventFilter/L1TRawToDigi/plugins/implementations_stage2/TauPacker.cc#L24
@Dr15Jones I think that the variations observed in https://github.com/cms-sw/cmssw/pull/24274#issuecomment-412595608 falls in the same category
@rekovic @thomreis there has been any follow-up on this issue? We got reports of non reproducibility even recently...
@Dr15Jones @fabiocos We are looking into this. attn @thomreis @bundocka
@rekovic @thomreis this is now hitting also in the test of unrelated code as TBB: https://github.com/cms-sw/cmsdist/pull/4415
I have tried to run WF 136.788 in CMSSW_10_3_0_pre6 with and without PR #24837 and get exactly the same number of entries (759) in both cases. Also in agreement with the blue spectrum in [1]. I don't know if the blue one is with or without 24837. So it looks as if the error is not reproducible by just running the matrix.
There is probably uninitialized memory somewhere. I recommend running valgrind. NB, you should use cmsRunGlibC
instead of cmsRun
with valgrind.
So I was looking in detail at the results from
L1TStage2CaloLayer2
. The histograms that are showing differences are booked in this directory: https://github.com/cms-sw/cmssw/blob/8af7aff587d8f0cec517e6606d0ae4153fd20668/DQM/L1TMonitor/src/L1TStage2CaloLayer2.cc#L78and are filled within this
if
block: https://github.com/cms-sw/cmssw/blob/8af7aff587d8f0cec517e6606d0ae4153fd20668/DQM/L1TMonitor/src/L1TStage2CaloLayer2.cc#L239The comparison tests say only TausPhi, TausEtEtaPhi and TausOcc differ by 1 entry, while TausEta and TausQual histograms are filled that the same time but do not show any difference in entries. I interpret this to mean that we do have the exact same number of Taus in the two jobs. Now the three histograms showing differences all use Phi in their binning. In we had a Phi that was shifted to be out of range of the histogram binning, that would explain the histograms not depending on Phi in the two job show the same entries while the ones use Phi differ by one.
I believe @slava77 has said that some of the L1 plots are known to have differences based on changes in underflow/overflow, so this is probably one of those cases.
[originally comment in #24270 ]