Open gladky opened 7 years ago
We should also check against very low physics trigger rates at input, as suggested in #92.
In today's run coordination meeting, run coordination expressed strong interest in having this implemented.
@fwyzard also mentioned that we could distinguish two cases:
Indeed, using the wrong column could give some high rate, maybe also 200 kHz (e.g. using the 1e34 column at ~1.8e34...). On the other hand, noisy towers can give rates in the range of 150-300 kHz.
So I'm not sure what would be the best cut between "wrong prescale column" and "detector effects".
.A
I think we should really get this check online as soon as possible. Last night we had an instance of a hot ECAL trigger tower which caused a very high trigger rate. CT-PPS FEDs were in busy:
http://daq-expert.cms/daq2view-react/index.html?setup=cdaq&time=2017-08-18-05:13:00
Unfortunately, the DAQ shifter red-recycled CT-PPS twice before the shift crew took the right action and red-recycled ECAL. We lost about 15 minutes of stable-beam time due to this ):
By the way, I would propose to put the threshold between the two cases (wrong L1 prescale vs detector misbehaving) at 200 kHz to start with, and adjust it later as needed.
.A
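A minimal sketch of how such a cut could be applied. The function name, the returned labels, and the 150 kHz lower bound are hypothetical, and the direction of the split (below 200 kHz points to the prescale column, above it to the detector) is an assumption, not something settled in this thread:

```python
HIGH_RATE_HZ = 150_000   # assumed lower bound for "rate too high" (hypothetical)
CASE_SPLIT_HZ = 200_000  # proposed starting cut between the two cases


def classify_input_rate(rate_hz, high_hz=HIGH_RATE_HZ, split_hz=CASE_SPLIT_HZ):
    """Map a trigger input rate in Hz to a suggested first action."""
    if rate_hz <= high_hz:
        return "ok"
    if rate_hz <= split_hz:
        # Assumption: moderately high rates hint at a wrong L1 prescale column.
        return "check L1 prescale column"
    # Assumption: rates above the cut hint at a misbehaving detector.
    return "check for misbehaving detector"
```

Keeping both thresholds as parameters makes it easy to adjust the cut later as needed.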
One way to sort out the two cases would be to instruct the DAQ shifter to either check the individual level 1 rates or ask for them to be checked by the trigger shifter. This could be in the message issued by the expert. I think it is not inconceivable to provide the DAQExpert with some input from the L1 to be able to distinguish the two. I think, but I am not sure, that one quick, though not very accurate, way is to check the HLT Physics output rate. In the case of a wrong column it will be distinguishably too high, whereas in the case of a hot tower it should remain within reasonable values...
What you describe is already the job of the trigger shifter...
Yes, the DAQExpert could check the pre/post deadtime of a few individual L1 triggers: L1_SingleMu##, L1_SingleEG##, L1_SingleJet##, L1_HTT##, L1_ETM##, L1_ETMHF## to suggest if one specific subdetector is causing trouble.
It could also compare the current luminosity with the prescale column...
.A
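A rough sketch of the luminosity-versus-column comparison. It assumes each prescale column can be labelled by the luminosity it was designed for, and that the expected column is the lowest one whose design luminosity covers the current luminosity. The function names and the list of column luminosities are hypothetical:

```python
def expected_column(current_lumi, column_lumis):
    """Pick the lowest design luminosity that still covers current_lumi.

    Falls back to the highest column if none covers it.
    """
    candidates = [lumi for lumi in sorted(column_lumis) if lumi >= current_lumi]
    return candidates[0] if candidates else max(column_lumis)


def column_looks_wrong(active_column_lumi, current_lumi, column_lumis):
    """True when the active column was designed for a lower luminosity."""
    return active_column_lumi < expected_column(current_lumi, column_lumis)


# Hypothetical set of columns, labelled by design luminosity in cm^-2 s^-1.
COLUMNS = [1.0e34, 1.5e34, 2.0e34]
```

With these hypothetical columns, `column_looks_wrong(1.0e34, 1.8e34, COLUMNS)` flags the situation described above (the 1e34 column at ~1.8e34).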
There was another identical case of blaming the CT-PPS FEDs while ECAL was causing a high trigger rate: http://daq-expert.cms/daq2view-react/index.html?setup=cdaq&time=2017-08-20-21:22:57
@hsakulin suggested monitoring the trigger rate before L1. The expected value is below 150 kHz, but here is the case reported by @hsakulin where it was 40 MHz:
http://daq-expert.cms/DAQExpert/?start=2017-06-26T06:26:24.551Z&end=2017-06-26T09:05:25.655Z
This resulted in deadtime. We have the values in the snapshot:
The way the LM will work is to compare the sum of `sup_trg_rate_total` and `trg_rate_total` to a threshold of 150 kHz.
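A minimal sketch of that comparison, assuming the snapshot exposes both rates in Hz under the names above. The function name and the dictionary form of the snapshot are hypothetical and do not reflect DAQExpert's actual API:

```python
RATE_THRESHOLD_HZ = 150_000  # 150 kHz, the threshold quoted above


def input_rate_too_high(snapshot, threshold_hz=RATE_THRESHOLD_HZ):
    """Return True when the summed trigger rate exceeds the threshold.

    `snapshot` is assumed to be a mapping holding both rates in Hz.
    """
    total_hz = snapshot["sup_trg_rate_total"] + snapshot["trg_rate_total"]
    return total_hz > threshold_hz
```

For a case like the 40 MHz one reported above, the sum is far above the threshold and the check returns `True`.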