Closed mommsen closed 6 years ago
Thank you for reporting this case. The diagnosis was not triggered because the conditions were oscillating around their thresholds: some were satisfied for a single snapshot while others were not. This case involved an upgraded FED, so we required the TTS deadtime to be greater than 2%. It was 1.78%.
Backpressure from EvB has the following conditions:
[1] Deadtime due to DAQ has the following conditions:
[2] Upgraded FED problem has the following conditions:
exists TTSDeadtime (for that snapshot 1.78%, thus not satisfied)
exists upgraded FED that has backpressure greater than threshold (2%)
Note that we have no per-FED deadtime.
Note that a few minutes later there was a short occurrence of Backpressure from FEROL:
http://daq-expert.cms:8080/DAQExpert/?start=2018-08-14T17:56:19.129Z&end=2018-08-14T17:57:19.129Z
This situation will recur as long as the conditions oscillate around their thresholds. We could think of a solution that prevents conditions from repeatedly appearing and fading. The first thing that comes to my mind is to fire the LM at a given threshold X and keep it satisfied until the value drops below 0.5X. If you have other ideas, please let me know.
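To make the proposal concrete, here is a minimal sketch of such a hysteresis check. The class `HysteresisCondition` and its method names are hypothetical and are not taken from the DAQExpert code base. Only the 2% threshold and the 1.78% reading come from this thread, and the other snapshot values are made up for illustration.

```java
/** Hypothetical helper, not part of DAQExpert: a threshold check with hysteresis. */
final class HysteresisCondition {
    private final double fireThreshold;    // X: the condition fires above this value
    private final double releaseThreshold; // 0.5 * X: the condition is released below this value
    private boolean satisfied = false;

    HysteresisCondition(double fireThreshold) {
        this.fireThreshold = fireThreshold;
        this.releaseThreshold = 0.5 * fireThreshold;
    }

    /** Feeds one snapshot value and returns whether the condition is satisfied afterwards. */
    boolean update(double value) {
        if (!satisfied && value > fireThreshold) {
            satisfied = true;   // crossed X upwards: start firing
        } else if (satisfied && value < releaseThreshold) {
            satisfied = false;  // dropped below 0.5X: stop firing
        }
        return satisfied;
    }

    public static void main(String[] args) {
        HysteresisCondition deadtime = new HysteresisCondition(2.0);
        // Illustrative snapshot values in percent (assumed, apart from 1.78)
        double[] snapshots = {1.78, 2.3, 1.78, 1.2, 0.9, 1.78};
        for (double v : snapshots) {
            System.out.println(v + "% -> " + deadtime.update(v));
        }
    }
}
```

With X = 2%, a reading of 1.78% does not fire the condition on its own, but it keeps an already firing condition alive, because the release level is 1%. The condition is only dropped once a value falls below 1%.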
IMHO, we should drop the requirement on TTSDeadtime if a FED gets backpressured from DAQ. However, I see several instances where TTS is > 2%, e.g. http://daq-expert.cms/daq2view-react/index_fb_dt.html?setup=cdaq&time=2018-08-14-19:46:28
@mommsen, I identified another factor that prevented this condition from firing in this period. Since this is an upgraded FED, the individual deadtime was not available. The LM needed to verify which FEDs had deadtime and never found any upgraded FEDs. I fixed that.
I ran the new version of DAQExpert on this period and received the following result:
Backpressure from Event Building (i.e. not from HLT). Exists FEDBuilders with backpressure to FEDs 1386 (( last: 5.3%, avg: 6.9%, min: 5.3%, max: 8.5%)) and 0 requests on RU, 256 fragments in RU. EVM has few (( last: 0, avg: 0.5, min: 0, max: 1), the threshold is <100) requests. All BUs are enabled.
Depending on whether we turn off the TTSDeadtime requirement, we have the following number of occurrences:
Great, thanks for finding this bug! Are these entries only from 19:40-20:00 on Aug 14, or did you find other instances, too?
Let's discuss with Hannes next week whether we should drop the TTSDeadtime condition when there is backpressure from DAQ.
Remi
Only from that evening. For the time being I will prepare a new release to be deployed at the next opportunity. I am leaving this issue open so that we can decide with @hsakulin what to do with TTSDeadtime.
@gladky, when do you plan to release a new version which includes this fix?
@remi I was traveling this weekend and will be back tomorrow. We could schedule it for tomorrow.
Cheers Maciej
On Tue, 21 Aug 2018, 9:23 a.m. Remi Mommsen, notifications@github.com wrote:
@gladky https://github.com/gladky, when do you plan to release a new version which includes this fix?
The related hotfix is in 2.15.1. The last thing to do is to merge entries when they are interrupted by short monitoring fluctuations. I am opening a separate issue for this: #238
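As a rough illustration of what merging interrupted entries could look like, here is a sketch under assumptions. The `EntryMerger` class, the `Entry` model (start and end in milliseconds) and the `maxGapMs` tolerance are all hypothetical and do not reflect how DAQExpert stores or merges entries.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch, not DAQExpert code: merges entries separated by short gaps. */
final class EntryMerger {

    /** A diagnosis entry as a time interval in milliseconds (assumed model). */
    static final class Entry {
        final long start;
        final long end;

        Entry(long start, long end) {
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Merges consecutive entries whose gap is at most maxGapMs.
     * The input list must already be sorted by start time.
     */
    static List<Entry> merge(List<Entry> sorted, long maxGapMs) {
        List<Entry> merged = new ArrayList<>();
        Entry current = null;
        for (Entry e : sorted) {
            if (current != null && e.start - current.end <= maxGapMs) {
                // Gap is short enough: extend the entry being built
                current = new Entry(current.start, Math.max(current.end, e.end));
            } else {
                if (current != null) {
                    merged.add(current);
                }
                current = e;
            }
        }
        if (current != null) {
            merged.add(current);
        }
        return merged;
    }
}
```

For example, with `maxGapMs = 1000`, the entries [0, 1000], [1500, 3000] and [10000, 11000] would collapse into [0, 3000] and [10000, 11000]. The tolerance would need to be tuned to the snapshot period.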
Hi,
yesterday evening we had a case where the EvB was causing backpressure on FED 1386: http://daq-expert.cms/daq2view-react/index.html?setup=cdaq&time=2018-08-14-19:53:18
The DAQExpert did not diagnose that the backpressure was coming from the EvB, even though the conditions were satisfied, i.e. FEDBuilders with backpressure to FEDs and 0 requests on ru-c2e14-16-01, 256 fragments in RU, and EVM has few (<100) requests. All BUs were enabled.
Do you trigger this diagnostic only when there is no rate? This should be triggered whenever there is backpressure from DAQ, imho.
Cheers, Remi