cmsdaq / DAQExpert

New expert system processing data model produced by DAQAggregator

uncaught case in BackpressureAnalyzer #145

Open andreh12 opened 6 years ago


In this case, the RuFailed module triggered, which is a catch-all module for otherwise unidentified problems with RUs in failed state: http://daq-expert-dev.cms/DAQExpert/?start=2017-11-02T12:57:12.067Z&end=2017-11-02T13:03:05.011Z

The snapshot linked to by DAQExpert has the timestamp 1509627553663 (file 1509627553663.json.gz).

The error message was:

"Caught exception: exception::DataCorruption 'Received a corrupted event 2 from FED 1221 (PIXEL): FED header \"eventid\" 1187777 does not match the eventNumber found in FEROL header, and inconsistent event size: FED trailer claims 208 Bytes, while sum of FEROL headers yield 416.  In addition, the FED trailer indicates that wrong slink CRC checksum was found by the FEROL (FED trailer C bit is set). Received 2 as first event number (should be 1.) Have the buffers not be drained?' raised at reportErro"

We should detect this case in BackpressureAnalyzer and either treat it the same as CorruptedDataReceived or OutOfSequenceDataReceived, or introduce a new subcase. (The immediate reason the RU fails is the out-of-sequence data, but the more fundamental problem is that the data sent is corrupted.)
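
To make the proposal concrete, here is a minimal sketch of how such an error message could be classified. It is written in Python purely for illustration and is not the BackpressureAnalyzer code: the `Subcase` enum, the `classify` function, and both regular expressions are hypothetical, and the patterns are derived only from the single message quoted above, so other RU error messages may need different ones.

```python
import re
from enum import Enum


class Subcase(Enum):
    # The first two names come from the issue text; the third is a
    # hypothetical new subcase, and RuFailed is the existing catch-all.
    CORRUPTED_DATA = "CorruptedDataReceived"
    OUT_OF_SEQUENCE = "OutOfSequenceDataReceived"
    CORRUPTED_AND_OUT_OF_SEQUENCE = "hypothetical new subcase"
    UNIDENTIFIED = "RuFailed"


# Assumed patterns, based only on the message quoted in this issue.
CORRUPTED = re.compile(r"exception::DataCorruption|Received a corrupted event")
OUT_OF_SEQUENCE = re.compile(
    r"Received (\d+) as first event number \(should be (\d+)\.?\)"
)


def classify(message: str) -> Subcase:
    """Map an RU error message to a subcase (illustrative only)."""
    corrupted = CORRUPTED.search(message) is not None
    out_of_sequence = OUT_OF_SEQUENCE.search(message) is not None
    if corrupted and out_of_sequence:
        return Subcase.CORRUPTED_AND_OUT_OF_SEQUENCE
    if corrupted:
        return Subcase.CORRUPTED_DATA
    if out_of_sequence:
        return Subcase.OUT_OF_SEQUENCE
    return Subcase.UNIDENTIFIED


# Shortened excerpt of the message from this issue.
MESSAGE = (
    "Caught exception: exception::DataCorruption 'Received a corrupted "
    "event 2 from FED 1221 (PIXEL): ... Received 2 as first event number "
    "(should be 1.) Have the buffers not be drained?'"
)

print(classify(MESSAGE).name)
```

Checking for both signals before either one alone is what keeps this message from being reported only as out-of-sequence data, or from falling through to the catch-all. Whether the combined case deserves its own subcase or should be folded into CorruptedDataReceived is the open question of this issue.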