DUNE-DAQ / trigger

Trigger infrastructure of the DUNE DAQ

Asztuc/ta to tc workflow #327

Closed ArturSztuc closed 3 months ago

ArturSztuc commented 3 months ago

This PR changes the TriggerActivity workflow from TAM->TASet->TCM to TAM->TA->TCM: TASets are removed, individual TAs are sent as soon as they are made, and the TAZipper is removed. This significantly reduces latency, from seconds or tens of seconds to tens or hundreds of milliseconds.

This PR goes together with https://github.com/DUNE-DAQ/daqconf/pull/490

Full list of changes:

  1. Changing TAMaker to use the TXSet-in, TZ-out template (TPSet in, TA out), rather than TXSet-in, TZSet-out template.
  2. Removing TAZipper.
  3. Adding a new TATee to replace the TASetTee.
  4. Changing the TABuffer to deal with TriggerActivity input rather than TASet input.
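
As a rough illustration of change 1, the sketch below shows a "set in, object out" maker in Python. It is not the trigger package's C++ template or API: `TP`, `TPSet`, `TA`, `ToyTAMaker`, and the window/threshold logic are all hypothetical stand-ins, chosen only to show a TA being handed downstream the moment it is made instead of being held back in an output set.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class TP:  # hypothetical stand-in for a TriggerPrimitive
    time_start: int
    adc_integral: int


@dataclass
class TPSet:  # hypothetical stand-in for a set of TPs covering a time range
    start_time: int
    end_time: int
    objects: List[TP] = field(default_factory=list)


@dataclass
class TA:  # hypothetical stand-in for a TriggerActivity
    time_start: int
    time_end: int
    adc_integral: int


class ToyTAMaker:
    """Consumes TPSets and passes each TA to `sink` as soon as it is made."""

    def __init__(self, sink: Callable[[TA], None], window_ticks: int, adc_threshold: int):
        self.sink = sink
        self.window_ticks = window_ticks
        self.adc_threshold = adc_threshold
        self._window: List[TP] = []

    def process(self, tpset: TPSet) -> None:
        for tp in tpset.objects:
            # Keep only TPs that are still inside the sliding time window.
            self._window = [
                w for w in self._window
                if tp.time_start - w.time_start < self.window_ticks
            ]
            self._window.append(tp)
            total = sum(w.adc_integral for w in self._window)
            if total > self.adc_threshold:
                # No output set is filled here: the TA leaves immediately.
                self.sink(TA(self._window[0].time_start, tp.time_start, total))
                self._window.clear()


sent: List[TA] = []
maker = ToyTAMaker(sent.append, window_ticks=100, adc_threshold=50)
maker.process(TPSet(0, 1000, [TP(10, 30), TP(20, 30), TP(500, 10)]))
```

In this toy run the TA is delivered to `sent` while the maker is still iterating over the input set, which is the behaviour the new workflow relies on for lower latency.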

Potential issue: with the TAZipper removed, TAs may reach the TCMaker out of time order when there are multiple data streams (subdetectors, planes). This could be a problem for the DFO if it is not written to handle tardy inputs, and for more complex TCMakers that require ordered TAs. We do not use such TCMakers in PD2, however, so this update targets v4 for PD2 running, and these issues are already resolved in v5.
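
The ordering concern can be sketched with a toy example. Nothing here is taken from the TAZipper implementation: the stream names, timestamps, and the round-robin arrival model are assumptions made for illustration, and `heapq.merge` stands in for any stage that restores time order across streams.

```python
import heapq
from typing import Iterable, List


def is_time_ordered(times: List[int]) -> bool:
    """True if the timestamps never go backwards."""
    return all(a <= b for a, b in zip(times, times[1:]))


def arrival_order(*streams: Iterable[int]) -> List[int]:
    """Round-robin interleave: a crude model of TAs being forwarded
    as soon as each stream produces them, with no reordering stage."""
    out: List[int] = []
    iters = [iter(s) for s in streams]
    while iters:
        for it in list(iters):
            try:
                out.append(next(it))
            except StopIteration:
                iters.remove(it)
    return out


# Hypothetical TA start times from two streams, each ordered on its own.
plane_a = [100, 400, 900]
plane_b = [250, 300, 1200]

unzipped = arrival_order(plane_a, plane_b)
zipped = list(heapq.merge(plane_a, plane_b))

print(unzipped, is_time_ordered(unzipped))
print(zipped, is_time_ordered(zipped))
```

Each stream is ordered by itself, but the interleaved sequence is not, so a consumer that assumes monotonic timestamps would see a tardy input. A merging stage fixes the order, at the price of waiting for the slower stream.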

Tests:

  1. 3ru_3df_multirun_test.py from https://github.com/DUNE-DAQ/daqsystemtest passes.
  2. Compiles and runs; the output from the replay application and from offline runs with asset files does not look different from before.
  3. Tested in PD2HD with the ADCSimpleWindow algorithm targeting ground shakes, using the new TA workflow. Latencies are significantly reduced and ground shakes are still visible as before. Details HERE on slides 5, 6, 7.

MRiganSUSX commented 3 months ago

I can confirm this passes a selection of commonly used integration tests: minimal_system_quick_test.py, small_footprint_quick_test.py, tpstream_writing_test.py, fake_data_producer_test.py, 3ru_1df_multirun_test.py.

Tested offline (combined with https://github.com/DUNE-DAQ/trigger/pull/328) and there were significant improvements to latencies (from 10+ s avg (40 s peaks) to <1 s avg (<3 s peaks)).

Additionally tested with np04; after tuning of the MLT parameters, it also performed significantly better than the system before this change. It should be mentioned that the tests were constructed to work with simple algos (prescale, ADCSW); there could be issues for complex algos where the time variation between making TAs is bigger. As a PD2-aimed change, this performed great.
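
For readers unfamiliar with the "simple algos" mentioned above, here is a minimal sketch of a prescale-style selector. It assumes "prescale" means "accept one in every N inputs"; the class name `ToyPrescale` and its interface are hypothetical and are not the trigger package's implementation.

```python
class ToyPrescale:
    """Accepts every `prescale`-th input and rejects the rest."""

    def __init__(self, prescale: int):
        if prescale < 1:
            raise ValueError("prescale must be >= 1")
        self.prescale = prescale
        self._count = 0

    def accept(self) -> bool:
        # Count every input; fire only when the count hits a multiple of N.
        self._count += 1
        return self._count % self.prescale == 0


selector = ToyPrescale(3)
decisions = [selector.accept() for _ in range(7)]
print(decisions)
```

An algorithm of this kind does a fixed, small amount of work per input, which is one reason to expect less spread in TA creation time than with more complex algorithms.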