I was talking with Wes just now about some current issues in tracking and transmitting routing information in artdaq. We came up with an idea for a model in which there is an in-memory 'database' on each DAQ node with user-definable regions (tables?). Some regions would be required to be replicated to every other node in a very performant way; other regions would not need to be replicated, but could be examined by monitoring processes.
[I'm mentioning this to you as input to our discussion about areas that people might contribute to as part of the UK/FNAL DUNE back-end DAQ collaboration.]
We could store the status of the EventBuilder buffers in the replicated region and use that information to determine fragment routing.
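As a rough sketch of how routing could read from the replicated region (Python for brevity; this is not artdaq code), assume every node sees the same mapping from buffer to status. The `pick_destination` function, the `(node, buffer)` key layout, the node names, and the "most free buffers wins" policy are all hypothetical, chosen only to illustrate the idea.

```python
from typing import Dict, Optional, Tuple

# Hypothetical replicated view: (node name, buffer index) -> status string.
BufferKey = Tuple[str, int]


def pick_destination(replicated: Dict[BufferKey, str]) -> Optional[str]:
    """Return the EventBuilder node with the most free buffers, or None."""
    free_counts: Dict[str, int] = {}
    for (node, _buffer), status in replicated.items():
        if status == "free":
            free_counts[node] = free_counts.get(node, 0) + 1
    if not free_counts:
        return None  # nothing free anywhere: the caller must wait or apply back-pressure
    # Iterate in name order so ties break deterministically on every node.
    return max(sorted(free_counts), key=lambda n: free_counts[n])


# Example view, identical on every node because the region is replicated.
view = {
    ("eb01", 0): "filling",
    ("eb01", 1): "free",
    ("eb02", 0): "free",
    ("eb02", 1): "free",
}
destination = pick_destination(view)
```

Because every node evaluates the same function on the same replicated data, they would all reach the same routing decision without a central router, provided the replication is fast enough to keep the views consistent.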
The non-replicated regions could be used to store information that is useful in monitoring, but is not needed for routing.
In our discussion, the monitoring is a central, required piece of this. I think of the CDF Level3 Trigger 'flashing boxes' display that showed the status of every buffer and process in the L3 system.
An example of something that would go into the replicated section of the DB: the status of each EventBuilder buffer (free, assigned, filling, full, reading, whatever). An example of something that would go into the non-replicated part is the number of fragments that have been received into a buffer that is filling. This latter information would be useful for drill-down monitoring, but wouldn't be needed for routing.
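The split described above could be modelled roughly as follows (Python for brevity; this is not artdaq code). The state names come from the list above; the `NodeTables` class, its field names, and the two accessor methods are hypothetical and only illustrate which data would cross the network and which would stay local.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict


class BufferState(Enum):
    FREE = "free"
    ASSIGNED = "assigned"
    FILLING = "filling"
    FULL = "full"
    READING = "reading"


@dataclass
class NodeTables:
    """Hypothetical per-node in-memory 'database' with two regions."""
    node: str
    # Replicated region: buffer index -> state, copied to every other node.
    replicated: Dict[int, BufferState] = field(default_factory=dict)
    # Non-replicated region: buffer index -> fragments received so far.
    local: Dict[int, int] = field(default_factory=dict)

    def replication_payload(self) -> Dict[int, str]:
        """Only the replicated region is sent to other nodes."""
        return {buf: state.value for buf, state in self.replicated.items()}

    def monitoring_view(self) -> Dict[int, dict]:
        """A local monitoring process can drill down into both regions."""
        return {
            buf: {"state": state.value,
                  "fragments_received": self.local.get(buf, 0)}
            for buf, state in self.replicated.items()
        }


tables = NodeTables("eb01")
tables.replicated[0] = BufferState.FILLING
tables.local[0] = 17  # fragment count stays on this node
tables.replicated[1] = BufferState.FREE
```

Keeping the replication payload down to one small state value per buffer is what would make fast replication plausible; the fragment counts can change on every received fragment without generating any network traffic.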
Clearly, the replication would need to be very performant if we want to use it for routing in a high-event-rate system. But this project could/should start from the monitoring side, which is valuable in and of itself.
This issue has been migrated from https://cdcvs.fnal.gov/redmine/issues/20456 (FNAL account required). Originally created by @bieryAtFnal on 2018-07-26 15:21:47.
Here is part of the email that I sent to Giles:
Related issues: