MWATelescope / mwax_mover

MWA Correlator (mwax) mover. Gluing components together.

Extract packet bitmap, transform and provide to M&C for grafana, etc #26

Open gsleap opened 3 months ago

gsleap commented 3 months ago

@andreww5au @shrydar for your comment/review/suggestions

Intro

We want to pass the packet bitmap info in some form to the M&C system for plotting and historical trend analysis.

How might it work?

MWAX_subfile_processor should, for every subfile it handles:

Perhaps the data can be summarised by rf_input in 1-second averages? e.g. mwax_mover on each MWAX server would produce, per 8 seconds:

So it will be trivial to average 625 (or 800) bits per second per rf_input and store the result.

This would mean that at 256T:

Maybe this could be fed into influx db and averaged down as it ages?

Questions

shrydar commented 3 months ago

Yes, an average per second sounds good - or even just a raw count of lost packets (so, up to 625 or 800) would only need 10 bits at most, so a 16-bit integer would be just fine.
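
A minimal sketch of the storage arithmetic above, assuming one unsigned 16-bit little-endian value per second per input. The names `pack_counts` and `unpack_counts` and the byte order are hypothetical, chosen for illustration only; nothing here is an agreed mwax_mover format.

```python
import struct

# Upper bound on lost packets per second per input mentioned above (625 or 800).
MAX_LOST_PER_SECOND = 800


def pack_counts(counts):
    """Pack per-second lost-packet counts as little-endian uint16 values.

    Hypothetical helper: the byte order and layout are assumptions.
    """
    for count in counts:
        if not 0 <= count <= MAX_LOST_PER_SECOND:
            raise ValueError(f"count out of range: {count}")
    return struct.pack(f"<{len(counts)}H", *counts)


def unpack_counts(payload):
    """Inverse of pack_counts: recover the list of counts."""
    n_values = len(payload) // 2
    return list(struct.unpack(f"<{n_values}H", payload))


# 800 needs 10 bits, so it fits comfortably in a 16-bit integer.
bits_needed = MAX_LOST_PER_SECOND.bit_length()

# One 8-second subfile for one input: 8 counts, 16 bytes.
example = pack_counts([0, 3, 625, 800, 0, 0, 12, 1])
```

At two bytes per value, one input costs 16 bytes per 8-second subfile before any container overhead.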

I'd been considering run-length encoding the lost packet map, but a count would be much simpler to implement and in practice gives about the same amount of information. We could perhaps also include a count of the number of runs of missing packets though? That's only one extra number per input per second, would max out at 400 groups per input per second, and would give us some idea about whether we're seeing a random scattering of unreliable transfers or just a few sizable blocks of downtime.
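
A rough sketch of the run count described above, assuming the flags for one second of one input have already been unpacked into a sequence of booleans where `True` means the packet was lost. The function name `count_lost_runs` is hypothetical.

```python
def count_lost_runs(lost_flags):
    """Count contiguous runs of lost packets in one second of flags.

    Assumption: lost_flags is an iterable of truthy/falsy values,
    one per packet, where truthy means "lost".
    """
    runs = 0
    previous_lost = False
    for flag in lost_flags:
        lost = bool(flag)
        # A new run starts wherever a lost packet follows a received one.
        if lost and not previous_lost:
            runs += 1
        previous_lost = lost
    return runs


# Worst case at 800 packets per second: every other packet lost.
worst_case_800 = count_lost_runs([True, False] * 400)

# One solid block of downtime counts as a single run.
single_block = count_lost_runs([False] * 100 + [True] * 300 + [False] * 225)
```

A high lost count with a low run count points to a few blocks of downtime; a run count close to the lost count points to scattered single-packet losses.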

I do like the idea of using a format that at least encapsulates the array dimensions, so the reader doesn't have to rummage elsewhere to determine the shape. It'd be nice to include the mapping from channel index to receiver id too.

Slightly leaning towards FITS just to try to avoid further proliferation of container formats? I've not looked at how efficient writers are for that yet, nor do I know what M&C would be happiest with. If we could compress it fast enough we could just throw in the entire map for later investigation, but that's probably more information than we really need for a dashboard.

shrydar commented 3 months ago

(Also note that reporting per second rather than per eight-second subobservation would require some careful bit masking when reading the packet map, as each eight seconds' worth of flags is stored as one 625-byte bitmap per input, and the one-second breaks don't land on byte boundaries.)
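
A minimal sketch of that bit masking, for the 625-packets-per-second case. One second is 625 bits, i.e. 78.125 bytes, so the slices are cut by shifting and masking one large integer rather than by slicing bytes. Two things are assumptions here and would need checking against the real packet map: that packet 0 is the least significant bit of byte 0, and that a set bit means "packet received". The function name `lost_per_second` is hypothetical.

```python
PACKETS_PER_SECOND = 625   # packets per second per input, as discussed above
SECONDS_PER_SUBFILE = 8    # 8 x 625 bits = 5000 bits = 625 bytes


def lost_per_second(bitmap, set_bit_means_received=True):
    """Return the lost-packet count for each one-second slice of a bitmap.

    Assumptions (not confirmed by the thread): LSB-first bit order within
    each byte, and the meaning of a set bit given by set_bit_means_received.
    """
    expected_bits = PACKETS_PER_SECOND * SECONDS_PER_SUBFILE
    if len(bitmap) * 8 != expected_bits:
        raise ValueError(f"expected {expected_bits // 8} bytes, got {len(bitmap)}")

    # Little-endian: byte 0 holds the lowest 8 bits, so packet i is bit i.
    bits = int.from_bytes(bitmap, "little")
    mask = (1 << PACKETS_PER_SECOND) - 1

    counts = []
    for second in range(SECONDS_PER_SUBFILE):
        # Shift the slice down to bit 0, then mask off everything above it.
        chunk = (bits >> (second * PACKETS_PER_SECOND)) & mask
        set_bits = bin(chunk).count("1")
        if set_bit_means_received:
            counts.append(PACKETS_PER_SECOND - set_bits)
        else:
            counts.append(set_bits)
    return counts


# Example: everything received except packet 625, the first packet of second 1.
# Packet 625 lives in byte 78 (625 // 8) at bit 1 (625 % 8).
example_bitmap = bytearray(b"\xff" * 625)
example_bitmap[78] &= ~(1 << 1) & 0xFF
example_counts = lost_per_second(bytes(example_bitmap))
```

Byte 78 is shared between second 0 (bit 0) and second 1 (bits 1 to 7), which is exactly the boundary problem described above.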