Support separate "off-to-the-side" applet for in-progress subscans

dnadlinger commented 7 months ago

Subscans, including on-kernel subscans, sometimes take a long time to run (e.g. many minutes). Currently, however, the results are only pushed to datasets and made available in the applets once the entire subscan has completed, i.e. once a complete point of the enclosing top-level scan has finished.

It would be nice to see the data incrementally as it comes in (to spot issues early when debugging code/system issues, and for the impatient experimentalist in general).

This shouldn't require much more than writing out the subscan data to a configurable AppendingDatasetSink in addition to the ArraySink currently employed, and emitting metadata like in TopLevelRunner._broadcast_metadata. We'll also need to make sure that the applet actually handles complete rewrites of all the scan information correctly (#387) as the subscan proceeds from point to point.
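As a rough sketch of the "write to two sinks at once" idea: the classes below are hypothetical stand-ins with nothing but a push() method, not the actual ndscan ArraySink/AppendingDatasetSink, and TeeSink is a made-up helper name. The point is only that each value is forwarded to both the in-memory sink and the live preview sink as it arrives.

```python
class ListSink:
    """Hypothetical stand-in for a sink that keeps every pushed value in memory."""

    def __init__(self):
        self.values = []

    def push(self, value):
        self.values.append(value)


class TeeSink:
    """Hypothetical helper that forwards each pushed value to several sinks."""

    def __init__(self, *sinks):
        self.sinks = sinks

    def push(self, value):
        for sink in self.sinks:
            sink.push(value)


# Plays the role of the sink that collects the complete subscan result.
final_results = ListSink()
# Plays the role of the dataset-backed sink feeding the off-to-the-side applet.
live_preview = ListSink()

channel_sink = TeeSink(final_results, live_preview)
for value in [0.1, 0.2, 0.3]:
    # Each point becomes visible to the preview as soon as it is pushed.
    channel_sink.push(value)
```

In the real code, the preview sink would be the one writing to a (configurable) dataset key, while the existing sink keeps feeding the subscan result channels unchanged.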

@JammyL This is part 1 of the RBM live analysis discussion we had on 2023-05-05 (part 2 being the option to execute online analyses master-side).

In this context, we had discussed an alternative option where result channels could take "preview"/... values, such that the entire subscan result can be updated multiple times.

However, first of all, we'd need to make sure we can efficiently modify, say, the result array for a subscan of a subscan without rewriting the entire array every time. This should be possible with sipyco.sync_struct (there are "Mod"s for modifying array indices), but is probably a largely untested code path.

Also, what would happen if the experiment is interrupted? We need to make sure never to end up with jagged arrays, as this would mean that the dataset is not saved to HDF5 at all, yet interrupted subscans would naturally lead to an "unfinished row" in the subscan result channels. This could be worked around by having a "cleanup routine" that chucks away the incomplete data, but seems annoying.
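For illustration, such a cleanup routine could look roughly like the following. This is a minimal sketch on plain nested lists; drop_incomplete_rows and points_per_row are made-up names, and how the rows would actually be stored in the result channels is left open.

```python
def drop_incomplete_rows(rows, points_per_row):
    """Return only the rows that contain a full set of subscan points.

    An interrupted subscan leaves a shorter trailing row, which would make
    the overall array jagged; such rows are discarded here.
    """
    return [row for row in rows if len(row) == points_per_row]


# Two completed subscans of three points each, and one that was interrupted.
rows = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0]]
cleaned = drop_incomplete_rows(rows, points_per_row=3)
```

The obvious downside is that the points already acquired in the interrupted row are lost.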

JammyL commented 7 months ago

> Also, what would happen if the experiment is interrupted? We need to make sure never to end up with jagged arrays, as this would mean that the dataset is not saved to HDF5 at all

In the current implementation this isn't an issue because the data is only posted to a dataset at the end of each point (subscan)? For subscans, the array has a known, fixed length. When working on the 2D plot colorbars (#329), padding missing points with np.nan worked seamlessly. Perhaps a similar approach would work here, but applied to the datasets.
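Something along these lines, as a minimal sketch of the padding idea. It uses plain lists and math.nan to stay dependency-free, and pad_row and points_per_row are made-up names; with NumPy arrays one would fill with np.nan instead.

```python
import math


def pad_row(row, points_per_row):
    """Pad a partially filled subscan row with NaN up to its known length."""
    if len(row) > points_per_row:
        raise ValueError("row is longer than the expected subscan length")
    return list(row) + [math.nan] * (points_per_row - len(row))


# A subscan of three points that was interrupted after the first one.
padded = pad_row([7.0], points_per_row=3)
```

This keeps every row the same length, so the array stays rectangular even when a subscan is interrupted, and the points acquired so far are preserved.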

dnadlinger commented 7 months ago

For the avoidance of doubt, these are my notes from the ABaQuS lab book back then, mostly 1:1. I still think the "in-progress subscan off to the side in a separate applet" is probably in the sweet spot of not complicating the design while providing the desired functionality.

OxfordIonTrapGroup / ndscan

Support separate "off-to-the-side" applet for in-progress subscans #388