bluesky / bluesky-enhancement-proposals


Support multiple open runs at once #3

Closed danielballan closed 3 years ago

danielballan commented 6 years ago

Use case: a temperature ramp (forming an outer loop) over a set of samples (forming an inner loop), wherein each sample should get one run and the measurements of the runs are interleaved in time.

This would involve an invasive change to most callbacks, including user-defined ones.

CJ-Wright commented 6 years ago

Why can't this use case be one giant run engine call?

danielballan commented 6 years ago

It is.

tacaswell commented 6 years ago

I think the callback refactor can be isolated to writing a callback-factory callback, which creates a fresh instance of each stateful callback on seeing a new start document and then dispatches every subsequent document to the correct instance. As using this feature would be strictly opt-in, I think managing this change would not be too bad.
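A minimal sketch of that dispatching idea, written in plain Python with no bluesky imports. The names `PerRunDispatcher`, `EventCounter`, and `make_counter` are hypothetical and only for illustration. The sketch assumes the usual document linkage: descriptor and stop documents carry `run_start`, and event documents carry `descriptor`.

```python
class PerRunDispatcher:
    """Route each document to a callback instance dedicated to its run."""

    def __init__(self, factory):
        self._factory = factory        # called once per 'start' document
        self._callbacks = {}           # run start uid -> callback instance
        self._descriptor_to_run = {}   # descriptor uid -> run start uid

    def __call__(self, name, doc):
        if name == "start":
            run_uid = doc["uid"]
            self._callbacks[run_uid] = self._factory()
        elif name == "descriptor":
            run_uid = doc["run_start"]
            self._descriptor_to_run[doc["uid"]] = run_uid
        elif name == "event":
            run_uid = self._descriptor_to_run[doc["descriptor"]]
        elif name == "stop":
            run_uid = doc["run_start"]
        else:
            return  # other document types are ignored in this sketch
        self._callbacks[run_uid](name, doc)
        if name == "stop":
            # The run is finished, so drop the state held for it.
            del self._callbacks[run_uid]
            self._descriptor_to_run = {
                d: r for d, r in self._descriptor_to_run.items() if r != run_uid
            }


class EventCounter:
    """Toy stateful callback that assumes it only ever sees one run."""

    def __init__(self):
        self.run_uid = None
        self.n_events = 0

    def __call__(self, name, doc):
        if name == "start":
            self.run_uid = doc["uid"]
        elif name == "event":
            self.n_events += 1


instances = []


def make_counter():
    counter = EventCounter()
    instances.append(counter)  # kept only so the result can be inspected
    return counter


dispatcher = PerRunDispatcher(make_counter)

# Two runs whose documents are interleaved in time.
interleaved = [
    ("start", {"uid": "run-a"}),
    ("start", {"uid": "run-b"}),
    ("descriptor", {"uid": "desc-a", "run_start": "run-a"}),
    ("descriptor", {"uid": "desc-b", "run_start": "run-b"}),
    ("event", {"descriptor": "desc-a"}),
    ("event", {"descriptor": "desc-b"}),
    ("event", {"descriptor": "desc-a"}),
    ("stop", {"run_start": "run-b"}),
    ("stop", {"run_start": "run-a"}),
]
for name, doc in interleaved:
    dispatcher(name, doc)
```

Each `EventCounter` sees a clean, single-run stream, so existing stateful callbacks would not need to change; only the subscription point does.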

CJ-Wright commented 6 years ago

Overall I am against this change. Not only does it break a couple of things (including callbacks), but it fails to accurately represent the experiment. An experiment that is done in one large run engine call is different from a few separate run engine calls. For the use case given above, we won't know the difference between a setup where one sample is loaded and the temperature is changed and the multi-sample experiment, since both emit a single run start per experiment. However, these experiments could differ in the results they generate, as the procedures were not done in the same manner. I'd prefer that, if there was an outer loop over the temperature and an inner one over the sample, the results (document stream) give back just that. There are other means to separate the samples in a multi-sample experiment (e.g. providing a map between motor position and sample metadata in the start document, or providing separate event streams for each sample with a stream-to-sample map in the start document, among other methods).

I'd rather that we try other options (or at least discuss why they are untenable) first and see their breaking points before we make this change.

tacaswell commented 6 years ago

All of those suggestions impose extra layers of complexity in understanding the contents of the start document. For example, making the start : sample mapping many-to-many makes searching for a given sample significantly more complex, due to the extra layer of nesting.

Consider the case where you have a sample plate of 6 samples, run 3 temperature ramps over all 6, and then take a couple of stand-alone measurements of each of them. A query that should be easy is "give me the data from every measurement of sample 3". Letting each temperature ramp generate 6 start documents makes this easy, as the de-interlacing is taken care of by the structure of the documents and DataBroker. If you have (sometimes) saved them interlaced, all of that de-interlacing needs to happen in the client layer.
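To make the query concrete, here is a rough sketch using plain dictionaries as stand-ins for start documents. It does not use the DataBroker API; the `sample` and `plan` metadata keys, the uid strings, and the helper `runs_for_sample` are all assumptions made for illustration.

```python
# Hypothetical start documents: 3 temperature ramps over 6 samples with one
# run per sample per ramp, plus 2 stand-alone measurements of each sample.
start_docs = []
for ramp in range(3):
    for sample in range(1, 7):
        start_docs.append(
            {"uid": f"ramp{ramp}-sample{sample}", "sample": sample, "plan": "ramp"}
        )
for shot in range(2):
    for sample in range(1, 7):
        start_docs.append(
            {"uid": f"single{shot}-sample{sample}", "sample": sample, "plan": "count"}
        )


def runs_for_sample(starts, sample):
    """Return the uids of every run whose start document names this sample."""
    return [doc["uid"] for doc in starts if doc.get("sample") == sample]


sample_3_runs = runs_for_sample(start_docs, 3)
```

With one start document per sample, the search is a flat filter on a single metadata key. If the ramps had been saved as single interleaved runs, the same question would need per-event filtering in client code.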

If there really is concern that the number of samples run in the inner loop of a temperature ramp is significant then that can be handled by flagging it in the start documents or by just not using this feature.

It is not obvious to me that the details of the nesting are relevant in all cases (for example, an aging experiment, or one where the outer loop is an energy scan), so even if there are cases where you do want to record the nesting as described above, that is no reason to rule out separate runs in other cases.

There are some cases, such as the sample plate being from different user groups (and hence on different proposals), that will mandate having multiple start documents open.

There are some interesting wrinkles here, like do we broadcast monitors to all open runs?

prjemian commented 6 years ago

As you said,

On 10/17/2017 12:39 PM, Thomas A Caswell wrote:

If there really is concern that the number of samples run in the inner loop of a temperature ramp is significant then that can be handled by flagging it in the start documents or by just not using this feature.

wouldn't that coordination be aided by adding a metadata identifier common to the set(s)?
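One way such a shared identifier might look, as a sketch only: the metadata key `group_uid` and the helper `start_metadata_for_set` are hypothetical and not an established bluesky convention.

```python
import uuid


def start_metadata_for_set(samples, group_key="group_uid"):
    """Build per-run metadata dicts that all share one set identifier."""
    group_uid = str(uuid.uuid4())  # generated once for the whole set
    return [{"sample": sample, group_key: group_uid} for sample in samples]


# One metadata dict per sample on a six-sample plate.
metadata = start_metadata_for_set(range(1, 7))
```

Each dict would be merged into the start document of the corresponding run, so the runs from one temperature ramp could later be found together by searching on the shared value.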

danielballan commented 3 years ago

Support for multi-run plans was added in bluesky 1.6.0 and documented at https://blueskyproject.io/bluesky/multi_run_plans.html. The built-in callbacks (LiveTable, LivePlot) have undefined (likely broken) behavior if subscribed directly to a sequence of documents with interleaved runs, but the event_model.RunRouter can be inserted to ensure that each instance only ever sees one Run.