bluesky / databroker

Unified API pulling data from multiple sources
https://blueskyproject.io/databroker
BSD 3-Clause "New" or "Revised" License

Question: Is it possible to query on the descriptor? #210

Open CJ-Wright opened 7 years ago

CJ-Wright commented 7 years ago

If we move to a system that has a separate stream for dark data, it would be helpful to query for data that has a dark data stream.

danielballan commented 7 years ago

Not from the databroker API, but you can greedily load a bunch of Headers into RAM and then do finer filtering in Python.
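The load-then-filter approach could look roughly like the sketch below. It assumes each Header exposes a `descriptors` list of dict-like documents and that the stream name is stored under the descriptor's `name` key. The helper `headers_with_stream` and the stream name `"dark"` are hypothetical, and the `SimpleNamespace` objects only stand in for real Headers so the example runs on its own.

```python
from types import SimpleNamespace


def headers_with_stream(headers, stream_name="dark"):
    """Keep only the headers that have a descriptor named stream_name.

    Hypothetical helper, not part of databroker. Assumes each header has a
    `descriptors` attribute holding dict-like descriptors with a "name" key.
    """
    selected = []
    for header in headers:
        # Collect the stream names declared by this header's descriptors.
        names = {descriptor.get("name") for descriptor in header.descriptors}
        if stream_name in names:
            selected.append(header)
    return selected


# Stand-in headers so the sketch runs without a databroker installation.
light_only = SimpleNamespace(descriptors=[{"name": "primary"}])
light_and_dark = SimpleNamespace(
    descriptors=[{"name": "primary"}, {"name": "dark"}]
)

with_darks = headers_with_stream([light_only, light_and_dark])
```

With real data, `headers` would be the result of a coarse databroker search. The cost is that every candidate Header and its descriptors must be loaded before any filtering happens.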

In theory we could expose this in the databroker API, but that feature request would have to get in line behind serious performance concerns that richer indexing would worsen. Not likely to happen in 2017.

tacaswell commented 7 years ago

A better ask is the ability to have more than one run open at a time in bluesky. That way you can put the dark frames in their own run while holding the data collection run open.

CJ-Wright commented 7 years ago

I thought that was going to be a bit of an anti-pattern, but our use case is taking darks without emitting a stop document for the lights, so whatever gets us there makes me reasonably happy.

stuartcampbell commented 7 years ago

I might be missing something, but why do we need to query the descriptor for this? Wouldn't you just record in the start document that you are taking a dark frame?

Or do we always measure the dark frame as part of the main data collection, as in the same data stream?

CJ-Wright commented 7 years ago

The setup may be like this:

  1. User sets a timeout for darks, or some automated analysis decides that we need to take a new dark
  2. User runs experiments
  3. For some short experiments new darks may not be needed
  4. For others we may need new darks
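The timeout case in step 1 can be sketched as a simple check. This is only an illustration under stated assumptions: `needs_new_dark`, its parameters, and the use of seconds since the epoch are all hypothetical, and the automated-analysis trigger is not modeled.

```python
import time


def needs_new_dark(last_dark_time, dark_timeout, now=None):
    """Return True if a new dark frame should be taken.

    Hypothetical helper. `last_dark_time` is the epoch time (seconds) of the
    most recent dark, or None if no dark has been taken yet. `dark_timeout`
    is the user-chosen maximum age of a dark, in seconds.
    """
    if last_dark_time is None:
        return True
    if now is None:
        now = time.time()
    # A dark is stale once it is older than the timeout.
    return (now - last_dark_time) > dark_timeout
```

A short experiment that finishes within `dark_timeout` of the last dark never triggers a new one, which is why the start document cannot say in advance whether a dark will be taken.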

We don't know a priori whether we took a dark (and thus can't put something into the start document). We could potentially put the query information into the stop document? This issue is now discussing two questions:

  1. What is the best way to take darks without splitting the light data into multiple headers?
  2. How should that data be noted in the metadata for searching?