EliotGann opened this issue 5 years ago
I would like to see a way to resolve this as well. We've gone back and forth about whether this should be done at the ophyd or bluesky level.
Our usual approach for this use case (ring current) is to monitor it. This produces a series of events with its own time base, with a time spacing appropriate to the variability of that specific signal. Downstream, either in live analysis/visualization code or in code operating on the saved data, you can resample the streams (averaging, first-and-last, etc.) based on timestamps in whatever way is appropriate to the scientific application.
See “We’ll Cross The Streams” under the tutorials directory at try.nsls2.bnl.gov for an example.
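As a toy illustration of that kind of timestamp-based resampling with pandas (all of the data and column names below are invented):

import pandas as pd

# Invented stand-ins for two event streams: 'primary' detector events at
# t = 0, 10, 20 s and a faster ring-current monitor stream.
primary_times = [0.0, 10.0, 20.0]
monitor = pd.DataFrame({'time': [0.5, 4.0, 9.5, 14.0, 19.5],
                        'ring_current': [400.0, 399.8, 399.5, 399.2, 398.9]})

# Average the monitor readings that fall between successive primary events.
windows = pd.cut(monitor['time'], bins=primary_times)
window_means = monitor.groupby(windows, observed=True)['ring_current'].mean()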
The first two (a full-fidelity monitor, and a snapshot at the beginning and the end via baseline devices) are both already available.
A Device that averages would also fit the Flyer interface (for the less passive-aggressive version of complete), but as currently implemented you would not get those readings live.
Wonder if something like DerivedSignal would work here?
import threading
import time

from ophyd import Signal
from ophyd.utils import ReadOnlyError


class ProcSignal(Signal):
    def __init__(self, sibling_name, time_window, func, **kwargs):
        super().__init__(**kwargs)
        self._lock = threading.RLock()
        self._buffer = []
        self._func = func
        self.time_window = time_window
        # accumulate every update of the sibling signal
        getattr(self.parent, sibling_name).subscribe(self._aggregate)
        self._th = threading.Thread(target=self._thread_worker, daemon=True)
        self._th.start()

    def _aggregate(self, value, **kwargs):
        with self._lock:
            self._buffer.append(value)

    def _thread_worker(self):
        while True:
            time.sleep(self.time_window)
            with self._lock:
                old_buffer = self._buffer
                self._buffer = []
            v = self._func(old_buffer)
            # jump to the parent class, bypassing our read-only put()
            Signal.put(self, v)

    def put(self, *args, **kwargs):
        raise ReadOnlyError("ProcSignal is read-only")
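A hypothetical wiring sketch (the device class and PV name are made up, and this assumes ProcSignal is used as a sibling component so that self.parent resolves at construction time):

import numpy as np
from ophyd import Device, Component as Cpt, EpicsSignal


class RingCurrent(Device):
    # 'raw' is the fast-updating PV; 'avg' re-emits its 1 s mean.
    raw = Cpt(EpicsSignal, 'XF:RING{DCCT}I-Fake')
    avg = Cpt(ProcSignal, sibling_name='raw', time_window=1.0, func=np.mean)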
That said, if multiple timescales of a PV are useful to bluesky, they are probably useful in other contexts, and it would be better to have an IOC do this averaging so that we could simply monitor the averaged PV instead.
+1 ... I agree with Tom and Dan on this
@EliotGann See the section of the bluesky tutorial on supplemental data.
The short of it is:
# Configure all plans to take supplemental readings in addition to 'primary' data stream.
sd.baseline.extend(list_of_motors_to_snapshot_at_start_and_finish)
sd.monitors.extend(list_of_signals_like_ring_current_to_monitor_asynchronously)
This will create a 'baseline' stream and then one additional stream for each monitored signal, with a name derived from the name of that signal.
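For context, `sd` here is the SupplementalData preprocessor covered in the tutorial; a minimal setup sketch:

from bluesky import RunEngine
from bluesky.preprocessors import SupplementalData

RE = RunEngine({})
sd = SupplementalData()
RE.preprocessors.append(sd)  # sd now wraps every plan that RE executes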
To quote your email to @mrakitin:
change the name of the suitcased files to remove primary etc etc stuff from the name
This is why you might want to retain the word 'primary' in the name---to distinguish from baseline and monitor data streams.
We have some versions of the averaged signal in our local device repo that are shockingly similar to @tacaswell's suggestion.
https://github.com/pcdshub/pcdsdevices/blob/master/pcdsdevices/signal.py
In practice it isn't so useful because you have to pick at class creation time which signal you want to average.
Thanks! The baseline and monitor streams have basically filled my need at this point, although I think it puts a bit of a burden on rewriting analysis code. Given the prevalence of intensity-monitor normalization, it would make a lot of sense to me experimentally to have a "monitor_average" type of stream which would output at the end of a scan, just like the second baseline reading, but with one number that could be used for normalization. In my case, the ideal output is an average taken only while the shutter is open, which is a subset of the scan time. I guess the proper treatment for that case is to read the shutter-status PV as a second monitor, and then cross that stream with the Izero monitor to average only the values recorded while the shutter was open?
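After the fact, that crossing might look something like this in pandas (the data and column names are invented):

import pandas as pd

# Hypothetical monitor streams with epoch-second 'time' columns.
shutter = pd.DataFrame({'time': [0.0, 2.0, 8.0],
                        'open': [0, 1, 0]})  # 1 = shutter open
izero = pd.DataFrame({'time': [1.0, 3.0, 5.0, 9.0],
                      'izero': [0.1, 5.2, 5.1, 0.2]})

# Tag each Izero reading with the shutter state in effect at that instant,
# then average only the shutter-open readings.
tagged = pd.merge_asof(izero, shutter, on='time')
open_average = tagged.loc[tagged['open'] == 1, 'izero'].mean()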
This is a great case study for how to handle situations like this.
What you described is a data reduction process. It is not obvious that the data reduction provides information that will influence subsequent collection steps. Use Bluesky and ophyd to collect the raw data you need, then apply the steps you described as part of data reduction. Your algorithm for reduction does not have to be burdened with trying to blend its results into collection. With that, you have a much cleaner, sharper interface between collection and processing.
I don't understand how it is just data reduction. Flux is as fundamental to measurements as time, and if such values were available they could quite naturally feed back into collection. As an example, in the past there have been features on beamlines for "expose for X time, or expose for Y flux". Any experiment should want at least one Izero-type value as a single number associated with each exposure, so it just seems strange not to have it immediately accessible with the rest of the data.
The difference is that what you described is computed from raw signals.
Thinking through this further, you could construct an ophyd Device that takes, in its constructor, the existing Signals needed to make the computation. Then, in its trigger() (or is it read()?) method, do the computation. The result would be a (non-EPICS) Signal of that device. You would read this Device along with your other Devices and Signals.
I would look at the synthetic Gaussian signal (ophyd.sim.SynGauss) for an example.
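A minimal sketch of that idea (the class name, source_signals, and func are all made up; it reuses ophyd's DeviceStatus as in the code above):

from ophyd import Device, Component as Cpt, Signal, DeviceStatus


class ComputedSignalDevice(Device):
    """Reads existing Signals on trigger and exposes a computed result."""

    # soft (non-EPICS) Signal that holds the computed value
    result = Cpt(Signal, value=0.0, kind='hinted')

    def __init__(self, *args, source_signals=(), func=None, **kwargs):
        super().__init__(*args, **kwargs)
        self._source_signals = list(source_signals)
        self._func = func

    def trigger(self):
        # Read the raw inputs and store the reduced value.
        values = [sig.get() for sig in self._source_signals]
        self.result.put(self._func(*values))
        status = DeviceStatus(self)
        status._finished()  # the computation is synchronous
        return status


# e.g. a normalization factor from two existing signals i0 and diode:
# norm = ComputedSignalDevice(name='norm', source_signals=[i0, diode],
#                             func=lambda a, b: b / a)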
A related class that came up while thinking about this: something that, when triggered, monitors a PV and waits for it to fire N times.
from ophyd import Device, Component as Cpt, EpicsSignal, Signal, DeviceStatus


class AccumulateSignal(Device):
    target = Cpt(EpicsSignal, 'thermo:I')
    window_size = Cpt(Signal, value=5)
    last_read = Cpt(Signal, value=[])

    def trigger(self):
        dbuffer = []
        count = 0
        target_N = self.window_size.get()
        status = DeviceStatus(self)

        def accumulating_callback(value, **kwargs):
            # Defensive: if something else already finished the status, detach.
            if status.done:
                self.target.clear_sub(accumulating_callback)
                return
            nonlocal count
            dbuffer.append(value)
            count += 1
            if count >= target_N:
                self.last_read.put(dbuffer[:target_N])
                self.target.clear_sub(accumulating_callback)
                status._finished()

        self.target.subscribe(accumulating_callback, run=False)
        return status

    def unstage(self):
        self.last_read.put([])
        return super().unstage()


ps = AccumulateSignal(name='ps')
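A hypothetical usage sketch with bluesky (assuming the thermo:I PV above exists and updates on its own):

from bluesky import RunEngine
from bluesky.plans import count

RE = RunEngine({})
ps.window_size.put(10)  # accumulate 10 updates of 'target' per trigger
RE(count([ps]))         # each event's last_read holds the 10-sample buffer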
Hello, I was pointed to this thread in a discussion with @tacaswell while trying to get started with bluesky and to integrate real detectors into my bluesky lab setup. I really love the flexibility and capabilities of bluesky, but unfortunately I had quite a steep learning curve with some frustrating setbacks. While the tutorials available online nicely and didactically demonstrate the various capabilities of bluesky and the hardware abstraction layer, I felt I was missing a number of important steps for understanding how to integrate real hardware: most of the tutorials rely on a preset simulation configuration, and I did not find the necessary explanation of its setup and configuration, nor of the steps to adapt it for real hardware.

As an example, a tutorial on how to write trigger functions in a device class, as discussed here, would be really helpful. Another example would be an explanation of how the "kind" keywords in a Component affect e.g. reading, scanning, and plotting, and how the PV prefix can be used for class instances. I was also missing further information on the databroker: the tutorials show how to use "temp file" databrokers, and at the same time discourage using them, but I did not easily find further information on what other options are available.
Thanks for this specific feedback, @rettigl. It is very helpful. I am in the process of rewriting the ophyd documentation, with an emphasis on how-to guides covering topics such as those you mention. I would be grateful if you would review that work when it is ready 1–2 weeks from now. You are a good representative of its intended audience.
@danielballan Sure I will be happy to comment.
It would be very useful for metadata, including intensity-type signals, slewing motors, or general signals/motor positions not directly connected to the scan, to be recorded at a sub-exposure rate, rather than just as a snapshot at the beginning or end of an exposure. As an example, in a 30-second exposure, taking a snapshot of the ring current before or after does not adequately capture what may have happened during the detector integration. If a signal changes during an area-detector integration, the proper treatment would be to average all of those readings (or potentially integrate them, depending on the type of signal). Options to output the full array of data points, one value taken at the start, one value at the end, or an average value should all be standard. I'm not sure whether this belongs in ophyd or bluesky, but I was advised to put it here.