bluesky / ophyd

hardware abstraction in Python with an emphasis on EPICS
https://blueskyproject.io/ophyd
BSD 3-Clause "New" or "Revised" License

option for signals to be averaged during an acquisition #706

Open EliotGann opened 5 years ago

EliotGann commented 5 years ago

It would be very useful for metadata, including intensity-type signals, slewing motors, or general signals/motor positions not directly connected to the scan, to be recorded at a sub-exposure rate rather than as just a snapshot at the beginning or end of an exposure. As an example, during a 30 second exposure, taking a snapshot of the ring current before or after does not adequately capture what may have happened during the detector integration. If a signal is changing during an areadetector integration, the proper treatment is to average all of these readings (or potentially integrate them, depending on the type of signal). Options to output a full array of datapoints, one value taken at the start, one value taken at the end, or an average value should be standard. I'm not sure whether this belongs in ophyd or bluesky, but I was advised to put it here.
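To make the concern concrete, a toy example with made-up numbers: for a signal drifting during a 30 second exposure, a snapshot taken at either end differs noticeably from the average that actually weighted the measurement:

```python
# Hypothetical ring current sampled once per second during a 30 s exposure,
# decaying linearly from 400 mA to 370 mA:
samples = [400 - t for t in range(31)]

snapshot_start = samples[0]              # 400
snapshot_end = samples[-1]               # 370
average = sum(samples) / len(samples)    # 385.0

print(snapshot_start, snapshot_end, average)
```

Neither snapshot matches the 385 mA that effectively normalized the exposure, which is the gap this issue is about.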

teddyrendahl commented 5 years ago

I would like to see a way to resolve this as well. We've gone back and forth about whether this should be done at the ophyd or bluesky level.

danielballan commented 5 years ago

Our usual approach for this use case, ring current, is to monitor it. This produces a series of events with its own time base, with a time spacing appropriate to the variability of that specific signal. Downstream, either in live analysis/visualization code or in code operating on the saved data, you can resample the streams (averaging, first and last, etc.) based on timestamps in whatever way suits the scientific application.

See “We’ll Cross The Streams” under the tutorials directory at try.nsls2.bnl.gov for an example.
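As a plain-Python sketch of that downstream resampling (the timestamps and values here are made up; in practice they would come from the saved monitor and primary event streams):

```python
def resample_mean(monitor_events, t_start, t_stop):
    """Average the (timestamp, value) monitor events that fall inside
    one exposure window [t_start, t_stop]."""
    values = [v for t, v in monitor_events if t_start <= t <= t_stop]
    return sum(values) / len(values)

# Hypothetical ring-current readings (mA), on their own time base:
ring_current = [(0.0, 400.1), (10.0, 399.8), (20.0, 399.5), (31.0, 399.2)]

# Average over a 30 s exposure that started at t=0:
avg = round(resample_mean(ring_current, 0.0, 30.0), 3)
print(avg)  # → 399.8
```

The same window-selection idea supports "first", "last", or integration instead of the mean, without any change to how the data were collected.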

tacaswell commented 5 years ago

The first two (a full-fidelity monitor, and a snapshot at the beginning and the end via baseline devices) are both already available.

A Device that averages would also fit the Flyer interface (for the less passive-aggressive version of complete), but as currently implemented you would not get those readings live.

I wonder if something like DerivedSignal would work here?


import threading
import time

from ophyd import Signal


class ProcSignal(Signal):
    def __init__(self, sibling_name, time_window, func, **kwargs):
        super().__init__(**kwargs)
        self._lock = threading.RLock()
        self._buffer = []
        self._func = func
        self.time_window = time_window
        getattr(self.parent, sibling_name).subscribe(self._aggregate)
        self._th = threading.Thread(target=self._thread_worker, daemon=True)
        self._th.start()

    def _aggregate(self, value, **kwargs):
        with self._lock:
            self._buffer.append(value)

    def _thread_worker(self):
        while True:
            time.sleep(self.time_window)
            with self._lock:
                old_buffer = self._buffer
                self._buffer = []
            v = self._func(old_buffer)
            # bypass our own read-only put() and jump to the parent class
            Signal.put(self, v)

    def put(self, *args, **kwargs):
        raise Exception("no, read only")

That said, if multiple timescales of a PV are useful to bluesky, they are probably useful in other contexts and it would be better to have an IOC that did this averaging that we could just monitor instead.

prjemian commented 5 years ago

+1 ... I agree with Tom and Dan on this


danielballan commented 5 years ago

The first two (a full-fidelity monitor, and a snapshot at the beginning and the end via baseline devices) are both already available.

@EliotGann See the section of the bluesky tutorial on supplemental data.

The short of it is:

# Configure all plans to take supplemental readings in addition to 'primary' data stream.
sd.baseline.extend(list_of_motors_to_snapshot_at_start_and_finish)
sd.monitors.extend(list_of_signals_like_ring_current_to_monitor_asynchronously)

This will create a 'baseline' stream and then one additional stream for each monitored signal with a name derived from the name of that signal.

To quote your email to @mrakitin:

change the name of the suitcased files to remove primary etc etc stuff from the name

This is why you might want to retain the word 'primary' in the name: it distinguishes the primary stream from the baseline and monitor data streams.

teddyrendahl commented 5 years ago

We have some versions of the averaged signal in our local device repo that are shockingly similar to @tacaswell's suggestion.

https://github.com/pcdshub/pcdsdevices/blob/master/pcdsdevices/signal.py

In practice it isn't so useful, because you have to pick which signal you want to average at class-creation time.

EliotGann commented 5 years ago

Thanks! The baseline and monitor streams have basically filled my need at this point, although I think it puts a bit of a burden on rewriting analysis code. With the prevalence of intensity monitor normalization it would make a lot of sense to me experimentally to have a "monitor_average" type of stream which would output at the end of a scan just like the second baseline, but with one number which could be used for normalizing. In my case, the ideal output is an average only while the shutter is open, which is a subset of the scan time. I guess the proper treatment for that case is to read the PV of the shutter status as a second monitor, and then cross that stream with the Izero monitor to average only the values while the shutter is open?
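That cross-the-streams treatment can be sketched in plain Python (the stream shapes and numbers below are hypothetical; real data would come from the saved monitor documents):

```python
def open_shutter_mean(izero_events, shutter_events, open_value=1):
    """Average I0 readings taken while the shutter was open.

    Both inputs are time-sorted lists of (timestamp, value); the
    shutter stream records 1 (open) / 0 (closed) transitions.
    """
    kept = []
    state = 0
    transitions = iter(shutter_events)
    current = next(transitions, None)
    for t, v in izero_events:
        # advance the shutter state up to time t
        while current is not None and current[0] <= t:
            state = current[1]
            current = next(transitions, None)
        if state == open_value:
            kept.append(v)
    return sum(kept) / len(kept)

izero = [(1, 5.0), (2, 6.0), (3, 100.0), (4, 102.0), (5, 7.0)]
shutter = [(2.5, 1), (4.5, 0)]            # opens at t=2.5, closes at t=4.5
print(open_shutter_mean(izero, shutter))  # → 101.0
```

Only the two readings taken while the shutter was open contribute, which is exactly the normalization value described above.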

prjemian commented 5 years ago

This is a great case study for how to handle situations like this.

What you described is a data reduction process. It is not obvious that the data reduction provides information that will influence subsequent collection steps. Use Bluesky and ophyd to collect the raw data you need, then apply the steps you described as part of data reduction. Your algorithm for reduction does not have to be burdened with trying to blend its results into collection. With that, you have a much cleaner, sharper interface between collection and processing.

Pete


EliotGann commented 5 years ago

I don't understand how it is just data reduction. Flux is as fundamental to a measurement as time, and if such values were available they could feed back into collection quite naturally. As an example, beamlines have in the past offered features like "expose for X time, or expose for Y flux". Any experiment should want at least one Izero-type value, a single number associated with each exposure, so it seems strange not to have it immediately accessible with the rest of the data.

prjemian commented 5 years ago

The difference is that what you described is computed from raw signals.


prjemian commented 5 years ago

Thinking through this further, you could construct an ophyd Device that takes, in its constructor, the existing Signals needed to make the computation. Then, in its trigger() (or possibly read()) method, do the computation. The result would be a (non-EPICS) Signal on that device. You would read this Device along with your other Devices and Signals.

I would look at the synthetic Gaussian signal for an example.
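A framework-free sketch of that shape (DerivedReading and FakeSignal are made-up names for illustration; a real implementation would subclass ophyd.Device, hold the result in a soft Signal component, and return a status object from trigger()):

```python
class DerivedReading:
    """Take existing readable signals in the constructor, compute in
    trigger(), and expose the result as a soft (non-EPICS) value."""

    def __init__(self, name, signals, func):
        self.name = name
        self._signals = signals      # objects with a .get() method
        self._func = func            # combines the raw readings
        self.result = None

    def trigger(self):
        # compute from the raw signals at trigger time
        self.result = self._func(*(s.get() for s in self._signals))

    def read(self):
        return {self.name: {"value": self.result}}


class FakeSignal:                    # stand-in for an EpicsSignal
    def __init__(self, value):
        self._value = value

    def get(self):
        return self._value


flux = DerivedReading("flux", [FakeSignal(3.0), FakeSignal(2.0)],
                      lambda i0, t: i0 / t)
flux.trigger()
print(flux.read())  # → {'flux': {'value': 1.5}}
```

Reading this object alongside the real detectors puts the computed number in the primary event stream, which is what the request above asks for.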


tacaswell commented 5 years ago

A related class that came up while thinking about this: something that monitors a PV when triggered and waits for it to update N times.


from ophyd import Device, Component as Cpt, EpicsSignal, Signal, DeviceStatus


class AccumulateSignal(Device):
    target = Cpt(EpicsSignal, 'thermo:I')
    window_size = Cpt(Signal, value=5)
    last_read = Cpt(Signal, value=[])

    def trigger(self):
        dbuffer = []
        count = 0
        target_N = self.window_size.get()
        status = DeviceStatus(self)

        def accumulating_callback(value, **kwargs):
            nonlocal count
            if status.done:
                # the acquisition was finished elsewhere; stop listening
                self.target.clear_sub(accumulating_callback)
                return

            dbuffer.append(value)
            count += 1

            if count >= target_N:
                self.last_read.put(dbuffer[:target_N])
                self.target.clear_sub(accumulating_callback)
                status._finished()

        self.target.subscribe(accumulating_callback, run=False)
        return status

    def unstage(self):
        self.last_read.put([])
        return super().unstage()


ps = AccumulateSignal(name='ps')

rettigl commented 3 years ago

Hello, I was pointed to this thread in a discussion with @tacaswell while trying to get started with bluesky and to integrate real detectors into my bluesky lab setup. I really love the flexibility and capabilities of bluesky, but unfortunately I had quite a steep learning curve with some frustrating setbacks. While the tutorials available online nicely and didactically demonstrate the various capabilities of bluesky and the hardware abstraction layer, I was missing a number of important steps needed to understand how to integrate real hardware: most of the tutorials rely on a preset simulation configuration, and I did not find the necessary explanation of its setup and configuration, nor of the steps to adapt it for real hardware.

As an example, a tutorial on how to write trigger functions in a device class, as discussed here, would be really helpful. Another example would be an explanation of how, in this context, the "kind" keywords in a Component affect e.g. reading, scanning, and plotting, and how the PV prefix can be used for class instances.

I was also missing further information on the databroker. The tutorials show how to use "temp file" databrokers while at the same time discouraging their use, but I did not easily find further information on what other options are available.

danielballan commented 3 years ago

Thanks for this specific feedback, @rettigl. It is very helpful. I am in the process of rewriting the ophyd documentation, with an emphasis on how-to guides covering topics such as those you mention. I would be grateful if you would review that work when it is ready 1–2 weeks from now. You are a good representative of its intended audience.

rettigl commented 3 years ago

@danielballan Sure, I will be happy to comment.