bluesky / ophyd

hardware abstraction in Python with an emphasis on EPICS
https://blueskyproject.io/ophyd
BSD 3-Clause "New" or "Revised" License
48 stars 77 forks source link

Area detector only sets variables on stage #736

Open CJ-Wright opened 5 years ago

CJ-Wright commented 5 years ago

It seems that Area detector ophyd objects only set certain variables (like the number of images) when the detector is staged and returns them to their previous values upon unstage. This seems to cause a lot of confusion for users, beamline staff, and myself. (see https://github.com/xpdAcq/mission-control/issues/118)

Overall this seems to be a bit of an anti-pattern, since ophyd has tried to keep parity with epics and CS studio (see the discussion around putting the cryostat lookup tables into an ophyd object). From the user's perspective they run a command to change the area detector's configuration and it does nothing on the CS studio screen making them think that somehow there is a disconnect between something in the ipython session and the actual detector. Additionally this makes these variable quite unusual compared to many of the other ophyd devices, where a given variable is set and the action is performed.

At the very least documentation of this particular case would be useful since it seems not obvious the detector would behave this way.

danielballan commented 5 years ago

Per discussion in our Monday meeting, it looks like someone added num_images to stage_sigs and was confused when the num_images was set during staging.

CJ-Wright commented 5 years ago

I think this was coming from some confusion about how the tiff plugin was working see: https://github.com/bluesky/ophyd/blob/5942d63fef5d10eab1769143eab23c302746333e/ophyd/areadetector/filestore_mixins.py#L597-L611

tacaswell commented 5 years ago

This is the squashing tiff plugin which requires coordinating several plugins to get the desired behavior.

sbillinge commented 5 years ago

I wonder if we should revisit this. I think the Ophyd behavior is causing pain and anguish for users because it seems to be doing magic and behaving illogically. Here is an example.

result is panic and confusion because she is sure she set the exposure to 0.1, and she can check the python level and it says that, but CSS is giving weird different behavior.

This is not a hypothetical, we have been seeing multiple issues with user confusion causing anxiety.

Is there a very strong rationale for not sending staging calls to EPICS when they are asked for, rather than when scan is run? If so we can try and mitigate with some text messages explaining it or sthg, but if there is not a strong rationale, I think it would be much better to have not this magik ?

danielballan commented 5 years ago

Staging is a good fit for things like switching the detector into capture mode and back out again. I agree it is not a good fit for setting exposure time. I don't think any other beamlines are configured with behavior like that. Does anyone understand why the behavior was introduced in the first place, what problem it was trying to solve?

I don't see a reference to 'acquire' or 'exposure' in the code snippet from @CJ-Wright above. If that code is responsible for setting the acquire time during staging, I need help understanding how.

sbillinge commented 5 years ago

Sorry Dan, I used exposure and acquire as "proxy" words for my story, the actual things that are set have different names. I am sorry if that caused confusion.

S

On Thu, Jul 11, 2019 at 10:02 AM Dan Allan notifications@github.com wrote:

Staging is a good fit for things like switching the detector into capture mode and back out again. I agree it is not a good fit for setting exposure time. I don't think any other beamlines are configured with behavior like that. Does anyone understand why the behavior was introduced in the first place, what problem it was trying to solve?

I don't see a reference to 'acquire' or 'exposure' in the code snippet from @CJ-Wright https://github.com/CJ-Wright above. If that code is responsible for setting the acquire time during staging, I need help understanding how.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/bluesky/ophyd/issues/736?email_source=notifications&email_token=ABAOWUIBOOS4SRLY6HQXDLLP644PLA5CNFSM4HYLMPW2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODZWZQUA#issuecomment-510498896, or mute the thread https://github.com/notifications/unsubscribe-auth/ABAOWUOFD72FAPCHCS6EQ6DP644PLANCNFSM4HYLMPWQ .

CJ-Wright commented 5 years ago

@danielballan images_per_set is part of our total image taking time computation, and so is set in conjunction with the exposure (or acquire time) to meet the user's need.

danielballan commented 5 years ago

No problem. @tacaswell is right that in order to satisfy the “TIFF squashing” requirement we have to ensure that configuration is in a coherent state, satisfying certain constraints. Currently we do it by “fixing” the configuration on stage. Would it be better to raise a clear error if the user tries to run with an incoherent set of parameters?

sbillinge commented 5 years ago

my gut feeling is that raising the error (and preventing execution) with the incoherent parameters would be less confusing (as long as the error messages were clear enough).

S

On Thu, Jul 11, 2019 at 12:48 PM Dan Allan notifications@github.com wrote:

Reopened #736 https://github.com/bluesky/ophyd/issues/736.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/bluesky/ophyd/issues/736?email_source=notifications&email_token=ABAOWUPKTDHLGJOIJWCSZVTP65P43A5CNFSM4HYLMPW2YY3PNVWWK3TUL52HS4DFWZEXG43VMVCXMZLOORHG65DJMZUWGYLUNFXW5KTDN5WW2ZLOORPWSZGOSOO6O7A#event-2476599164, or mute the thread https://github.com/notifications/unsubscribe-auth/ABAOWUMXE6FTBVWAPJFFYK3P65P43ANCNFSM4HYLMPWQ .

CJ-Wright commented 5 years ago

@danielballan does it make sense to error when a signal in stage_sigs is changed after the detector is staged?

danielballan commented 5 years ago

I'm not sure that would always be correct behavior. We have proposed adding a way for ophyd to tell bluesky when a configuration signal has been changed, causing bluesky to make a new EventDescriptor for the next Event, but that's a much larger issue.

If there are no objections then I think the path forward here is to add a check that raises a nice error message on stage if things aren't right.

sbillinge commented 5 years ago

I would support this way forward.

On Thu, Jul 11, 2019 at 4:23 PM Dan Allan notifications@github.com wrote:

I'm not sure that would always be correct behavior. We have proposed adding a way for ophyd to tell bluesky when a configuration signal has been changed, causing bluesky to make a new EventDescriptor for the next Event, but that's a much larger issue.

If there are no objections then I think the path forward here is to add a check that raises a nice error message on stage if things aren't right.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/bluesky/ophyd/issues/736?email_source=notifications&email_token=ABAOWUKWKTMS4JRYLEQITGLP66JCVA5CNFSM4HYLMPW2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODZX37BQ#issuecomment-510640006, or mute the thread https://github.com/notifications/unsubscribe-auth/ABAOWUJ6FS6ZVPWVHYLTWDTP66JCVANCNFSM4HYLMPWQ .

tacaswell commented 5 years ago

Sorry Dan, I used exposure and acquire as "proxy" words for my story, the actual things that are set have different names

Using the actual names is extra important here because AD itself has several overlapping settings for "how long is the detector going integrate for" and on top of that the PE detectors at 28 are run in continuous mode which makes setting "the exposure time" require configuring 2 plugins (the processing plugin and then the tiff plugin) and does not have a corresponding PV and hence is not n the CSS screens.

In the standard operating mode at 28 the detector is set to some base frequency in continuous mode. In this arrangement the value of "num_images" on the cam screen is ignored by the IOC (as it is in continuous mode!). We do set it anyway as a side effect of the squashing code trying to be general enough to work for both continuous.

In the process plugin there are a number of settings ( https://github.com/bluesky/ophyd/blob/d3cd9b942d1fe1fd7a77355f95e5b7cfe8bb6289/ophyd/areadetector/filestore_mixins.py#L581-L590 ) that need to be set correctly. The scientifically useful time integration time is det.proc.num_filter.get() * det.cam.acquire_time.get(), which again does not have a PV so can not show up in CSS.

The current system of having a python-side soft-signal for the "useful integration time" and the xpdaq control of the "approximate" integration time is as it is on the explicit request of @sbillinge and @CJ-Wright from a few years ago.

This is also further complicated by the fact that when the PE is running in continuous mode and the acquire_time is changed at the IOC level, the actualy acquire time does not change until you start / stop the detector.

my gut feeling is that raising the error (and preventing execution) with the incoherent parameters would be less confusing (as long as the error messages were clear enough).

The point of stage is to put the detector into a coherent state that can be used for data collection (and unstage puts it back "the way it was"). While it is definitely possible to check and raise if we actually need to change anything, I do not think this is helpful. Given the complexity of the configuration (due to the continuous + squashing) I do not think it is reasonable to expect users to know how to correct the miss-configuration on their own. If the error message is "set these PVs in the CSS screen", then the next question from @sbillinge will be "why can't you just do it as you already know exactly what PVs should be set to what value?" which then puts us back to the current status quo.

If XPD wants to push all of this configuration management down into the IOC that is a discussion that we need to pull others from the controls group into (and move this to a beamline specific forum).

I think this issue should be re-closed with no action.

sbillinge commented 5 years ago

The problem is actually less functionality and more user confusion leading to user panic. If we like the staging logic, let's keep it, but let's look into ways of communicating behavior as best as we can to users, before closing this issue.

To do this, I think we need more specific use-cases from the affected people, to make sure we are solving the right problem. These would be the people who have been calling CJ at all hours, so we may need to reach out to them.

On Fri, Jul 12, 2019 at 12:47 PM Thomas A Caswell notifications@github.com wrote:

Sorry Dan, I used exposure and acquire as "proxy" words for my story, the actual things that are set have different names

Using the actual names is extra important here because AD itself has several overlapping settings for "how long is the detector going integrate for" and on top of that the PE detectors at 28 are run in continuous mode which makes setting "the exposure time" require configuring 2 plugins (the processing plugin and then the tiff plugin) and does not have a corresponding PV and hence is not n the CSS screens.

In the standard operating mode at 28 the detector is set to some base frequency in continuous mode. In this arrangement the value of "num_images" on the cam screen is ignored by the IOC (as it is in continuous mode!). We do set it anyway as a side effect of the squashing code trying to be general enough to work for both continuous.

In the process plugin there are a number of settings ( https://github.com/bluesky/ophyd/blob/d3cd9b942d1fe1fd7a77355f95e5b7cfe8bb6289/ophyd/areadetector/filestore_mixins.py#L581-L590 ) that need to be set correctly. The scientifically useful time integration time is det.proc.num_filter.get() * det.cam.acquire_time.get(), which again does not have a PV so can not show up in CSS.

The current system of having a python-side soft-signal for the "useful integration time" and the xpdaq control of the "approximate" integration time is as it is on the explicit request of @sbillinge https://github.com/sbillinge and @CJ-Wright https://github.com/CJ-Wright from a few years ago.

This is also further complicated by the fact that when the PE is running in continuous mode and the acquire_time is changed at the IOC level, the actualy acquire time does not change until you start / stop the detector.

my gut feeling is that raising the error (and preventing execution) with the incoherent parameters would be less confusing (as long as the error messages were clear enough).

The point of stage is to put the detector into a coherent state that can be used for data collection (and unstage puts it back "the way it was"). While it is definitely possible to check and raise if we actually need to change anything, I do not think this is helpful. Given the complexity of the configuration (due to the continuous + squashing) I do not think it is reasonable to expect users to know how to correct the miss-configuration on their own. If the error message is "set these PVs in the CSS screen", then the next question from @sbillinge https://github.com/sbillinge will be "why can't you just do it as you already know exactly what PVs should be set to what value?" which then puts us back to the current status quo.

If XPD wants to push all of this configuration management down into the IOC that is a discussion that we need to pull others from the controls group into (and move this to a beamline specific forum).

I think this issue should be re-closed with no action.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/bluesky/ophyd/issues/736?email_source=notifications&email_token=ABAOWUOGFWZML4KF3MALXPTP7CYTXA5CNFSM4HYLMPW2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODZ2JANY#issuecomment-510955575, or mute the thread https://github.com/notifications/unsubscribe-auth/ABAOWUMXSQAG6G5FSOOBJGDP7CYTXANCNFSM4HYLMPWQ .

prjemian commented 5 years ago

The use of stage() is fundamental to the design of this framework, beyond the scope of area detector support. What @sbillinge is describing is that the documentation is not yet clear enough to resolve his concerns. I use a combination of stage_sigs as well as direct setting of signals in a custom plan to achieve my goals with area detector devices. Tomography is a great example where we wanted images, darks, and flats to all be part of a single areaDetector acquisition sequence. Could not see how to get that with just stage().