Open jilavsky opened 1 year ago
Observation - we have this state PV not responding routinely, but typically it is non-destructive see below. WHy was the one above destructive Exception?
186 | 09:21:52.5 | 5000000 | 113382 | 16934 |
W Fri-09:21:52 - Exception while reporting instrument state: Status(obj=EpicsSignal(read_pv='9idcLAX:state', name='user_data_state', parent='user_data', value='counting 187/300', timestamp=1668784912.858422, auto_monitor=True, string=True, write_pv='9idcLAX:state', limits=False, put_complete=False), done=True, success=False)
| 187 | 09:21:53.4 | 5000000 | 113632 | 16272 |
| 188 | 09:21:54.2 | 5000000 | 113659 | 15097 |
| 189 | 09:21:54.9 | 5000000 | 113672 | 14403 |
What does this mean?
non-destructive
I'm thinking it may be necessary to avoid using a Status object when setting this PV. To do that, the .put(message)
must be launched in a thread (we do not really care about interrupting the main process for this one).
Launching in thread is a solution, if it does not hang the main data collection thread. The BlackFly in #573 seems to hand main data collection thread and aborting simply resumes the data collection. So I am bit confused on the thread dependence. Reading again the original report, realized that there are two Exceptions : First :
E Wed-20:32:54 - Exception FailedStatus during attempt 1 of 1 of command '('USAXSscan', ['19', '40', '1', 'AirBlank_R'], 76, ' USAXSscan 19 40 1 "AirBlank_R"')''
subject: FailedStatus during attempt 1 of 1 of command '('USAXSscan', ['19', '40', '1', 'AirBlank_R'], 76, ' USAXSscan 19 40 1 "AirBlank_R"')''
Second
exception: Status(obj=EpicsSignal(read_pv='9idcLAX:state', name='user_data_state', parent='user_data', value='counting 75/300', timestamp=1668652373.179653, auto_monitor=True, string=True, write_pv='9idcLAX:state', limits=False, put_complete=False), done=True, success=False)
Stopping further processing of this command list.
I cannot figure out which one is primary or which may be secondary - or simply why did the data collection failed here? As noted, "Exception while reporting instrument state:" does not seem to stop data collection, it seems to be ignored. I see it during data collection relatively routinely and it does not do anything bad, simply generates message. So what was different here and how can we prevent that.
As it turns out, there is a plan method intended to avoid exactly this problem (send the text to the PV but do not return a Status object): https://github.com/APS-USAXS/usaxs-bluesky/blob/a803031b8c806c74949ff53fab1ef4cc3cb2f678/instrument/devices/user_data.py#L104-L113
We call this from many places:
usaxs@usaxscontrol ~/bluesky $ git grep set_state_plan
instrument/devices/usaxs_fly_scan.py: yield from user_data.set_state_plan("writing fly scan HDF5 file")
instrument/devices/usaxs_fly_scan.py: yield from user_data.set_state_plan("fly scan finished")
instrument/devices/user_data.py: def set_state_plan(self, msg, confirm=True):
instrument/plans/mode_changes.py: yield from user_data.set_state_plan(
instrument/plans/mode_changes.py: yield from user_data.set_state_plan(
instrument/plans/mode_changes.py: yield from user_data.set_state_plan("Ready for BlackFly imaging mode")
instrument/plans/mode_changes.py: yield from user_data.set_state_plan("Moving USAXS to USAXS mode")
instrument/plans/mode_changes.py: yield from user_data.set_state_plan("USAXS Mode")
instrument/plans/mode_changes.py: yield from user_data.set_state_plan("Moving USAXS to SAXS mode")
instrument/plans/mode_changes.py: yield from user_data.set_state_plan("SAXS Mode")
instrument/plans/mode_changes.py: yield from user_data.set_state_plan("Moving USAXS to WAXS mode")
instrument/plans/mode_changes.py: yield from user_data.set_state_plan("WAXS Mode")
instrument/plans/mode_changes.py: yield from user_data.set_state_plan("Radiography Mode")
instrument/plans/mode_changes.py: yield from user_data.set_state_plan("Moving USAXS to Imaging mode (same as USAXS mode now)")
instrument/plans/mode_changes.py: yield from user_data.set_state_plan("Moving USAXS to OpenBeamPath mode")
instrument/plans/mode_changes.py: yield from user_data.set_state_plan("USAXS moved to OpenBeamPath mode")
instrument/plans/requested_stop.py: yield from user_data.set_state_plan(msg)
instrument/plans/resets.py: yield from user_data.set_state_plan("resetting motors")
instrument/plans/resets.py: yield from user_data.set_state_plan("USAXS reset complete")
instrument/plans/sample_transmission.py: yield from user_data.set_state_plan("Measure USAXS transmission")
instrument/plans/sample_transmission.py: yield from user_data.set_state_plan("Measure SAXS transmission")
instrument/plans/scans.py: yield from user_data.set_state_plan("Moving to Q=0")
instrument/plans/scans.py: yield from user_data.set_state_plan("Running USAXS step scan")
instrument/plans/scans.py: yield from user_data.set_state_plan("USAXS step scan finished")
instrument/plans/scans.py: yield from user_data.set_state_plan("Moving USAXS back and saving data")
instrument/plans/scans.py: yield from user_data.set_state_plan("Moving to Q=0")
instrument/plans/scans.py: yield from user_data.set_state_plan("Running Flyscan")
instrument/plans/scans.py: yield from user_data.set_state_plan("Flyscan finished")
instrument/plans/scans.py: yield from user_data.set_state_plan("Moving USAXS back and saving data")
instrument/plans/uascan.py: yield from user_data.set_state_plan(f"moving motors {i+1}/{intervals}")
instrument/plans/uascan.py: yield from user_data.set_state_plan(f"counting {i+1}/{intervals}")
instrument/plans/uascan.py: yield from user_data.set_state_plan("returning AR, AX, SY, and DX")
but there are more cases (slightly more tedious search) where it appears we do not use this plan to set the user_data.state
signal:
usaxs@usaxscontrol ~/bluesky $ git grep state | grep -v set_state_plan | grep user_data.state
instrument/devices/usaxs_fly_scan.py: user_data.state._set_thread = None
instrument/plans/command_list.py: user_data.state, "Starting data collection",
instrument/plans/command_list.py: user_data.state, "USAXS macro file done", # exact text triggers the music
instrument/plans/requested_stop.py: user_data.state, "Aborted data collection",
instrument/plans/scans.py: user_data.state, "pre-USAXS optics tune",
instrument/plans/scans.py: user_data.state, "pre-USAXS optics tuning done",
instrument/plans/scans.py: user_data.state, "pre-USAXS optics tune",
instrument/plans/scans.py: user_data.state, "pre-USAXS optics tuning done",
instrument/plans/scans.py: # user_data.state, "pre-SWAXS optics tune",
instrument/plans/scans.py: # user_data.state, "pre-SWAXS optics tuning done",
instrument/plans/scans.py: user_data.state, "starting USAXS step scan",
instrument/plans/scans.py: user_data.state, "starting USAXS Flyscan",
instrument/plans/scans.py: user_data.state, "starting SAXS collection",
instrument/plans/scans.py: user_data.state, f"SAXS collection for {terms.SAXS.acquire_time.get()} s",
instrument/plans/scans.py: user_data.state, "Done SAXS",
instrument/plans/scans.py: user_data.state, "starting WAXS collection",
instrument/plans/scans.py: user_data.state,
instrument/plans/scans.py: user_data.state, "Done WAXS",
instrument/plans/tune_guard_slits.py: user_data.state, "tune Guard slits center",
instrument/plans/tune_guard_slits.py: user_data.state, "autoranging the PD",
instrument/plans/tune_guard_slits.py: yield from bps.mv(user_data.state, title)
instrument/plans/tune_guard_slits.py: yield from bps.mv(user_data.state, "autoranging the PD")
We can probably avoid this problem in the future by examining each of those files and make certain to replace any use of:
yield from bps.mv(user_data.state, some_text)
with
yield from user_data.set_state_plan(some_text)
This might involve separating the posting of a new state message from other signals in the same call to bps.mv()
such as this block: https://github.com/APS-USAXS/usaxs-bluesky/blob/a803031b8c806c74949ff53fab1ef4cc3cb2f678/instrument/plans/command_list.py#L118-L122
Great, so this os basically oversight on our side - we already implemented patch for this epics-BS communication bug (on this specific PV).
Sounds like I need to correct all of these. Looks like simple monkey work. I think this is simple search, separate if needed, and replace, no additional imports needed. Correct?
Hmmm... looking into the actual line of code for the exception reported above: https://github.com/APS-USAXS/usaxs-bluesky/blob/a803031b8c806c74949ff53fab1ef4cc3cb2f678/instrument/plans/uascan.py#L222
Since the status object reports it was trying to use: value='counting 75/300'
, I'm not convinced we have a final answer yet. But it would be great for the code to become consistent, no matter.
Observation - we have this state PV not responding routinely, but typically it is non-destructive see below. WHy was the one above destructive Exception?
Inconsistent use. In some cases, exception is caught and handled (so non-destructive) while in others, the exception bubbles up to the top and ends the plan.
Summary, so far: Probably not solved.
Continue to make all state updates go through user_data.set_state_plan()
. That gives me one place to implement a different handling.
This line might be the troublemaker: https://github.com/APS-USAXS/usaxs-bluesky/blob/a803031b8c806c74949ff53fab1ef4cc3cb2f678/instrument/devices/user_data.py#L108
Default value for confirm=True
A replacement might be like this:
def set_state_plan(self, msg, confirm=True):
"""plan: tell EPICS about what we are doing"""
@apstools.util.run_in_thread
def caput_state(msg):
self.state.put(msg)
yield from bps.null() # makes this method a generator function (a bluesky "plan")
caput_state(trim_string_for_EPICS(msg))
so do the same for the run_in_thread
decorator.
I converted all calls to yield from user_data.set_state_plan(some_text) method. Now we just need to make sure this does not bomb on us when we use it...
After APS beam dump we had failure in access to 9idcLAX:state PV
Not sure this has anything to do with APS beam dump, but happened after suspender properly suspended operations and recovered. Disappointing, all worked before to die this way. Beam dump was around point 45, so we collected another 30 points here.