Closed jilavsky closed 2 years ago
Restarted BS, now got similar issue with BlackFly camera. Code continued, but hanged for long time - timeout?
D Sun-15:50:13 - gain=10000000.0 rate: 82360.0 max: 950000 converged=[True, True]
W Sun-15:51:13 - Could not take sample image:path=/share1/USAXS_data/2022-02/02_20_JanTest/02_20_JanTest_saxs, file=AgBehenateLaB6, order#=1, exc=Status(obj=EpicsSignalWithRBV(read_pv='9idFLY2:JPEG1:FileName_RBV', name='blackfly_optical_jpeg1_file_name', parent='blackfly_optical_jpeg1', value='A', timestamp=1645393813.9335833, auto_monitor=True, string=True, write_pv='9idFLY2:JPEG1:FileName', limits=False, put_complete=False), done=True, success=False)
D Sun-15:51:13 - areaDetectorAcquire(): OrderedDict([('enable', 1), ('auto_increment', 'Yes'), ('array_counter', 0), ('auto_save', 'Yes'), ('num_capture', 0), ('file_template', '%s%s_%4.4d.hdf'), ('file_write_mode', 'Single'), ('blocking_callbacks', 'No'), ('parent.cam.array_callbacks', 1)])
Transient Scan ID: 13 Time: 2022-02-20 15:51:14
Persistent Unique Scan ID: '647b993e-bf1b-4324-9bce-6954271be425'
What is it with the AD iocs and BS today?
First item happens during detector staging. I'm starting to form a stronger opinion about staging, that perhaps we use it too much for the way it is implemented. Thinking we could refactor many items out of stage_sigs
and into yield from bps.mv()
so we control exactly when and how often a PV is written. When the RE
stages an ophyd object, it calls the objects ophyd.stage()
method directly, without concern this might take some time to complete. Same for .unstage()
. The code is not obvious about how it handles timeouts, either.
Too tricky by half and we keep getting skewered. Let's address the detector staging issues first since they are most prominent.
Second item (with the blackfly camera) is not obvious what happened. Could not take sample image
lacks some detail. Looking, next, at the log file for that time.
Log file entries for that second:
|2022-02-20 15:50:13.554|DEBUG|43187|bluesky-session|amplifiers|539|bluesky-run-engine| - gain=10000000.0 rate: 82360.0 max: 950000 converged=[True, True]
|2022-02-20 15:50:13.567|DEBUG|43187|bluesky-session|amplifiers|545|bluesky-run-engine| - converged: [True, True]
|2022-02-20 15:51:13.967|WARNING|43187|bluesky-session|sample_imaging|62|bluesky-run-engine| - Could not take sample image:path=/share1/USAXS_data/2022-02/02_20_JanTest/02_20_JanTest_saxs, file=AgBehenateLaB6, order#=1, exc=Status(obj=EpicsSignalWithRBV(read_pv='9idFLY2:JPEG1:FileName_RBV', name='blackfly_optical_jpeg1_file_name', parent='blackfly_optical_jpeg1', value='A', timestamp=1645393813.9335833, auto_monitor=True, string=True, write_pv='9idFLY2:JPEG1:FileName', limits=False, put_complete=False), done=True, success=False)
|2022-02-20 15:51:13.980|DEBUG|43187|bluesky-session|area_detector|34|bluesky-run-engine| - areaDetectorAcquire(): OrderedDict([('enable', 1), ('auto_increment', 'Yes'), ('array_counter', 0), ('auto_save', 'Yes'), ('num_capture', 0), ('file_template', '%s%s_%4.4d.hdf'), ('file_write_mode', 'Single'), ('blocking_callbacks', 'No'), ('parent.cam.array_callbacks', 1)])
This block of code: https://github.com/APS-USAXS/ipython-usaxs/blob/887377a7eadb0d45875dcd1a55a9a57b6aa3cf5b/profile_bluesky/startup/instrument/plans/sample_imaging.py#L50-L52 and this particular step in the .image_prep()
method: https://github.com/APS-USAXS/ipython-usaxs/blob/887377a7eadb0d45875dcd1a55a9a57b6aa3cf5b/profile_bluesky/startup/instrument/devices/blackfly_module.py#L94
You say the code hung for some time, looks like 60 seconds which is the default write timeout that we set: https://github.com/APS-USAXS/ipython-usaxs/blob/887377a7eadb0d45875dcd1a55a9a57b6aa3cf5b/profile_bluesky/startup/instrument/framework/initialize.py#L124-L131
Since this is the sample imaging detector, we have already agreed that this is a value-added but optional item.
And Yes, it is a bad set of problems using BS.
With the decision to refactor from detector.stage_sigs to explicit calls to bps.mv(), I'll close this BUG report issue.
Another timeout issue, same PV on different ioc (this is SAXS, before was WAXS). Will restart BS again.
Instrument will be available later today for testing, if needed. I am trying to collect some test data to prove something, but then it will be available, if needed.