Closed ambarb closed 8 years ago
I'll stop by today to look at this.
A user initiated a scan to collect 7200.
What kind of scan? Does 7200 refer to fccd.num_images
?
I will try to trace back through the terminal in a bit.
If you can do that, I think it would help.
'86afec'
There seems to be no RunStart document in your metadatastore with this uid. Did the users kill the scan before the RunStart even came out? (Did you get an olog entry?)
Yes, fccd.num_images
. The ^C
during the scan when they reach 5000 images collected. The record is in olog Aquistion. That is why I know the unique id string. I see I forgot to copy the SCAN ID # (65634)
Tried '86afec'
, and same errors. See below from jhub notebook
bgnd8 = DataBroker[65505] #[59111]
bgnd2 = None #DataBroker[59112] #DataBroker['165e034b-83de-4769-b80f-447194e701c5'] #None #DataBroker[59112]
bgnd1 = None # DataBroker[''] #[]
data = DataBroker['86afec'] #[59110]
tbl=get_table(data,fill='False')
tbl #prints a table with all the readable detectors (motor positions at time of measurement)
OUTPUT is a square
stack = get_fastccd_images(data, (bgnd8, bgnd2, bgnd1), tag='fccd_image') #fccd_image_lightfield
images = stack[0]
OUTPUT
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-e2261f830d28> in <module>()
----> 1 stack = get_fastccd_images(data, (bgnd8, bgnd2, bgnd1), tag='fccd_image') #fccd_image_lightfield
2 images = stack[0]
/opt/conda_envs/analysis/lib/python3.4/site-packages/csxtools/utils.py in get_fastccd_images(light_header, dark_headers, flat, gain, tag, roi, handler_override)
127
128 events = _get_images(light_header, tag, roi,
--> 129 handler_override=handler_override)
130
131 # Ok, so lets return a pims pipeline which does the image conversion
/opt/conda_envs/analysis/lib/python3.4/site-packages/csxtools/utils.py in _get_images(header, tag, roi, handler_override)
187 def _get_images(header, tag, roi=None, handler_override=None):
188 t = ttime.time()
--> 189 images = get_images(header, tag, handler_override=handler_override)
190 t = ttime.time() - t
191 logger.info("Took %.3f seconds to read data using get_images", t)
/opt/conda_envs/analysis/lib/python3.4/site-packages/databroker/pims_readers.py in get_images(headers, name, handler_registry, handler_override)
29 # do something
30 """
---> 31 return Images(headers, name, handler_registry, handler_override)
32
33
/opt/conda_envs/analysis/lib/python3.4/site-packages/databroker/pims_readers.py in __init__(self, headers, name, handler_registry, handler_override)
56 """
57 events = get_events(headers, [name], fill=False)
---> 58 self._datum_uids = [event.data[name] for event in events
59 if name in event.data]
60 self._len = len(self._datum_uids)
/opt/conda_envs/analysis/lib/python3.4/site-packages/databroker/pims_readers.py in <listcomp>(.0)
56 """
57 events = get_events(headers, [name], fill=False)
---> 58 self._datum_uids = [event.data[name] for event in events
59 if name in event.data]
60 self._len = len(self._datum_uids)
/opt/conda_envs/analysis/lib/python3.4/site-packages/databroker/databroker.py in get_events(headers, fields, fill, handler_registry, handler_overrides)
364 fields = []
365 fields = set(fields)
--> 366 _check_fields_exist(fields, headers)
367
368 for header in headers:
/opt/conda_envs/analysis/lib/python3.4/site-packages/databroker/databroker.py in _check_fields_exist(fields, headers)
641 missing = fields - all_fields
642 if missing:
--> 643 raise ValueError("The fields %r were not found." % missing)
ValueError: The fields {'fccd_image'} were not found.
here is the transcript from the bluesky terminal. Their definition calls RE(ct()). The stopped it when 5000 images were taken. They were worried that they would have trouble with 7200 images because they were running out of memory and were not sure that we could solve the problem for them. That was fixed, but users will often need to stop a scan like this.
In [69]: fccd_set(1,7200)
Current fCCD config: 7200.0 images @ 1.00 Hz = 7200.0 s
In [70]: %run -i ~/Users/2016_06_Porro/2016_06_19_overnight.py
In [71]: mag_setup()
In [72]: mag_reset_meas(1,7200)
Current fCCD config: 7200.0 images @ 1.00 Hz = 7200.0 s
Transient Scan ID: 65634
Persistent Unique Scan ID: '86afecc1-d751-4010-9acb-c4dde3b0570c'
^CA 'deferred pause' has been requested. The RunEngine will pause at the next checkpoint. To pause immediately, hit Ctrl+C again in the next 10 seconds.
Deferred pause acknowledged. Continuing to checkpoint.
stop()
^C^C
Your RunEngine is entering a paused state. These are your options for changing
the state of the RunEngine:
resume() --> will resume the scan
abort() --> will kill the scan with an 'aborted' state to indicate
the scan was interrupted
stop() --> will kill the scan with a 'finished' state to indicate
the scan stopped normally
Pro Tip: Next time, if you want to abort, tap Ctrl+C three times quickly.
Pausing...
OK
In [73]: stop()
Stopping...
^CA 'deferred pause' has been requested. The RunEngine will pause at the next checkpoint. To pause immediately, hit Ctrl+C again in the next 10 seconds.
Deferred pause acknowledged. Continuing to checkpoint.
^C
Your RunEngine is entering a paused state. These are your options for changing
the state of the RunEngine:
resume() --> will resume the scan
abort() --> will kill the scan with an 'aborted' state to indicate
the scan was interrupted
stop() --> will kill the scan with a 'finished' state to indicate
the scan stopped normally
Pro Tip: Next time, if you want to abort, tap Ctrl+C three times quickly.
Pausing...
In [74]: abort()
Aborting....
generator ct ['86afec'] (scan num: 65634)
In [75]: olog('Measurement scan 34 aborted, temperature changed to 300K')
In [76]: fccd_set(1,60)
Current fCCD config: 60.0 images @ 1.00 Hz = 60.0 s
In [77]: ct_all_dark()
Starting procedure to acquire camera gains at 1.0Hz
Camera stopped..
Gain bit set to 2
Camera restarted..
Transient Scan ID: 65635
Persistent Unique Scan ID: '1c775526-f717-448a-9e86-d54070316cbf'
/home/xf23id1/conda_envs/collection/lib/python3.4/site-packages/bluesky/run_engine.py:1821: UserWarning: A ReadTimeout(ReadTimeoutError("HTTPSConnectionPool(host='xf23id1-log.cs.nsls2.local', port=8181): Read timed out. (read timeout=30)",),) was raised during the processing of a start Document. The error will be ignored to avoid interrupting data collection. To investigate, set RunEngine.ignore_callback_exceptions = False and run again.
"and run again." % (exc, name.name))
+-----------+------------+-------------------+-------------------+-------------------+------------+------------+------------+------------+------------+
| seq_num | time | fccd_stats1_total | fccd_stats2_total | fccd_stats3_total | mono_tempa | mono_tempb | grt1_temp | grt2_temp | uw_temp |
+-----------+------------+-------------------+-------------------+-------------------+------------+------------+------------+------------+------------+
| 1 | 13:37:22.3 | 1112157899 | 14795198239 | 14800676775 | 302.749 | 302.749 | 31.6 | 28.8 | 85.12 |
+-----------+------------+-------------------+-------------------+-------------------+------------+------------+------------+------------+------------+
generator ct ['1c7755'] (scan num: 65635)
OK, I understand what happened.
First of all, ignore my earlier comment, "There seems to be no RunStart document in your metadatastore with this uid." That was an unrelated problem I was having.
This explains why tbl
is an empty DataFrame. (This displays as an empty square; using print
we can see a little more clearly what it is....)
In [5]: print(tbl)
Empty DataFrame
Columns: []
Index: []
The raw get_events
function confirms that there is no data from this run in metadatastore --- just the RunStart document.
In [6]: len(list(get_events(h_data)))
Out[6]: 0
The data is likely salvageable. HDF5 file exists, but no record of it was created in metadatastore. I can help you retrieve it tomorrow morning before my group meeting at 10.
To avoid this problem in the future, we should eliminate the long delay between when the detector starts saving data into a file and when we emit the Event document that provides a reference to that data. This was on our radar, but this is the first time I've heard of this particular failure mode.
Sounds great. I reproduced it just now. They have to quit stop()
because that seems ineffective. Thus abort()
is the only option. If it happens again, we will try selecting the done capture button while stuck in the loop that handles stop()
.
Transient Scan ID: 65692
Persistent Unique Scan ID: '94a4d26e-cf2c-45c7-98d9-9b4f7cfecbac'
+-----------+------------+-------------------+-------------------+-------------------+------------+------------+------------+------------+------------+
| seq_num | time | fccd_stats1_total | fccd_stats2_total | fccd_stats3_total | mono_tempa | mono_tempb | grt1_temp | grt2_temp | uw_temp |
+-----------+------------+-------------------+-------------------+-------------------+------------+------------+------------+------------+------------+
| 1 | 15:23:54.3 | -140604 | -1790229 | -1561333 | 302.756 | 302.755 | 29.2 | 28.8 | 85.38 |
+-----------+------------+-------------------+-------------------+-------------------+------------+------------+------------+------------+------------+
generator ct ['94a4d2'] (scan num: 65692)
Returned to initial gain setting and restarting acquisition
Current fCCD config: 7200.0 images @ 1.00 Hz = 7200.0 s
^C---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
/home/xf23id1/Users/2016_06_Porro/2016_06_20_overnight.py in <module>()
----> 1 multi_T(1,7200)
/home/xf23id1/Users/2016_06_Porro/2016_06_20_overnight.py in multi_T(freq, tot_time_1)
90 #mov(temp_sp,tt)
91 caput('XF:23ID1-ES{TCtrl:1-Out:1}T-SP',tt)
---> 92 sleep(3600)
93 RE(ct())
94 olog('Measurement, #{} @ {:.1f} eV, {:.2f} Hz at {:.1f} K for {} images over {:.1f} s'.format((db[-1].get('start').get('scan_id')), pgm_en.get()[1], 1./fccd.acquire_time.get(), temp.a.value,fccd.plugin_num_images.get(),fccd.plugin_num_images.get()*fccd.acquire_time.get()))
KeyboardInterrupt:
In [184]:
KeyboardInterrupt
In [184]:
KeyboardInterrupt
In [184]: stop()
---------------------------------------------------------------------------
TransitionError Traceback (most recent call last)
/home/xf23id1/Users/2016_06_Porro/2016_06_20_overnight.py in <module>()
----> 1 stop()
/home/xf23id1/conda_envs/collection/lib/python3.4/site-packages/bluesky/global_state.py in stop()
166 This function closes over the global RunEngine instance, gs.RE.
167 """
--> 168 return gs.RE.stop()
169
170
/home/xf23id1/conda_envs/collection/lib/python3.4/site-packages/bluesky/run_engine.py in stop(self)
756 """
757 if self.state.is_idle:
--> 758 raise TransitionError("RunEngine is already idle.")
759 print("Stopping...")
760 self._interrupted = True
TransitionError: RunEngine is already idle.
In [185]: grabit
Select area of screen to grab .........
In [186]: fccd_set(1,7200)
Current fCCD config: 7200.0 images @ 1.00 Hz = 7200.0 s
In [187]: %run -i ~/Users/2016_06_Porro/startup.py
In [188]: RE(ct())
Transient Scan ID: 65693
Persistent Unique Scan ID: '44407fa8-9acf-4b8a-bd27-568161906628'
^CA 'deferred pause' has been requested. The RunEngine will pause at the next checkpoint. To pause immediately, hit Ctrl+C again in the next 10 seconds.
Deferred pause acknowledged. Continuing to checkpoint.
^C
Your RunEngine is entering a paused state. These are your options for changing
the state of the RunEngine:
resume() --> will resume the scan
abort() --> will kill the scan with an 'aborted' state to indicate
the scan was interrupted
stop() --> will kill the scan with a 'finished' state to indicate
the scan stopped normally
Pro Tip: Next time, if you want to abort, tap Ctrl+C three times quickly.
Pausing...
Out[188]: ['44407fa8-9acf-4b8a-bd27-568161906628']
In [189]:
In [189]: stop()
Stopping...
1 minute later
^CA 'deferred pause' has been requested. The RunEngine will pause at the next checkpoint. To pause immediately, hit Ctrl+C again in the next 10 seconds.
Deferred pause acknowledged. Continuing to checkpoint.
^C
Your RunEngine is entering a paused state. These are your options for changing
the state of the RunEngine:
resume() --> will resume the scan
abort() --> will kill the scan with an 'aborted' state to indicate
the scan was interrupted
stop() --> will kill the scan with a 'finished' state to indicate
the scan stopped normally
Pro Tip: Next time, if you want to abort, tap Ctrl+C three times quickly.
Pausing...
In [190]: stop()
Stopping...
1 minute later
abort()
^CA 'deferred pause' has been requested. The RunEngine will pause at the next checkpoint. To pause immediately, hit Ctrl+C again in the next 10 seconds.
Deferred pause acknowledged. Continuing to checkpoint.
^C
Your RunEngine is entering a paused state. These are your options for changing
the state of the RunEngine:
resume() --> will resume the scan
abort() --> will kill the scan with an 'aborted' state to indicate
the scan was interrupted
stop() --> will kill the scan with a 'finished' state to indicate
the scan stopped normally
Pro Tip: Next time, if you want to abort, tap Ctrl+C three times quickly.
Pausing...
In [191]:
KeyboardInterrupt
In [191]: abort()
Aborting....
generator ct ['44407f'] (scan num: 65693)
In [192]:
We are working on the underlying issue with stop() as well. Should straighten all this out by the end of the week. Thanks as always for your patience.
Get Outlook for iOShttps://aka.ms/o0ukef
On Mon, Jun 20, 2016 at 5:32 PM -0400, "ambarb" notifications@github.com<mailto:notifications@github.com> wrote:
Sounds great. I reproduced it just now. They have to quit stop() because that seems ineffective. Thus abort() is the only option. If it happens again, we will try selecting the done capture button while stuck in the loop that handles stop().
Transient Scan ID: 65692 Persistent Unique Scan ID: '94a4d26e-cf2c-45c7-98d9-9b4f7cfecbac' +-----------+------------+-------------------+-------------------+-------------------+------------+------------+------------+------------+------------+ | seq_num | time | fccd_stats1_total | fccd_stats2_total | fccd_stats3_total | mono_tempa | mono_tempb | grt1_temp | grt2_temp | uw_temp | +-----------+------------+-------------------+-------------------+-------------------+------------+------------+------------+------------+------------+ | 1 | 15:23:54.3 | -140604 | -1790229 | -1561333 | 302.756 | 302.755 | 29.2 | 28.8 | 85.38 | +-----------+------------+-------------------+-------------------+-------------------+------------+------------+------------+------------+------------+ generator ct ['94a4d2'](scan num: 65692)
Returned to initial gain setting and restarting acquisition
Current fCCD config: 7200.0 images @ 1.00 Hz = 7200.0 s
^C---------------------------------------------------------------------------
KeyboardInterrupt Traceback (most recent call last)
/home/xf23id1/Users/2016_06_Porro/2016_06_20_overnight.py in
/home/xf23id1/Users/2016_06_Porro/2016_06_20_overnight.py in multi_T(freq, tot_time_1) 90 #mov(temp_sp,tt) 91 caput('XF:23ID1-ES{TCtrl:1-Out:1}T-SP',tt) ---> 92 sleep(3600) 93 RE(ct()) 94 olog('Measurement, #{} @ {:.1f} eV, {:.2f} Hz at {:.1f} K for {} images over {:.1f} s'.format((db[-1].get('start').get('scan_id')), pgm_en.get()[1], 1./fccd.acquire_time.get(), temp.a.value,fccd.plugin_num_images.get(),fccd.plugin_num_images.get()*fccd.acquire_time.get()))
KeyboardInterrupt:
In [184]: KeyboardInterrupt
In [184]: KeyboardInterrupt
TransitionError Traceback (most recent call last)
/home/xf23id1/Users/2016_06_Porro/2016_06_20_overnight.py in
/home/xf23id1/conda_envs/collection/lib/python3.4/site-packages/bluesky/global_state.py in stop() 166 This function closes over the global RunEngine instance, gs.RE. 167 """ --> 168 return gs.RE.stop() 169 170
/home/xf23id1/conda_envs/collection/lib/python3.4/site-packages/bluesky/run_engine.py in stop(self) 756 """ 757 if self.state.is_idle: --> 758 raise TransitionError("RunEngine is already idle.") 759 print("Stopping...") 760 self._interrupted = True
TransitionError: RunEngine is already idle.
In [185]: grabit Select area of screen to grab .........
In [186]: fccd_set(1,7200) Current fCCD config: 7200.0 images @ 1.00 Hz = 7200.0 s
In [187]: %run -i ~/Users/2016_06_Porro/startup.py
In [188]: RE(ct()) Transient Scan ID: 65693 Persistent Unique Scan ID: '44407fa8-9acf-4b8a-bd27-568161906628' ^CA 'deferred pause' has been requested. The RunEngine will pause at the next checkpoint. To pause immediately, hit Ctrl+C again in the next 10 seconds. Deferred pause acknowledged. Continuing to checkpoint. ^C Your RunEngine is entering a paused state. These are your options for changing the state of the RunEngine:
resume() --> will resume the scan abort() --> will kill the scan with an 'aborted' state to indicate the scan was interrupted stop() --> will kill the scan with a 'finished' state to indicate the scan stopped normally
Pro Tip: Next time, if you want to abort, tap Ctrl+C three times quickly.
Pausing... Out[188]: ['44407fa8-9acf-4b8a-bd27-568161906628']
In [189]:
In [189]: stop() Stopping...
1 minute later ^CA 'deferred pause' has been requested. The RunEngine will pause at the next checkpoint. To pause immediately, hit Ctrl+C again in the next 10 seconds. Deferred pause acknowledged. Continuing to checkpoint. ^C Your RunEngine is entering a paused state. These are your options for changing the state of the RunEngine:
resume() --> will resume the scan abort() --> will kill the scan with an 'aborted' state to indicate the scan was interrupted stop() --> will kill the scan with a 'finished' state to indicate the scan stopped normally
Pro Tip: Next time, if you want to abort, tap Ctrl+C three times quickly.
Pausing...
In [190]: stop() Stopping...
1 minute later
abort() ^CA 'deferred pause' has been requested. The RunEngine will pause at the next checkpoint. To pause immediately, hit Ctrl+C again in the next 10 seconds. Deferred pause acknowledged. Continuing to checkpoint. ^C Your RunEngine is entering a paused state. These are your options for changing the state of the RunEngine:
resume() --> will resume the scan abort() --> will kill the scan with an 'aborted' state to indicate the scan was interrupted stop() --> will kill the scan with a 'finished' state to indicate the scan stopped normally
Pro Tip: Next time, if you want to abort, tap Ctrl+C three times quickly.
Pausing...
In [191]: KeyboardInterrupt
In [191]: abort() Aborting.... generator ct ['44407f'](scan num: 65693)
In [192]:
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/NSLS-II/Bug-Reports/issues/121#issuecomment-227276046, or mute the threadhttps://github.com/notifications/unsubscribe/ACLIrow8HGOfLTOV2q2KzJ9CvSbsKHHsks5qNwbOgaJpZM4I5yJG.
This has been addressed by
trigger_and_read
which reads before it waits.
@tacaswell @danielballan A user initiated a scan to collect 7200. They then decided to stop the scan. When the command prompt returned, they typed
stop()
but the camera was still writing data. They decided toabort()
. Now they cannot access the data. The user is here until Tuesday morning. Can someone please look into this today - for them to get the data, and then why this happened at your convenience. I will try to trace back through the terminal in a bit.Scan info:
6/19/16, 11:03 am generator ['86afec'](scan num: 65634)