aps-8id-dys / ipython-8idiuser

8-ID-I ipython configuration for bluesky (and other)

test continuous scanning - normal data collection #127

Closed prjemian closed 4 years ago

prjemian commented 4 years ago

Test that data collection with the recent updates to ophyd and bluesky is not interrupted by long timeouts or similar problems.

prjemian commented 4 years ago

related: https://github.com/aps-8id-dys/ipython-8idiuser/issues/124

prjemian commented 4 years ago

Core software to be used

prjemian commented 4 years ago

Note that every conda install of other bluesky packages reverts ophyd back to 1.3.3. To use ophyd 1.4.0rc4, run this command in the ophyd clone directory:

pip install -e .

The ophyd clone directory is:

cd ~/Documents/jemian/github
git clone https://github.com/bluesky/ophyd.git
cd ophyd
git checkout v1.4.0rc4 -b v1.4.0rc4
prjemian commented 4 years ago

Running into this error:

TimeoutError: Attempted to set EpicsSignalWithRBV(read_pv='8LAMBDA1:IMMout:Capture_RBV', name='lambdadet_immout_capture', parent='lambdadet_immout', value=0, timestamp=1580250531.046254, auto_monitor=False, string=False, write_pv='8LAMBDA1:IMMout:Capture', limits=False, put_complete=False) to value 1 and timed out after 10 seconds. Current value is 0.

Seems to be a permissions problem when the IMMout plugin is trying to create the subdirectory to write the image file. Need to create parent directory with user permissions, not staff.

prjemian commented 4 years ago

Renamed the offending directory and made a new one; the test plan then ran to completion.

prjemian commented 4 years ago

Tested the new Stop before next scan feature; need to change .value to .get(). Hrmph. Amend #129
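
For reference, a minimal sketch of what that change looks like (the PV and signal name here are hypothetical; the real signal lives in the profile startup):

from ophyd import EpicsSignal

# hypothetical PV and name, for illustration only
stop_request = EpicsSignal("8idi:Reg999", name="stop_request")

def stop_requested():
    # use .get() rather than the .value property
    return stop_request.get() != 0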

prjemian commented 4 years ago

We are ready to test overnight. Each pass through the loop takes about 10 s. Let's try 10,000 loops and interrupt with our Stop before next scan feature tomorrow morning. Hope for victory.

prjemian commented 4 years ago

Hmmm... our run numbers are A123, A124, ... formatted with three digits. Will this be a problem when the number passes 999?

In [1]: f"{25.0:03.0f}"                                                                                                                                                    
Out[1]: '025'

In [2]: f"A{25.0:03.0f}"                                                                                                                                                   
Out[2]: 'A025'

In [3]: f"A{125.0:03.0f}"                                                                                                                                                  
Out[3]: 'A125'

In [4]: f"A{1125.0:03.0f}"                                                                                                                                                 
Out[4]: 'A1125'

In [5]: f"A{11125.0:03.0f}"                                                                                                                                                
Out[5]: 'A11125'

Nope. Not a problem for Python: the 03 in the format specification is a minimum field width, so run numbers beyond 999 simply use more digits.

prjemian commented 4 years ago
ReadTimeoutError: Failed to read 8idi:Reg3 within 2.0 sec

This was before change 9b7c01a was made. Need to restart the bluesky session and restart the scan. Do not expect to see this with the new default timeout (10 s).

prjemian commented 4 years ago

Well, "overnight" scan stopped after about 40 iterations with an OSError. That's new. More detail is in the terminal session of the VSCode editor.

I'm wondering if that error is due to the plots (the plot legend has too many entries), but it's an OSError, which is different. We can probably configure the plots better. Another thing for us to learn.

Anyway, running now with a camonitor 8idi:Reg173 (ARun number) in another window. If that pauses, it will show when the bluesky scan stopped.

prjemian commented 4 years ago
(screenshot: Screen Shot 2020-01-28 at 10 00 00 PM)
prjemian commented 4 years ago

That exception trace is not informative.

prjemian commented 4 years ago

Ran for about four hours, ending with this (probably the same error):

D Wed-04:16:25 - after count()
HDF5 workflow file name: /home/8-id-i/2020-1/bluesky/A923/A923_0001-0100.hdf
I Wed-04:16:48 - creating HDF5 file /home/8-id-i/2020-1/bluesky/A923/A923_0001-0100.hdf
I Wed-04:16:48 - DM workflow kickoff starting: analysis:True  file:/home/8-id-i/2020-1/bluesky/A923/A923_0001-0100.hdf
I Wed-04:16:48 - self.QMAP_FOLDER_PATH=/home/8-id-i/partitionMapLibrary/2020-1
I Wed-04:16:48 - self.XPCS_QMAP_FILENAME=Lambda_qmap.h5
I Wed-04:16:48 - DM Workflow call is made for XPCS Analysis: /home/8-id-i/2020-1/bluesky/A923/A923_0001-0100.hdf,  /home/8-id-i/partitionMapLibrary/2020-1/Lambda_qmap.h5  ----2020-01-29 04:16:48.816385
Exception in thread Thread-48425:
Traceback (most recent call last):
  File "/home/beams/8IDIUSER/.conda/envs/bluesky/lib/python3.7/threading.py", line 917, in _bootstrap_inner
    self.run()
  File "/home/beams/8IDIUSER/.conda/envs/bluesky/lib/python3.7/threading.py", line 865, in run
    self._target(*self._args, **self._kwargs)
  File "/home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/50-controls.py", line 1044, in kickoff_DM_workflow
    out, err = dm_workflow.DataAnalysis(hdf_workflow_file)
  File "/home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/spec_support/APS_DM_8IDI.py", line 701, in DataAnalysis
    return unix(cmd)
  File "/home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/spec_support/APS_DM_8IDI.py", line 59, in unix
    stderr = subprocess.PIPE,
  File "/home/beams/8IDIUSER/.conda/envs/bluesky/lib/python3.7/subprocess.py", line 775, in __init__
    restore_signals, start_new_session)
  File "/home/beams/8IDIUSER/.conda/envs/bluesky/lib/python3.7/subprocess.py", line 1453, in _execute_child
    restore_signals, start_new_session, preexec_fn)
OSError: [Errno 12] Cannot allocate memory

I Wed-04:16:48 - AD_Acquire starting
I Wed-04:16:48 - file_path = /home/8-id-i/2020-1/bluesky/A924/
I Wed-04:16:48 - Shutter will *REMAIN OPEN THROUGH* the Acquisition...
I Wed-04:16:48 - calling full_acquire_procedure()
D Wed-04:16:48 - before update_metadata_prescan()
I Wed-04:16:48 - detNum=25, det_pars={'ccdHardwareRowSize': 516, 'ccdHardwareColSize': 1556, 'ccdxsense': 1, 'ccdzsense': 1, 'harmonic': 1, 'dpix': 0.055, 'saturation': 4095, 'adupphot': 1.0, 'efficiency': 1.0, 'gain': 1, 'blemish': 1, 'flatfield': 1, 'distortion': 0, 'parasitic': 0, 'lld': 0, 'manufacturer': 'LAMBDA'}
D Wed-04:16:49 - after update_metadata_prescan()
I Wed-04:16:49 - scaler should be autocounting now
D Wed-04:16:49 - before count()
Transient Scan ID: 1295     Time: 2020/01/29 04:16:49
Persistent Unique Scan ID: 'f42ebd10-a5a8-4f25-aa0c-e948c2bbae1c'
New stream: 'baseline'
Start-of-run baseline readings:
+--------------------------------+--------------------------------+
+--------------------------------+--------------------------------+
New stream: 'Timebase_monitor'
New stream: 'pind1_monitor'
New stream: 'pind2_monitor'
New stream: 'Atten1_monitor'
New stream: 'Atten2_monitor'
New stream: 'lakeshore_loop1_temperature_monitor'
New stream: 'lakeshore_loop1_target_monitor'
I Wed-04:17:09 - full_name: /home/8-id-i/2020-1/bluesky/A924/A924_00001-00100.imm
closure() method endss. No progress bar available.]
cam.acquire.value=0
immout.capture.value=0
immout.num_captured.value=100
status=DeviceStatus(device=lambdadet, done=True, success=True)
New stream: 'primary'
+-----------+------------+
|   seq_num |       time |
+-----------+------------+
|         1 | 04:17:21.1 |
+-----------+------------+
generator lambda_test ['f42ebd10'] (scan num: 1295)
End-of-run baseline readings:
+--------------------------------+--------------------------------+
+--------------------------------+--------------------------------+

D Wed-04:17:24 - after count()
/home/beams/8IDIUSER/bin/bluesky8IDI: line 40: 21128 Killed                  ipython --profile=${IPYTHON_PROFILE} --ipython-dir=${IPYTHON_DIR} --IPCompleter.use_jedi=False

Focus on this part:

  File "/home/beams/8IDIUSER/.ipython-bluesky/profile_bluesky/startup/spec_support/APS_DM_8IDI.py", line 59, in unix
    stderr = subprocess.PIPE,

which ends in

OSError: [Errno 12] Cannot allocate memory

Looks like a memory leak in the code that calls subprocess.
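
A minimal sketch (not the actual APS_DM_8IDI.py code) of running the command with subprocess as a context manager, so its pipes are released when the call finishes:

import subprocess

def unix(command):
    # run a shell command; the context manager closes the stdout/stderr pipes on exit
    with subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    ) as process:
        out, err = process.communicate()
    return out, err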

prjemian commented 4 years ago

I'll call this a success (no interruptions due to unplanned timeouts or stalls) for the bluesky aspect, but not a success for the DM workflow as called from the bluesky code.

prjemian commented 4 years ago

Can disable BestEffortCallback (LivePlot) plots with:

    bec.disable_plots()

and re-enable at end of plan with

    bec.enable_plots()
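
For example, a long-running plan could bracket those calls like this (a sketch; one_iteration() is a hypothetical per-loop plan and bec is the BestEffortCallback instance from the standard startup):

def long_plan_without_plots(num_iter):
    # suppress LivePlot figures while the loop runs, restore them afterwards
    bec.disable_plots()
    try:
        for _ in range(num_iter):
            yield from one_iteration()  # hypothetical per-loop acquisition plan
    finally:
        bec.enable_plots()
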
prjemian commented 4 years ago

Running again (restart session) without changing code in spec_support/APS_DM_8IDI.py

prjemian commented 4 years ago

Note from the OSError above that the bluesky session was Killed.

prjemian commented 4 years ago

The previous plan ended at 4:16 AM; the new plan started at 8:08 AM. Here is the output from camonitor 8idi:Reg173 (changes in the ARun number PV):

8idi:Reg173                    2020-01-29 04:16:48.839393 924
8idi:Reg173                    2020-01-29 08:08:28.680734 925
8idi:Reg173                    2020-01-29 08:08:39.697400 926
prjemian commented 4 years ago

Stopped with

TimeoutError: Attempted to set EpicsSignalWithRBV(read_pv='8LAMBDA1:IMMout:Capture_RBV', name='lambdadet_immout_capture', parent='lambdadet_immout', value=0, timestamp=1580307915.07393, auto_monitor=False, string=False, write_pv='8LAMBDA1:IMMout:Capture', limits=False, put_complete=False) to value 1 and timed out after 10 seconds. Current value is 0.

at this ARun number:

8idi:Reg173                    2020-01-29 08:25:14.848393 1016
8idi:Reg173                    2020-01-29 08:25:15.398393 1017
prjemian commented 4 years ago

Something wrong with the Lambda now?

prjemian commented 4 years ago

One short-term workaround (for the OSError) is to use the same API as we use for SPEC: set the signal PV and wait for it to be reset.
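
Roughly, that handshake would look like this sketch (the PV name is illustrative only; the real handshake signal is whatever SPEC already uses):

import time
from ophyd import EpicsSignal

workflow_go = EpicsSignal("8idi:Reg998", name="workflow_go")  # illustrative PV

def kickoff_and_wait(timeout=300, poll_period=1):
    # set the signal, then wait for the remote process to reset it to 0
    workflow_go.put(1)
    deadline = time.time() + timeout
    while workflow_go.get() != 0:
        if time.time() > deadline:
            raise TimeoutError("workflow signal was not reset")
        time.sleep(poll_period)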

prjemian commented 4 years ago

For today, implement the workaround. That gives time to investigate why the OSError exception occurred.

sureshnaps commented 4 years ago

Pete, not sure what is wrong with the Lambda; it was stuck in the collecting mode, ioc was still alive.

lambda% echo $LAMBDA_IOC_PATH
/local/epics/synApps_5_8/support/areaDetector-R3-2/ADLambda/iocs/LambdaIOC/iocBoot/iocLambda/
lambda%

autosave is in the above folder. The file below seems to have the right permissions.

save_restore:myFileCopy: Can't open file './autosave/auto_settings.sav0'
save_restore:do_seq: Wrote seq. file from PV list. [200129-101043]
save_restore:do_seq - Can't copy save file to './autosave/auto_settings.sav1' [200129-101143]
save_restore:myFileCopy: Can't open file './autosave/auto_settings.sav1'
save_restore:do_seq: Wrote seq. file from PV list. [200129-101143]
save_restore:do_seq - Can't copy save file to './autosave/auto_settings.sav2' [200129-101243]
save_restore:myFileCopy: Can't open file './autosave/auto_settings.sav2'
save_restore:do_seq: Wrote seq. file from PV list. [200129-101243]
save_restore:do_seq - Can't copy save file to './autosave/auto_settings.sav0' [200129-101343]
save_restore:myFileCopy: Can't open file './autosave/auto_settings.sav0'
save_restore:do_seq: Wrote seq. file from PV list. [200129-101343]

prjemian commented 4 years ago

@sureshnaps says

it was stuck in the collecting mode, ioc was still alive

Seems there is still something to resolve with the Lambda, but it is a transient problem, so it is tougher to force it to happen.

prjemian commented 4 years ago

after e256fc9 in #130, restarted:

(bluesky) 8idiuser@quartz ~/local_macros $ camonitor 8idi:Reg173
8idi:Reg173                    2020-01-29 11:37:34.697587 1019  

but it stopped right away with

TimeoutError: Attempted to set EpicsSignalWithRBV(read_pv='8LAMBDA1:IMMout:Capture_RBV', name='lambdadet_immout_capture', parent='lambdadet_immout', value=0, timestamp=1580319454.297906, auto_monitor=False, string=False, write_pv='8LAMBDA1:IMMout:Capture', limits=False, put_complete=False) to value 1 and timed out after 10 seconds. Current value is 0.
prjemian commented 4 years ago

Lambda detector needs some attention: the cam plugin has Acquire=1, the IMMout plugin has Capture=0.

prjemian commented 4 years ago

Cam plugin says no (0) images complete. Looking at the Lambda IOC's console, I find this trouble:

2020/01/29 11:43:56.532 Lambda:writeInt32 Setting TriggerMode 0
2020/01/29 11:43:56.534 Lambda:writeInt32 Setting TriggerMode 0
2020/01/29 11:43:56.535 Entering Lambda:writeFloat64
2020/01/29 11:43:56.537 Entering Lambda:writeFloat64
Setting Acquire Period
mkdir umask 777 chmod ret 0
 Created Dir /home/8-id-i/2020-1/bluesky/A1020
IMM Recurse path: statis = 0
Error- Could not open IMM File
2020/01/29 11:43:56.784 NDPluginFile::openFileBase Error opening file /home/8-id-i/2020-1/bluesky/A1020/A1020.imm, status=3
save_restore:write_save_file: Backup file (./autosave/auto_settings.savB) bad or not found.  Writing a new one. [200129-114356]
save_restore:write_it - unable to open file './autosave/auto_settings.savB' [200129-114356]
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
save_restore:write_save_file: Can't write new backup file. [200129-114356]
../save_restore.c(1704): [0x18]=write_it:Too many open files

Will an IOC reboot fix this?

prjemian commented 4 years ago

That same message from autosave is repeated periodically.

prjemian commented 4 years ago

Rebooting the lambdadet IOC, then will prime the plugins with one image so bluesky can resume (restarting the bluesky session).

prjemian commented 4 years ago

after IOC is rebooted, restart scanning...

(bluesky) 8idiuser@quartz ~/local_macros $ camonitor 8idi:Reg173
8idi:Reg173                    2020-01-29 12:03:18.841498 1022  
8idi:Reg173                    2020-01-29 12:03:30.024831 1023  
prjemian commented 4 years ago

checkpoint, still scanning, ...

8idi:Reg173                    2020-01-29 12:28:01.483096 1156  
8idi:Reg173                    2020-01-29 12:28:12.501431 1157  
8idi:Reg173                    2020-01-29 12:28:23.434764 1158  
8idi:Reg173                    2020-01-29 12:28:34.451430 1159  
8idi:Reg173                    2020-01-29 12:28:45.534763 1160  
prjemian commented 4 years ago

re: https://github.com/aps-8id-dys/ipython-8idiuser/issues/127#issuecomment-579880616

Not finding the text Too many open files in the EPICS synApps autosave repo. It might have been removed some time ago. Need to check the EPICS version of the IOC code: EPICS 3.14.?.?, and what version of autosave corresponds?

timmmooney commented 4 years ago

"Too many open files" is the string corresponding to errno after the file write failed. It's printed by this line in dbrestore.c: errlogPrintf("%s(%d): [0x%x]=%s:%s\n", file, line, errno, s, strerror(errno));

I don't have enough information to guess why there might have been too many open files.

prjemian commented 4 years ago

checkpoint at 15:35, scan stopped a couple hours ago. Last update to ARun number:

8idi:Reg173                    2020-01-29 13:22:31.640474 1452
prjemian commented 4 years ago

(edited by PRJ)

OSError:

OSError: Unable to create file (unable to open file: name = '/home/8-id-i/2020-1/bluesky/A1452/A1452_0001-0100.hdf', errno = 2, error message = 'No such file or directory', flags = 15, o_flags = c2)
full exception trace

```
D Wed-13:22:31 - after count()
HDF5 workflow file name: /home/8-id-i/2020-1/bluesky/A1451/A1451_0001-0100.hdf
I Wed-13:22:31 - creating HDF5 file /home/8-id-i/2020-1/bluesky/A1451/A1451_0001-0100.hdf
I Wed-13:22:31 - DM workflow kickoff starting: analysis:True  file:/home/8-id-i/2020-1/bluesky/A1451/A1451_0001-0100.hdf
I Wed-13:22:31 - self.QMAP_FOLDER_PATH=/home/8-id-i/partitionMapLibrary/2020-1
I Wed-13:22:31 - self.XPCS_QMAP_FILENAME=Lambda_qmap.h5
I Wed-13:22:31 - DM Workflow call is made for XPCS Analysis: /home/8-id-i/2020-1/bluesky/A1451/A1451_0001-0100.hdf,  /home/8-id-i/partitionMapLibrary/2020-1/Lambda_qmap.h5  ----2020-01-29 13:22:31.630780
I Wed-13:22:31 - AD_Acquire starting
I Wed-13:22:31 - file_path = /home/8-id-i/2020-1/bluesky/A1452/
I Wed-13:22:31 - Shutter will *REMAIN OPEN THROUGH* the Acquisition...
I Wed-13:22:31 - calling full_acquire_procedure()
D Wed-13:22:31 - before update_metadata_prescan()
I Wed-13:22:31 - detNum=25, det_pars={'ccdHardwareRowSize': 516, 'ccdHardwareColSize': 1556, 'ccdxsense': 1, 'ccdzsense': 1, 'harmonic': 1, 'dpix': 0.055, 'saturation': 4095, 'adupphot': 1.0, 'efficiency': 1.0, 'gain': 1, 'blemish': 1, 'flatfield': 1, 'distortion': 0, 'parasitic': 0, 'lld': 0, 'manufacturer': 'LAMBDA'}
I Wed-13:22:31 - DM workflow kickoff done
I Wed-13:22:31 - b'id=2306050c-04c3-46fb-b215-f46995d2e3fc owner=8idiuser status=pending startTime=1580325751.81 startTimestamp=2020/01/29 13:22:31 CST \n'
D Wed-13:22:32 - after update_metadata_prescan()
I Wed-13:22:32 - scaler should be autocounting now
D Wed-13:22:32 - before count()
Transient Scan ID: 1823     Time: 2020/01/29 13:22:32
Persistent Unique Scan ID: '7fd1680f-e01c-4538-b6b5-3a7a7fa88573'
New stream: 'baseline'
Start-of-run baseline readings:
+--------------------------------+--------------------------------+
+--------------------------------+--------------------------------+
New stream: 'Timebase_monitor'
New stream: 'pind1_monitor'
New stream: 'pind2_monitor'
New stream: 'Atten1_monitor'
New stream: 'Atten2_monitor'
New stream: 'lakeshore_loop1_temperature_monitor'
New stream: 'lakeshore_loop1_target_monitor'
I Wed-13:22:32 - full_name: /home/8-id-i/2020-1/bluesky/A1452/A1452_00001-00100.imm
closure() method endss. No progress bar available.]
cam.acquire.value=0
immout.capture.value=0
immout.num_captured.value=100
status=DeviceStatus(device=lambdadet, done=True, success=True)
New stream: 'primary'
+-----------+------------+
|   seq_num |       time |
+-----------+------------+
|         1 | 13:22:43.3 |
+-----------+------------+
generator lambda_test ['7fd1680f'] (scan num: 1823)
End-of-run baseline readings:
+--------------------------------+--------------------------------+
+--------------------------------+--------------------------------+

D Wed-13:22:43 - after count()
HDF5 workflow file name: /home/8-id-i/2020-1/bluesky/A1452/A1452_0001-0100.hdf
I Wed-13:22:43 - creating HDF5 file /home/8-id-i/2020-1/bluesky/A1452/A1452_0001-0100.hdf
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
in
----> 1 RE(lambda_test(10000))
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/run_engine.py in __call__(self, *args, **metadata_kw) 742 # it (unless it is a canceled error) 743 if exc is not None: --> 744 raise exc 745 746 if self._interrupted:
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/run_engine.py in _run(self) 1251 self.log.error("Run aborted") 1252 self.log.error("%r", err) -> 1253 raise err 1254 finally: 1255 # Some done_callbacks may still be alive in other threads.
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/run_engine.py in _run(self) 1138 else: 1139 try: -> 1140 msg = self._plan_stack[-1].send(resp) 1141 # We have exhausted the top generator 1142 except StopIteration:
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in __call__(self, plan) 1292 plan = monitor_during_wrapper(plan, self.monitors) 1293 plan = baseline_wrapper(plan, self.baseline) -> 1294 return (yield from plan)
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in baseline_wrapper(plan, devices, name) 1147 return (yield from plan) 1148 else: -> 1149 return (yield from plan_mutator(plan, insert_baseline)) 1150 1151
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 167 continue 168 else: --> 169 raise ex 170 # if inserting / mutating, put new generator on the stack 171 # and replace the current msg with the first element from the
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 120 ret = result_stack.pop() 121 try: --> 122 msg = plan_stack[-1].send(ret) 123 except StopIteration as e: 124 # discard the exhausted generator
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in monitor_during_wrapper(plan, signals) 788 plan1 = plan_mutator(plan, insert_after_open) 789 plan2 = plan_mutator(plan1, insert_before_close) --> 790 return (yield from plan2) 791 792
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 167 continue 168 else: --> 169 raise ex 170 # if inserting / mutating, put new generator on the stack 171 # and replace the current msg with the first element from the
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 120 ret = result_stack.pop() 121 try: --> 122 msg = plan_stack[-1].send(ret) 123 except StopIteration as e: 124 # discard the exhausted generator
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 167 continue 168 else: --> 169 raise ex 170 # if inserting / mutating, put new generator on the stack 171 # and replace the current msg with the first element from the
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 120 ret = result_stack.pop() 121 try: --> 122 msg = plan_stack[-1].send(ret) 123 except StopIteration as e: 124 # discard the exhausted generator
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in fly_during_wrapper(plan, flyers) 846 plan1 = plan_mutator(plan, insert_after_open) 847 plan2 = plan_mutator(plan1, insert_before_close) --> 848 return (yield from plan2) 849 850
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 167 continue 168 else: --> 169 raise ex 170 # if inserting / mutating, put new generator on the stack 171 # and replace the current msg with the first element from the
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 120 ret = result_stack.pop() 121 try: --> 122 msg = plan_stack[-1].send(ret) 123 except StopIteration as e: 124 # discard the exhausted generator
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 167 continue 168 else: --> 169 raise ex 170 # if inserting / mutating, put new generator on the stack 171 # and replace the current msg with the first element from the
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 120 ret = result_stack.pop() 121 try: --> 122 msg = plan_stack[-1].send(ret) 123 except StopIteration as e: 124 # discard the exhausted generator
~/.ipython-bluesky/user/lambda_testing.py in lambda_test(num_iter) 25 num_images=100, file_name=file_name, 26 submit_xpcs_job=True, ---> 27 atten=None, path=None) 28 29 bec.enable_plots()
~/.ipython-bluesky/profile_bluesky/startup/50-controls.py in AD_Acquire(areadet, acquire_time, acquire_period, num_images, file_name, submit_xpcs_job, atten, path) 1051 1052 logger.info("calling full_acquire_procedure()") -> 1053 return (yield from full_acquire_procedure())
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/utils.py in dec_inner(*inner_args, **inner_kwargs) 1011 plan = gen_func(*inner_args, **inner_kwargs) 1012 plan = wrapper(plan, *args, **kwargs) -> 1013 return (yield from plan) 1014 return dec_inner 1015 return dec
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in stage_wrapper(plan, devices) 937 return (yield from plan) 938 --> 939 return (yield from finalize_wrapper(inner(), unstage_devices())) 940 941
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in finalize_wrapper(plan, final_plan, pause_for_debug) 498 cleanup = True 499 try: --> 500 ret = yield from plan 501 except GeneratorExit: 502 cleanup = False
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in inner() 935 def inner(): 936 yield from stage_devices() --> 937 return (yield from plan) 938 939 return (yield from finalize_wrapper(inner(), unstage_devices()))
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/utils.py in dec_inner(*inner_args, **inner_kwargs) 1011 plan = gen_func(*inner_args, **inner_kwargs) 1012 plan = wrapper(plan, *args, **kwargs) -> 1013 return (yield from plan) 1014 return dec_inner 1015 return dec
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in monitor_during_wrapper(plan, signals) 788 plan1 = plan_mutator(plan, insert_after_open) 789 plan2 = plan_mutator(plan1, insert_before_close) --> 790 return (yield from plan2) 791 792
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 167 continue 168 else: --> 169 raise ex 170 # if inserting / mutating, put new generator on the stack 171 # and replace the current msg with the first element from the
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 120 ret = result_stack.pop() 121 try: --> 122 msg = plan_stack[-1].send(ret) 123 except StopIteration as e: 124 # discard the exhausted generator
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 167 continue 168 else: --> 169 raise ex 170 # if inserting / mutating, put new generator on the stack 171 # and replace the current msg with the first element from the
~/.conda/envs/bluesky/lib/python3.7/site-packages/bluesky/preprocessors.py in plan_mutator(plan, msg_proc) 120 ret = result_stack.pop() 121 try: --> 122 msg = plan_stack[-1].send(ret) 123 except StopIteration as e: 124 # discard the exhausted generator
~/.ipython-bluesky/profile_bluesky/startup/50-controls.py in full_acquire_procedure() 1029 print(f"HDF5 workflow file name: {hdf_with_fullpath}") 1030 -> 1031 dm_workflow.create_hdf5_file(hdf_with_fullpath) 1032 1033 # update these str values from the string registers
~/.ipython-bluesky/profile_bluesky/startup/spec_support/APS_DM_8IDI.py in create_hdf5_file(self, filename, **kwargs) 300 301 # any exception here will be handled by caller --> 302 with h5py.File(filename, "w-") as f: 303 # get a version number so we can make changes without breaking client code 304 f.create_dataset("/hdf_metadata_version",
~/.conda/envs/bluesky/lib/python3.7/site-packages/h5py/_hl/files.py in __init__(self, name, mode, driver, libver, userblock_size, swmr, rdcc_nslots, rdcc_nbytes, rdcc_w0, track_order, **kwds) 392 fid = make_fid(name, mode, userblock_size, 393 fapl, fcpl=make_fcpl(track_order=track_order), --> 394 swmr=swmr) 395 396 if swmr_support:
~/.conda/envs/bluesky/lib/python3.7/site-packages/h5py/_hl/files.py in make_fid(name, mode, userblock_size, fapl, fcpl, swmr) 172 fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl) 173 elif mode in ['w-', 'x']: --> 174 fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl) 175 elif mode == 'w': 176 fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/_objects.pyx in h5py._objects.with_phil.wrapper()
h5py/h5f.pyx in h5py.h5f.create()
OSError: Unable to create file (unable to open file: name = '/home/8-id-i/2020-1/bluesky/A1452/A1452_0001-0100.hdf', errno = 2, error message = 'No such file or directory', flags = 15, o_flags = c2)

In [3]: Unexpected problem with CA circuit to server "ioc8idd1.xray.aps.anl.gov:5064" was "Connection reset by peer" - disconnecting
CA.Client.Exception...............................................
    Warning: "Virtual circuit disconnect"
    Context: "ioc8idd1.xray.aps.anl.gov:5064"
    Source File: ../cac.cpp line 1225
    Current Time: Wed Jan 29 2020 14:49:04.490242420
..................................................................
```

prjemian commented 4 years ago

The previous comment was a cut-and-paste by QZ from quartz (8-ID-I control). Output from both A1451 and A1452 was copied. The scan was successful at A1451 but failed at A1452.

prjemian commented 4 years ago

This is similar to the OSError of #130 which was resolved by refactoring the call to subprocess within a context.

But this is different.

prjemian commented 4 years ago

The file or directory was not found. Let's test that.

8idiuser@quartz ~ $ ls /home/8-id-i/2020-1/bluesky/A1452/A1452_0001-0100.hdf
ls: cannot access /home/8-id-i/2020-1/bluesky/A1452/A1452_0001-0100.hdf: No such file or directory
8idiuser@quartz ~ $ ls /home/8-id-i/2020-1/bluesky/A1452/
A1452_00001-00100.imm*
8idiuser@quartz ~ $ df -HT
Filesystem                   Type      Size  Used Avail Use% Mounted on
devtmpfs                     devtmpfs   17G     0   17G   0% /dev
tmpfs                        tmpfs      17G  139M   17G   1% /dev/shm
tmpfs                        tmpfs      17G  2.2M   17G   1% /run
tmpfs                        tmpfs      17G     0   17G   0% /sys/fs/cgroup
/dev/mapper/vg_root-lv_root  xfs        54G   31G   24G  57% /
/dev/nvme0n1p2               xfs       1.1G  406M  659M  39% /boot
/dev/nvme0n1p1               vfat      210M   11M  200M   5% /boot/efi
/dev/mapper/vg_root-lv_local xfs       449G   34G  415G   8% /local
s8iddserv:/usr/local-linux   nfs4      159G  128G   23G  86% /usr/local
tmpfs                        tmpfs     3.4G  4.1k  3.4G   1% /run/user/42
tmpfs                        tmpfs     3.4G  406k  3.4G   1% /run/user/2175
s8iddserv:/export/beams      nfs4       40G   22G   19G  55% /home/beams
s8iddserv:/export/beams10    nfs4      1.1T  431G  597G  42% /home/beams10
s8iddserv:/APSshare          nfs4      317G  208G   95G  69% /net/s8iddserv/APSshare
s8iddserv:/xorApps           nfs4       27G   23G  2.5G  91% /net/s8iddserv/xorApps
s8iddserv:/export/dm         nfs4      106G   14G   87G  14% /home/dm
s8iddata:/export/8-id-i      nfs4       23T   19T  4.9T  79% /home/8-id-i

Right. The file is not there. Why is that? The file system is not full (4.9TB free on /home/8-id-i, plenty). This is the program flow (from the exception trace):

D Wed-13:22:43 - after count()
HDF5 workflow file name: /home/8-id-i/2020-1/bluesky/A1452/A1452_0001-0100.hdf
I Wed-13:22:43 - creating HDF5 file /home/8-id-i/2020-1/bluesky/A1452/A1452_0001-0100.hdf
---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
-> 1031         dm_workflow.create_hdf5_file(hdf_with_fullpath)
--> 302         with h5py.File(filename, "w-") as f:
prjemian commented 4 years ago

The program stopped when the workflow file could not be created by h5py, raising the OSError. @sureshnaps - How should this be handled?

prjemian commented 4 years ago

Meanwhile, repeating the plan. Here is the first acquisition:

I Wed-22:18:50 - full_name: /home/8-id-i/2020-1/bluesky/A1455/A1455_00001-00100.imm
sureshnaps commented 4 years ago

Hi Pete,

Just occurred to me. There is a bug in the Lambda IOC IMM plugin which arose from a confusion between umask and permissions: the IMM plugin changes the umask to 777 after writing the .imm instead of doing chmod 777. A umask of 777 means permissions of rwx 000, so no other file can be written to that directory.

Our normal ops works because we write .imm to /data and not to /home/8-id-I and .hdf is written to 8-id-i. So I would suggest that you resume the ops by writing to /data/2020-1/bl** and that should work.

Suresh
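
To illustrate the umask-versus-chmod confusion in Python terms (the directory name is just an example, not the plugin's code):

import os

path = "/tmp/example_run_dir"  # example only
os.makedirs(path, exist_ok=True)

# intended behavior: make the directory world-writable
os.chmod(path, 0o777)

# reported bug: setting the umask to 777 instead masks off ALL permission
# bits, so anything created afterwards gets mode 000 and later writes fail
previous = os.umask(0o777)
os.umask(previous)  # restore immediately so this sketch has no side effects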

prjemian commented 4 years ago

Good idea. For the morning. Current plan quit:

D Thu-00:01:36 - after count()
HDF5 workflow file name: /home/8-id-i/2020-1/bluesky/A2013/A2013_0001-0100.hdf
I Thu-00:01:36 - creating HDF5 file /home/8-id-i/2020-1/bluesky/A2013/A2013_0001-0100.hdf
I Thu-00:01:36 - DM workflow kickoff starting: analysis:True  file:/home/8-id-i/2020-1/bluesky/A2013/A2013_0001-0100.hdf
I Thu-00:01:36 - self.QMAP_FOLDER_PATH=/home/8-id-i/partitionMapLibrary/2020-1
I Thu-00:01:36 - self.XPCS_QMAP_FILENAME=Lambda_qmap.h5
I Thu-00:01:36 - DM Workflow call is made for XPCS Analysis: /home/8-id-i/2020-1/bluesky/A2013/A2013_0001-0100.hdf,  /home/8-id-i/partitionMapLibrary/2020-1/Lambda_qmap.h5  ----2020-01-30 00:01:36.874928
I Thu-00:01:36 - AD_Acquire starting
I Thu-00:01:36 - file_path = /home/8-id-i/2020-1/bluesky/A2014/
I Thu-00:01:36 - Shutter will *REMAIN OPEN THROUGH* the Acquisition...
I Thu-00:01:36 - calling full_acquire_procedure()
D Thu-00:01:36 - before update_metadata_prescan()
I Thu-00:01:36 - detNum=25, det_pars={'ccdHardwareRowSize': 516, 'ccdHardwareColSize': 1556, 'ccdxsense': 1, 'ccdzsense': 1, 'harmonic': 1, 'dpix': 0.055, 'saturation': 4095, 'adupphot': 1.0, 'efficiency': 1.0, 'gain': 1, 'blemish': 1, 'flatfield': 1, 'distortion': 0, 'parasitic': 0, 'lld': 0, 'manufacturer': 'LAMBDA'}
D Thu-00:01:37 - after update_metadata_prescan()
I Thu-00:01:37 - scaler should be autocounting now
D Thu-00:01:37 - before count()
Transient Scan ID: 2385     Time: 2020/01/30 00:01:37
Persistent Unique Scan ID: 'b888fa23-246f-44ef-9f04-dffea275b9c5'
New stream: 'baseline'
Start-of-run baseline readings:
+--------------------------------+--------------------------------+
+--------------------------------+--------------------------------+
New stream: 'Timebase_monitor'
New stream: 'pind1_monitor'
New stream: 'pind2_monitor'
New stream: 'Atten1_monitor'
New stream: 'Atten2_monitor'
New stream: 'lakeshore_loop1_temperature_monitor'
New stream: 'lakeshore_loop1_target_monitor'
I Thu-00:01:37 - DM workflow kickoff done
I Thu-00:01:37 - b'id=fba4acd6-f374-4dce-8b10-28addf74e2c9 owner=8idiuser status=pending startTime=1580364097.09 startTimestamp=2020/01/30 00:01:37 CST
...
... long trace
...
TimeoutError: Attempted to set EpicsSignalWithRBV(read_pv='8LAMBDA1:IMMout:Capture_RBV', name='lambdadet_immout_capture', parent='lambdadet_immout', value=0, timestamp=1580364096.606609, auto_monitor=False, string=False, write_pv='8LAMBDA1:IMMout:Capture', limits=False, put_complete=False) to value 1 and timed out after 10 seconds. Current value is 0.
prjemian commented 4 years ago

Our normal ops works because we write .imm to /data and not to /home/8-id-I and .hdf is written to 8-id-i. So I would suggest that you resume the ops by writing to /data/2020-1/bl** and that should work.

We need to review the code to make sure it does what you want.

8idiuser@quartz ~ $ ll /data
/bin/ls: cannot access /data: No such file or directory

The partition /data is not visible from workstations snow, bronze, or quartz.

sureshnaps commented 4 years ago

Pete, we had made it all work in the past. We pass the filepath as /data/*** and the .py code transforms the path for metadata, replacing /data with /home/8-id-I; the /data path only goes to the AD file path PV.

Suresh

prjemian commented 4 years ago

Ok, I understand. We'll make sure the IMMout plugin's file path starts with /data/ and not /home/8-id-i/. (Also, we need to restart the detector IOC again.)
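
A sketch of the path handling described above (constants are illustrative): the /data form goes only to the area detector file path PV, and the metadata/HDF5 side uses the path with /data replaced by /home/8-id-i.

DETECTOR_PATH_ROOT = "/data"         # path form sent to the AD file path PV
METADATA_PATH_ROOT = "/home/8-id-i"  # path form recorded in metadata / HDF5 file

def metadata_path(detector_path):
    # translate the detector-side path to the form used for metadata
    if detector_path.startswith(DETECTOR_PATH_ROOT):
        return METADATA_PATH_ROOT + detector_path[len(DETECTOR_PATH_ROOT):]
    return detector_path

# example: "/data/2020-1/bluesky/A2016/" -> "/home/8-id-i/2020-1/bluesky/A2016/"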

prjemian commented 4 years ago

next scan stopped with

OSError: Unable to create file (unable to open file: name = '/home/8-id-i/2020-1/bluesky/A2016/A2016_0001-0100.hdf', errno = 2, error message = 'No such file or directory', flags = 15, o_flags = c2)

In [3]: !ls /home/8-id-i/2020-1/bluesky/A2016/A2016_0001-0100.hdf                                                     
ls: cannot access /home/8-id-i/2020-1/bluesky/A2016/A2016_0001-0100.hdf: No such file or directory

In [4]: !ls /home/8-id-i/2020-1/bluesky/A2016/                                                                        
ls: cannot access /home/8-id-i/2020-1/bluesky/A2016/: No such file or directory

directory does not exist

prjemian commented 4 years ago

failed here


~/.ipython-bluesky/profile_bluesky/startup/50-controls.py in full_acquire_procedure()
   1029         print(f"HDF5 workflow file name: {hdf_with_fullpath}")
   1030 
-> 1031         dm_workflow.create_hdf5_file(hdf_with_fullpath)
   1032 
   1033         # update these str values from the string registers

~/.ipython-bluesky/profile_bluesky/startup/spec_support/APS_DM_8IDI.py in create_hdf5_file(self, filename, **kwargs)
    300 
    301         # any exception here will be handled by caller
--> 302         with h5py.File(filename, "w-") as f:

Do we need to make that directory now? Previously, area detector would make it; now we must be certain it exists at this point.
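
One possible guard, sketched here rather than taken from the current create_hdf5_file(), is to create the parent directory before h5py opens the file:

import os
import h5py

def create_hdf5_file_safely(filename):
    # make sure the parent directory exists before h5py tries to create the file
    os.makedirs(os.path.dirname(filename), exist_ok=True)
    with h5py.File(filename, "w-") as f:  # "w-" fails if the file already exists
        f.create_dataset("/hdf_metadata_version", data=1)  # placeholder content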

prjemian commented 4 years ago

scanning again, hope for victory

prjemian commented 4 years ago

checkpoint

8idi:Reg173                    2020-01-30 11:56:49.280213 2017  
8idi:Reg173                    2020-01-30 11:57:00.330213 2018  
8idi:Reg173                    2020-01-30 11:57:11.213545 2019  
...
8idi:Reg173                    2020-01-30 12:09:09.409757 2084  
8idi:Reg173                    2020-01-30 12:09:20.510557 2085  
8idi:Reg173                    2020-01-30 12:09:31.410556 2086  
8idi:Reg173                    2020-01-30 12:09:42.427222 2087  
prjemian commented 4 years ago

Note that the screen output has other content that is not useful and can be turned off:

D Thu-12:12:38 - after count()
HDF5 workflow file name: /home/8-id-i/2020-1/bluesky/A2102/A2102_0001-0100.hdf
I Thu-12:12:38 - creating HDF5 file /home/8-id-i/2020-1/bluesky/A2102/A2102_0001-0100.hdf
I Thu-12:12:38 - DM workflow kickoff starting: analysis:True  file:/home/8-id-i/2020-1/bluesky/A2102/A2102_0001-0100.hdf
I Thu-12:12:38 - self.QMAP_FOLDER_PATH=/home/8-id-i/partitionMapLibrary/2020-1
I Thu-12:12:38 - self.XPCS_QMAP_FILENAME=Lambda_qmap.h5
I Thu-12:12:38 - DM Workflow call is made for XPCS Analysis: /home/8-id-i/2020-1/bluesky/A2102/A2102_0001-0100.hdf,  /home/8-id-i/partitionMapLibrary/2020-1/Lambda_qmap.h5  ----2020-01-30 12:12:38.446747
I Thu-12:12:38 - AD_Acquire starting
I Thu-12:12:38 - file_path = /home/8-id-i/2020-1/bluesky/A2103/
I Thu-12:12:38 - Shutter will *REMAIN OPEN THROUGH* the Acquisition...
I Thu-12:12:38 - calling full_acquire_procedure()
D Thu-12:12:38 - before update_metadata_prescan()
I Thu-12:12:38 - detNum=25, det_pars={'ccdHardwareRowSize': 516, 'ccdHardwareColSize': 1556, 'ccdxsense': 1, 'ccdzsense': 1, 'harmonic': 1, 'dpix': 0.055, 'saturation': 4095, 'adupphot': 1.0, 'efficiency': 1.0, 'gain': 1, 'blemish': 1, 'flatfield': 1, 'distortion': 0, 'parasitic': 0, 'lld': 0, 'manufacturer': 'LAMBDA'}
D Thu-12:12:38 - after update_metadata_prescan()
I Thu-12:12:38 - scaler should be autocounting now
D Thu-12:12:38 - before count()
Transient Scan ID: 2474     Time: 2020/01/30 12:12:38
Persistent Unique Scan ID: '7e9b57fe-e90d-4738-ad25-2c704b1159e8'
I Thu-12:12:38 - DM workflow kickoff done
I Thu-12:12:38 - b'id=2b821519-9718-4510-8f29-7bb77b0f30b1 owner=8idiuser status=pending startTime=1580407958.62 startTimestamp=2020/01/30 12:12:38 CST \n'
New stream: 'baseline'
Start-of-run baseline readings:
+--------------------------------+--------------------------------+
+--------------------------------+--------------------------------+
New stream: 'Timebase_monitor'
New stream: 'pind1_monitor'
New stream: 'pind2_monitor'
New stream: 'Atten1_monitor'
New stream: 'Atten2_monitor'
New stream: 'lakeshore_loop1_temperature_monitor'
New stream: 'lakeshore_loop1_target_monitor'
I Thu-12:12:38 - full_name: /home/8-id-i/2020-1/bluesky/A2103/A2103_00001-00100.imm
closure() method endss. No progress bar available.]                                                                   
cam.acquire.value=0
immout.capture.value=0
immout.num_captured.value=100
status=DeviceStatus(device=lambdadet, done=True, success=True)                                                        
New stream: 'primary'                                                                                                 
+-----------+------------+
|   seq_num |       time |
+-----------+------------+
|         1 | 12:12:49.2 |
+-----------+------------+
generator lambda_test ['7e9b57fe'] (scan num: 2474)
End-of-run baseline readings:
+--------------------------------+--------------------------------+
+--------------------------------+--------------------------------+

See the table in the documentation for the BestEffortCallback: