Open gmysage opened 6 years ago
Was at FXI with @gmysage diagnosing the problem.
@danielballan, @tacaswell, any ideas?
I am not surprised by this. When you trigger the camera from bluesky, we wait for the underlying area detector IOC to report that the image has been acquired and has gone through all of the AD pipeline processing.
What plugins do you have set up on AD? Turning off the ones you do not need will probably help a bit. When in 'free run' mode you are probably not saving images and have blocking callbacks turned off. For data collection you are probably saving files, and we turn blocking callbacks on (to make sure everything stays synchronized).
You probably want to monitor the motor position and then re-align them in analysis.
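A minimal sketch of what "re-align in analysis" could look like, assuming you end up with one timestamp per image and a separately monitored series of (timestamp, readback) pairs for the motor. The function name and the sample numbers below are hypothetical and only illustrate the idea; linear interpolation is itself an assumption that the motor moves smoothly between readback updates.

```python
import numpy as np


def positions_at(image_times, motor_times, motor_positions):
    """Linearly interpolate the motor readback at each image timestamp."""
    image_times = np.asarray(image_times, dtype=float)
    motor_times = np.asarray(motor_times, dtype=float)
    motor_positions = np.asarray(motor_positions, dtype=float)
    # np.interp requires increasing x values, so sort the monitor samples first
    order = np.argsort(motor_times)
    return np.interp(image_times, motor_times[order], motor_positions[order])


# Synthetic example data (not from the beamline): a motor moving at 10 units/s
motor_times = [0.0, 1.0, 2.0]
motor_positions = [0.0, 10.0, 20.0]
image_times = [0.5, 1.25]

angles = positions_at(image_times, motor_times, motor_positions)
print(angles)
```

Image timestamps that fall outside the monitored range are clamped to the first or last readback by `np.interp`, so it is worth starting the monitor before the first image and stopping it after the last.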
To get a sense of what the timing of the underlying system is, set the Andor up as it would be for data collection (just calling `andor.stage()` is probably enough) and then in one terminal use `camonitor` to watch the acquire bit and use CSS or `caput` to trigger an image acquisition cycle.
I turned off all the plugins except for "Image1, PVA1, PROC1, TRANS1, FileHDF1".
When doing the scan as `RE(count([Andor], 1))`, it still gives a ~0.5 s wait time (0.5 s/image).
See:
RE(count([Andor], 1))
14:27:12.940888 stage -> Andor args: (), kwargs: {}
14:27:13.035984 open_run -> None args: (), kwargs: {'detectors': ['Andor'], 'num_points': 1, 'num_intervals': 0, 'plan_args': {'detectors': ["Manta(prefix='XF:18IDB-BI{Det:Neo}', name='Andor', read_attrs=['hdf5'], configuration_attrs=['cam'])"], 'num': 1}, 'plan_name': 'count', 'hints': {'dimensions': [(('time',), 'primary')]}}
Transient Scan ID: 312 Time: 2018/02/23 14:27:13
Persistent Unique Scan ID: '3bbeda5c-d694-4435-8683-814b30ac8bb0'
14:27:13.270258 checkpoint -> None args: (), kwargs: {}
14:27:13.270577 trigger -> Andor args: (), kwargs: {'group': 'trigger-455549'}
14:27:13.273460 wait -> None args: (), kwargs: {'group': 'trigger-455549'}
14:27:19.798874 create -> None args: (), kwargs: {'name': 'primary'}
14:27:19.798985 read -> Andor args: (), kwargs: {}
14:27:19.809784 save -> None args: (), kwargs: {}
New stream: 'primary'
As I reported previously, if I set the camera to take 100 images per trigger and redo the scan `RE(count([Andor], 1))`, the total wait time is 5.5 s (0.055 s/image):
In [5]: RE(count([Andor], 1))
14:27:12.940888 stage -> Andor args: (), kwargs: {}
14:27:13.035984 open_run -> None args: (), kwargs: {'detectors': ['Andor'], 'num_points': 1, 'num_intervals': 0, 'plan_args': {'detectors': ["Manta(prefix='XF:18IDB-BI{Det:Neo}', name='Andor', read_attrs=['hdf5'], configuration_attrs=['cam'])"], 'num': 1}, 'plan_name': 'count', 'hints': {'dimensions': [(('time',), 'primary')]}}
Transient Scan ID: 312 Time: 2018/02/23 14:27:13
Persistent Unique Scan ID: '3bbeda5c-d694-4435-8683-814b30ac8bb0'
14:27:13.270258 checkpoint -> None args: (), kwargs: {}
14:27:13.270577 trigger -> Andor args: (), kwargs: {'group': 'trigger-455549'}
14:27:13.273460 wait -> None args: (), kwargs: {'group': 'trigger-455549'}
14:27:19.798874 create -> None args: (), kwargs: {'name': 'primary'}
14:27:19.798985 read -> Andor args: (), kwargs: {}
14:27:19.809784 save -> None args: (), kwargs: {}
New stream: 'primary'
For both cases, images are saved. I think for our flyscan purpose, we want to achieve at least 15 images/sec.
Is there a zebra/pizza box involved for the position capture/trigger or is this a software level fly scan implemented in ophyd/bluesky?
@gmysage I strongly suspect that the most costly plugin is also an indispensable one: FileHDF1. It just takes time to write data to disk. To echo @tacaswell, your 17 Hz "continuous mode" measurement is (I assume!) not writing any data, so it's fast.
Thanks for posting those numbers. When I look at them, I am not surprised or concerned. I observe:
Your best bet is to take images in large (~100 image) chunks to get that 10 Hz rate you measured. Do you understand @tacaswell's comment, "You probably want to monitor the motor position and then re-align them in analysis"? Do you need some help implementing that?
Let's get that working, to start.
Looking ahead, getting all the way to 15 Hz will take more effort. You may have to turn off blocking callbacks, which can result in data loss and must be done very carefully. You may also have to use more advanced capture modes on the HDF5 plugin, as CSX does, which requires a recent version of Area Detector itself, something that Kaz says he is working on.
@cowanml This is a software-level fly scan. Let's keep it simple for now and bring in zebra/pizza box only if we have to.
@danielballan @tacaswell I may need some help using camonitor. I can monitor motor movement using e.g. `camonitor XF:18IDB-OP{SSA:1-Ax:Vgap}Mtr.RBV`, but I don't know how to monitor a camera, since I didn't move any motor in the scan `RE(count([Andor], 1))`.
I tried to disable the HDF5 plugin by commenting out the hdf5 component in the camera class:
class Manta(SingleTrigger, AreaDetector):
    image = Cpt(ImagePlugin, 'image1:')
    # stats1 = Cpt(StatsPlugin, 'Stats1:')
    # stats2 = Cpt(StatsPlugin, 'Stats2:')
    # stats3 = Cpt(StatsPlugin, 'Stats3:')
    # stats4 = Cpt(StatsPlugin, 'Stats4:')
    # stats5 = Cpt(StatsPlugin, 'Stats5:')
    trans1 = Cpt(TransformPlugin, 'Trans1:')
    # roi1 = Cpt(ROIPlugin, 'ROI1:')
    # roi2 = Cpt(ROIPlugin, 'ROI2:')
    # roi3 = Cpt(ROIPlugin, 'ROI3:')
    # roi4 = Cpt(ROIPlugin, 'ROI4:')
    proc1 = Cpt(ProcessPlugin, 'Proc1:')
    # hdf5 = Cpt(HDF5PluginWithFileStore,
    #            suffix='HDF1:',
    #            write_path_template='/NSLS2/xf18id1/DATA/detA1/commissioning',
    #            root='/NSLS2/xf18id1/DATA/detA1/',
    #            reg=None)  # placeholder to be set on instance as obj.hdf5.reg

    def stop(self):
        self.hdf5.capture.put(0)
        return super().stop()

    def pause(self):
        self.hdf5.capture.put(0)
        return super().pause()

    def resume(self):
        self.hdf5.capture.put(1)
        return super().resume()

    @property
    def hints(self):
        return {'fields': [self.stats1.total.name,
                           self.stats5.total.name]}

Andor = Manta('XF:18IDB-BI{Det:Neo}', name='Andor')
Andor.stage_sigs['cam.image_mode'] = 0
As a test, after the modification, when I type `Andor.stage()`, the HDF plugin stays 'disabled' in CSS.
Then, when I run the scan, it still gives a large wait time. And if I try to fetch the image data using `img = np.array(list(db[-1].data('Andor_image')))`, it shows no data.
In [1]: RE(count([Andor], 1))
16:15:47.302287 stage -> Andor args: (), kwargs: {}
16:15:48.171506 open_run -> None args: (), kwargs: {'detectors': ['Andor'], 'num_points': 1, 'num_intervals': 0, 'plan_args': {'detectors': ["Manta(prefix='XF:18IDB-BI{Det:Neo}', name='Andor', read_attrs=['hdf5'], configuration_attrs=['cam'])"], 'num': 1}, 'plan_name': 'count', 'hints': {'dimensions': [(('time',), 'primary')]}}
Transient Scan ID: 315 Time: 2018/02/23 16:15:48
Persistent Unique Scan ID: '6e9f0fc1-5945-42e3-89cd-9be8a2105b39'
16:15:48.350687 checkpoint -> None args: (), kwargs: {}
16:15:48.350954 trigger -> Andor args: (), kwargs: {'group': 'trigger-a92fe2'}
16:15:48.362670 wait -> None args: (), kwargs: {'group': 'trigger-a92fe2'}
16:15:48.879306 create -> None args: (), kwargs: {'name': 'primary'}
16:15:48.879459 read -> Andor args: (), kwargs: {}
16:15:49.038213 save -> None args: (), kwargs: {}
New stream: 'primary'
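To read the wait time off a message log like the one above without doing the subtraction by hand, something like the following could be used. It is only a sketch: `wait_duration` is a hypothetical helper, and it assumes the log lines keep the `HH:MM:SS.ffffff <message> -> ...` layout shown in this thread. The three sample lines are copied from the log above.

```python
from datetime import datetime


def wait_duration(log_lines):
    """Seconds between the 'wait' message and the message that follows it."""
    stamps = []
    for line in log_lines:
        parts = line.split()
        if len(parts) < 2:
            continue
        try:
            t = datetime.strptime(parts[0], "%H:%M:%S.%f")
        except ValueError:
            continue  # not a timestamped message line (e.g. "Transient Scan ID: ...")
        stamps.append((t, parts[1]))
    # Find the first 'wait' and measure the gap to the next timestamped message
    for (t0, name), (t1, _) in zip(stamps, stamps[1:]):
        if name == "wait":
            return (t1 - t0).total_seconds()
    return None


log = [
    "16:15:48.350954 trigger -> Andor args: (), kwargs: {'group': 'trigger-a92fe2'}",
    "16:15:48.362670 wait -> None args: (), kwargs: {'group': 'trigger-a92fe2'}",
    "16:15:48.879306 create -> None args: (), kwargs: {'name': 'primary'}",
]
print(wait_duration(log))
```

For this log the gap between `wait` and `create` comes out at a little over half a second, in line with the ~0.5 s figure quoted earlier.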
Dear all, do you have more suggestions? Thanks,
I have done the following to test the time consumed by saving an image. The results suggest that the communication needed to confirm that something was successfully done accounts for a significant amount of overhead.
I ran the following two scripts to test the time to save images, and used camonitor to monitor the PV XF:18IDB-BI{Det:Neo}TIFF1:WriteFile:
1st script:
import subprocess
import time

cap_img = 'XF:18IDB-BI{Det:Neo}TIFF1:WriteFile'
for i in range(10):
    # Fire the write request without waiting for caput to return
    my_save_cmd = 'caput ' + cap_img + ' ' + 'Write'
    subprocess.Popen(my_save_cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    time.sleep(0.02)
For this script, camonitor gives the following results. The time for a single step is about 0.04 s. I have also confirmed that 10 images were saved in my directory. If I reduce the sleep time (e.g. to 0.01), I miss images.
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 13:50:04.809234 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 13:50:04.838784 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 13:50:04.877956 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 13:50:04.915422 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 13:50:04.957302 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 13:50:04.995035 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 13:50:05.036236 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 13:50:05.078135 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 13:50:05.114900 Done
2nd script:
import subprocess

cap_img = 'XF:18IDB-BI{Det:Neo}TIFF1:WriteFile'
for i in range(10):
    my_save_cmd = 'caput ' + cap_img + ' ' + 'Write'
    subprocess.Popen(my_save_cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # Poll the PV until the write is reported as finished
    while True:
        r = subprocess.check_output(['caget', cap_img, '-t']).rstrip()
        r = str(r)[2:-1]
        if r == 'Done':
            break
In this script, I confirm that the previous image has been saved before starting the next one. camonitor gives an average time of 0.12 s per step. Its output follows:
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 14:02:00.613954 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 14:02:17.837061 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 14:02:17.958904 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 14:02:18.084958 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 14:02:18.209414 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 14:02:18.332542 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 14:02:18.449850 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 14:02:18.563026 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 14:02:18.681280 Done
XF:18IDB-BI{Det:Neo}TIFF1:WriteFile 2018-02-26 14:02:18.808993 Done
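The per-step time can be computed directly from camonitor output like the above. The sketch below is not part of the original test; `step_intervals` is a hypothetical helper that assumes the `<PV> <date> <time> <value>` layout shown here. It also assumes the first line is the value camonitor printed when it connected (it is about 17 s older than the rest), so that line is skipped.

```python
from datetime import datetime


def step_intervals(camonitor_lines, skip_first=False):
    """Time differences in seconds between successive camonitor updates."""
    stamps = []
    for line in camonitor_lines:
        parts = line.split()
        # Assumed layout: <PV name> <date> <time> <value>
        stamps.append(datetime.strptime(parts[1] + " " + parts[2], "%Y-%m-%d %H:%M:%S.%f"))
    if skip_first:
        stamps = stamps[1:]  # drop the value reported at connection time
    return [(b - a).total_seconds() for a, b in zip(stamps, stamps[1:])]


pv = "XF:18IDB-BI{Det:Neo}TIFF1:WriteFile"
times = [
    "14:02:00.613954", "14:02:17.837061", "14:02:17.958904", "14:02:18.084958",
    "14:02:18.209414", "14:02:18.332542", "14:02:18.449850", "14:02:18.563026",
    "14:02:18.681280", "14:02:18.808993",
]
lines = [f"{pv} 2018-02-26 {t} Done" for t in times]

intervals = step_intervals(lines, skip_first=True)
mean_step = sum(intervals) / len(intervals)
print(f"mean step: {mean_step:.3f} s over {len(intervals)} intervals")
```

Applied to the timestamps above, the mean comes out at about 0.12 s per step, matching the figure quoted for the second script.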
@gmysage, thanks for the test results. Are you available tomorrow after 5 pm? @danielballan and I can come and try to help you at the beamline.
Thank you. I will meet you tomorrow at 5 pm at the beamline.
It looks like triggering the detector accounts for a large portion of the overhead time. Here is what I have tried (exposure time and acquire period are set to 0.05 s for all tests):
For the Andor camera, I can reach 17 Hz if running in continuous mode in CSS.
If I run the scan `RE(count([Andor], 100))`, it takes 90 sec for 100 images, corresponding to a scanning rate of 1.2 Hz.
If I choose to take multiple images (e.g. 10 images) per trigger and run `RE(count([Andor], 10))`, it gives an image shape of (10, 10, 2160, 2560), effectively 100 images in 17 sec, which means the scanning rate is about 5.9 Hz.
If I choose to take 100 images per trigger and run `RE(count([Andor], 1))`, it gives an image shape of (1, 100, 2160, 2560) and takes about 10 sec, so it is 10 Hz.
The problem with taking multiple images per trigger is that it does not record the motor position (e.g. the rotation stage in a "fly scan").
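One rough way to put a number on the per-trigger cost is to fit the three timings above to a two-term model. This is only a sketch: the assumption that total time is linear in the number of triggers and the number of images is mine, not something measured, and the variable names are illustrative.

```python
import numpy as np

# (number of triggers, images per trigger, total seconds) as quoted above
runs = [(100, 1, 90.0), (10, 10, 17.0), (1, 100, 10.0)]

# Assumed model: total = per_trigger * n_triggers + per_image * n_images
design = np.array(
    [[n_triggers, n_triggers * images_each] for n_triggers, images_each, _ in runs],
    dtype=float,
)
totals = np.array([total for _, _, total in runs])

# Least-squares fit of the two unknown costs to the three measurements
(per_trigger, per_image), *_ = np.linalg.lstsq(design, totals, rcond=None)
print(f"per trigger: {per_trigger:.2f} s, per image: {per_image:.3f} s")
```

With these three numbers the fit gives roughly 0.8 s per trigger and 0.09 s per image, but three points are too few to treat this as more than an order-of-magnitude estimate.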
Based on the observation, it looks like triggering the detector takes a lot of overhead: