APS-4ID-POLAR / ipython-polar

4-ID-Polar ipython configuration for bluesky (and other)

Eiger setup #130

Closed: gfabbris closed this issue 2 years ago

gfabbris commented 3 years ago

I'm trying to set up the Eiger detector; this issue is mostly to keep track of the progress/issues. Notable points so far:

gfabbris commented 3 years ago

Issues to work on:

  1. Set up the relevant Kind for each signal (see the sketch after the list below).

Figure out how images are saved and the best way to handle them within Bluesky:

  1. I had issues getting the file writer to work (without Bluesky). It seems that images may not be saved if the acquisition is stopped (I'm not sure about this).
  2. Do we have to do one HDF5 file per scan (with multiple images)? Maybe it'd be better to have one file per image, but that would probably duplicate metadata and take more disk space.
  3. How to integrate this file writer into the ophyd FileStore? See apstools.AD_EpicsHdf5FileName.
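
For item 1, a minimal sketch of what assigning Kind could look like; the class, PV prefix, and chosen signals here are illustrative assumptions, not the actual 4-ID configuration:

from ophyd import Component, Kind
from ophyd.areadetector.cam import EigerDetectorCam
from ophyd.areadetector.detectors import DetectorBase

class LocalEigerDetector(DetectorBase):
    cam = Component(EigerDetectorCam, "cam1:")

eiger = LocalEigerDetector("4idEiger:", name="eiger")  # hypothetical PV prefix

# Kind.config signals are recorded once per run as configuration;
# Kind.omitted signals are not recorded at all.
eiger.cam.acquire_time.kind = Kind.config
eiger.cam.num_images.kind = Kind.config
eiger.cam.acquire_period.kind = Kind.omitted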
gfabbris commented 3 years ago

Item 2.

This problem seems to come up only if eiger.cam.num_images_per_file == 1.

gfabbris commented 3 years ago

I did some more tests on item 2, and it seems to be fine with eiger.cam.num_images_per_file == 1. I can't easily reproduce the problem.

gfabbris commented 3 years ago

Do we need the ImagePlugin? (I think not...)

Any other plugins?

gfabbris commented 3 years ago

Some resources

From NSLS2

https://github.com/NSLS-II-CHX/profile_collection/blob/master/startup/20-area-detectors.py

More complex triggers

https://github.com/bluesky/ophyd/blob/d0bc07ef85cf2d69319a059485a21a5cdbd0e677/ophyd/areadetector/trigger_mixins.py#L150 https://github.com/NSLS-II-CHX/profile_collection/blob/de6e03125a16f27f87cbce245bf355a2b783ebdc/startup/20-area-detectors.py#L24

Filestore

The file plugins are not used here (images saved by detector), but these might provide some guidance...

https://github.com/BCDA-APS/apstools/blob/0d3a7a2ca2305bc6a5d32be1def333f14352f07e/apstools/devices.py#L2016

from ophyd.areadetector.plugins import HDF5Plugin
from ophyd.areadetector.filestore_mixins import (
    FileStoreHDF5Single,
    FileStoreHDF5SingleIterativeWrite,
)
from apstools.devices import AD_EpicsHdf5FileName

# Three candidate mixin combinations (only one would be kept):
class LocalHDF5Plugin(HDF5Plugin, FileStoreHDF5Single):
    pass

class LocalHDF5Plugin(HDF5Plugin, FileStoreHDF5SingleIterativeWrite):
    pass

class LocalHDF5Plugin(HDF5Plugin, AD_EpicsHdf5FileName):
    pass
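
If I read the mixins right: FileStoreHDF5Single configures the HDF5 plugin for one image per file, FileStoreHDF5SingleIterativeWrite additionally inserts the datum documents as each frame is acquired rather than at unstage, and apstools' AD_EpicsHdf5FileName keeps the user-supplied (EPICS) file name instead of an ophyd-generated one.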
gfabbris commented 3 years ago

To do (as of 08/06)

  1. Test that the latest changes work.
  2. Implement the file writer (see the NSLS-II CHX profile).
  3. Is there a way to stage/unstage without using set_and_wait and status_wait? (See the stage_sigs sketch below.)
  4. Trim down the configuration_attrs - there are a lot of them now...
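
For item 3, ophyd's built-in staging may already handle the waiting: values placed in a device's stage_sigs are applied in order by stage() and restored by unstage(), with ophyd blocking on each put internally. A minimal sketch (the specific signals and values are illustrative):

eiger.cam.stage_sigs["acquire"] = 0             # make sure the detector is idle
eiger.cam.stage_sigs["image_mode"] = "Multiple"
eiger.cam.stage_sigs["num_images"] = 1          # illustrative value
# stage() records the previous values and applies these in insertion
# order; unstage() restores the originals in reverse order.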
gfabbris commented 3 years ago

The file writer seems to be working alright, but I have issues reading the images from databroker. The problem doesn't seem to be the location of the files, but something to do with the slicerator library (see below). I'm using a modified version of eiger_io.fs_handler_dask.EigerHandlerDask: https://github.com/APS-4ID-POLAR/ipython-polar/blob/333d7bcae8cb87b9990136ecca69332215ad905e/profile_bluesky/startup/instrument/framework/eiger_handler.py#L55

@prjemian: have you seen an error like this? Do you see any potential issue?

In [155]: data = db[-1].primary.to_dask()

In [156]: data
Out[156]: 
<xarray.Dataset>
Dimensions:                                (time: 5, dim_0: 1, dim_1: 514, dim_2: 1030)
Coordinates:
  * time                                   (time) float64 1.629e+09 ... 1.629...
Dimensions without coordinates: dim_0, dim_1, dim_2
Data variables: (12/19)
    eiger_cam_num_images_counter           (time) int64 dask.array<chunksize=(1,), meta=np.ndarray>
    eiger_image                            (time, dim_0, dim_1, dim_2) float64 dask.array<chunksize=(1, 1, 514, 1030), meta=np.ndarray>
    eiger_file_sequence_id                 (time) int64 dask.array<chunksize=(1,), meta=np.ndarray>
    eiger_file_file_path                   (time) <U41 dask.array<chunksize=(1,), meta=np.ndarray>
    eiger_file_file_write_name_pattern     (time) <U7 dask.array<chunksize=(1,), meta=np.ndarray>
    eiger_file_file_write_images_per_file  (time) int64 dask.array<chunksize=(1,), meta=np.ndarray>
    ...                                     ...
    eiger_stats3_max_value                 (time) float64 dask.array<chunksize=(1,), meta=np.ndarray>
    eiger_stats3_min_value                 (time) float64 dask.array<chunksize=(1,), meta=np.ndarray>
    eiger_stats3_total                     (time) float64 dask.array<chunksize=(1,), meta=np.ndarray>
    eiger_stats4_max_value                 (time) float64 dask.array<chunksize=(1,), meta=np.ndarray>
    eiger_stats4_min_value                 (time) float64 dask.array<chunksize=(1,), meta=np.ndarray>
    eiger_stats4_total                     (time) float64 dask.array<chunksize=(1,), meta=np.ndarray>

In [157]: data["eiger_image"]
Out[157]: 
<xarray.DataArray 'eiger_image' (time: 5, dim_0: 1, dim_1: 514, dim_2: 1030)>
dask.array<stack, shape=(5, 1, 514, 1030), dtype=float64, chunksize=(1, 1, 514, 1030), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) float64 1.629e+09 1.629e+09 1.629e+09 1.629e+09 1.629e+09
Dimensions without coordinates: dim_0, dim_1, dim_2
Attributes:
    object:   eiger

In [158]: data["eiger_image"].compute()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-158-b726dcee5a30> in <module>
----> 1 data["eiger_image"].compute()

~/.conda/envs/bluesky_2021_2/lib/python3.8/site-packages/xarray/core/dataarray.py in compute(self, **kwargs)
    953         """
    954         new = self.copy(deep=False)
--> 955         return new.load(**kwargs)
    956 
    957     def persist(self, **kwargs) -> "DataArray":

~/.conda/envs/bluesky_2021_2/lib/python3.8/site-packages/xarray/core/dataarray.py in load(self, **kwargs)
    927         dask.compute
    928         """
--> 929         ds = self._to_temp_dataset().load(**kwargs)
    930         new = self._from_temp_dataset(ds)
    931         self._variable = new._variable

~/.conda/envs/bluesky_2021_2/lib/python3.8/site-packages/xarray/core/dataset.py in load(self, **kwargs)
    863 
    864             # evaluate all the dask arrays simultaneously
--> 865             evaluated_data = da.compute(*lazy_data.values(), **kwargs)
    866 
    867             for k, data in zip(lazy_data, evaluated_data):

~/.conda/envs/bluesky_2021_2/lib/python3.8/site-packages/dask/base.py in compute(*args, **kwargs)
    566         postcomputes.append(x.__dask_postcompute__())
    567 
--> 568     results = schedule(dsk, keys, **kwargs)
    569     return repack([f(r, *a) for r, (f, a) in zip(results, postcomputes)])
    570 

~/.conda/envs/bluesky_2021_2/lib/python3.8/site-packages/dask/threaded.py in get(dsk, result, cache, num_workers, pool, **kwargs)
     77             pool = MultiprocessingPoolExecutor(pool)
     78 
---> 79     results = get_async(
     80         pool.submit,
     81         pool._max_workers,

~/.conda/envs/bluesky_2021_2/lib/python3.8/site-packages/dask/local.py in get_async(submit, num_workers, dsk, result, cache, get_id, rerun_exceptions_locally, pack_exception, raise_exception, callbacks, dumps, loads, chunksize, **kwargs)
    512                             _execute_task(task, data)  # Re-execute locally
    513                         else:
--> 514                             raise_exception(exc, tb)
    515                     res, worker_id = loads(res_info)
    516                     state["cache"][key] = res

~/.conda/envs/bluesky_2021_2/lib/python3.8/site-packages/dask/local.py in reraise(exc, tb)
    323     if exc.__traceback__ is not tb:
    324         raise exc.with_traceback(tb)
--> 325     raise exc
    326 
    327 

~/.conda/envs/bluesky_2021_2/lib/python3.8/site-packages/dask/local.py in execute_task(key, task_info, dumps, loads, get_id, pack_exception)
    221     try:
    222         task, data = loads(task_info)
--> 223         result = _execute_task(task, data)
    224         id = get_id()
    225         result = dumps((result, id))

~/.conda/envs/bluesky_2021_2/lib/python3.8/site-packages/dask/core.py in _execute_task(arg, cache, dsk)
    119         # temporaries by their reference count and can execute certain
    120         # operations in-place.
--> 121         return func(*(_execute_task(a, cache) for a in args))
    122     elif not ishashable(arg):
    123         return arg

~/.conda/envs/bluesky_2021_2/lib/python3.8/site-packages/slicerator/__init__.py in __getitem__(self, i)
    184             def __getitem__(self, i):
    185                 """Getitem supports repeated slicing via Slicerator objects."""
--> 186                 indices, new_length = key_to_indices(i, len(self))
    187                 if new_length is None:
    188                     return self._get(indices)

~/.conda/envs/bluesky_2021_2/lib/python3.8/site-packages/slicerator/__init__.py in key_to_indices(key, length)
    305         else:
    306             # The key is a list of in-range values. Check if they are in range.
--> 307             if any(_k < -length or _k >= length for _k in key):
    308                 raise IndexError("Keys out of range")
    309             rel_indices = ((_k if _k >= 0 else length + _k) for _k in key)

~/.conda/envs/bluesky_2021_2/lib/python3.8/site-packages/slicerator/__init__.py in <genexpr>(.0)
    305         else:
    306             # The key is a list of in-range values. Check if they are in range.
--> 307             if any(_k < -length or _k >= length for _k in key):
    308                 raise IndexError("Keys out of range")
    309             rel_indices = ((_k if _k >= 0 else length + _k) for _k in key)

TypeError: '<' not supported between instances of 'NoneType' and 'int'
prjemian commented 3 years ago

This might not be something slicerator can handle. The root problem (if any(_k < -length or _k >= length for _k in key), line 307 of key_to_indices()) seems to be a None object in key, and that probably ties back into databroker (through dask) trying to find the image. @danielballan might have some insight.

gfabbris commented 3 years ago

Ok, I got this to work! 😄

  1. Found the area_detector_handlers package. The EigerHandler in there seems to be the latest iteration of this handler.
  2. Realized that the datum_kwargs from NSLS2 (seq_id) are not good for us, since they do not track individual images. So I changed them to {'image_num': eiger.num_images_counter.get()} and modified the __call__ in the handler accordingly (see the sketch after this list).
  3. I don't understand why, but Bluesky expects the images with an extra axis. So instead of shape = (514, 1030), it needs to be shape = (1, 514, 1030).
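
A rough sketch of what such a handler could look like, keyed on image_num; the file-name pattern, dataset path, and spec name are assumptions, not the actual code (see the linked eiger_handler.py for the real version):

import h5py
import numpy as np

class EigerImageNumHandler:
    # Assumes the Eiger file writer saves images_per_file frames per data
    # file named like <prefix>_data_000001.h5, with the frames stored in
    # the 'entry/data/data' dataset.
    specs = {"AD_EIGER_IMAGE_NUM"}  # hypothetical spec name

    def __init__(self, fpath, images_per_file):
        self._fpath = fpath
        self._images_per_file = images_per_file

    def __call__(self, image_num):
        # Map the global image counter onto (file sequence, frame-in-file).
        seq_id, frame = divmod(image_num, self._images_per_file)
        fname = f"{self._fpath}_data_{seq_id + 1:06d}.h5"
        with h5py.File(fname, "r") as f:
            data = f["entry/data/data"][frame]
        # Bluesky expects the extra leading axis: (1, 514, 1030).
        return data[np.newaxis, ...]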
prjemian commented 3 years ago
  1. ... extra axis

That comes from AD itself. AD allows for a number of frames per collection, where the additional axis corresponds to the timestamp of the frame. So for a single-frame collection, the extra axis has length 1. See an example in one of the training notebooks.
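
A quick illustration of that convention, using the shapes from the session above:

import numpy as np

# AD image data is (frames_per_collection, rows, cols).
multi = np.zeros((5, 514, 1030))    # five frames in one collection
single = np.zeros((1, 514, 1030))   # single-frame collection: length-1 axis
assert single.shape[0] == 1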