campagnola opened 4 years ago
My thoughts:

- A FrameData class that takes care of reformatting the data on-demand (for example, expanding 12-bit packed data or converting from weird colorspaces).

Just wanted to link to this repo; they are struggling to come up with a solution that is both efficient and easy to implement for newcomers.
@aquilesC I see object proxying in that link; can you clarify how that relates to camera data formatting?
Perhaps I should have linked to the proper line. Part of the discussion is in the docstrings of the classes. They struggle with (1) getting fast data transfer rates out of cameras, which is why they implemented shared numpy arrays, and (2) teaching new people in the lab how to use shared memory, so they are trying to find some sort of API for it.
In general I love this approach of using object proxying and shared memory; makes for very clean multiprocessing in Python. I helped write a similar approach for pyacq several years ago where we implemented @samuelgarcia's idea of streams.
Still, I think this belongs in a layer above the device drivers. So long as you can instruct the device driver to write directly into your shared memory buffer, you should be able to achieve good separation of concerns without a performance hit.
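To make that concrete, here is a rough sketch (sizes and names are made up) of handing a driver a numpy view over a Python 3.8 multiprocessing.shared_memory block; fill_frame stands in for whatever driver call copies a frame into a caller-supplied buffer. The driver layer only ever sees a contiguous output array, while the sharing stays in the layer above.

```python
import numpy as np
from multiprocessing import shared_memory

# A shared-memory block owned by the acquisition layer; other processes can
# attach to it by name via shared_memory.SharedMemory(name=shm.name).
H, W = 2048, 2048
shm = shared_memory.SharedMemory(create=True, size=H * W * 2)
frame = np.ndarray((H, W), dtype=np.uint16, buffer=shm.buf)  # zero-copy view

def fill_frame(output):
    """Stand-in for a driver call that writes a frame into the caller's buffer."""
    output[:] = np.random.randint(0, 4096, size=output.shape, dtype=output.dtype)

fill_frame(frame)   # the driver neither knows nor cares that this memory is shared

del frame           # release the view before closing the shared-memory block
shm.close()
shm.unlink()
```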
Hi all. I don't know exactly the purpose of what you are discussing here, but I would be happy to give a presentation of pyacq and the choices we made.

In short, we use proxies (same or different machines) + multiprocessing. For streams we need a very, very flexible concept of a stream. Shared memory is one possible scenario, and Python 3.8 added new shared memory support in the standard library, so it should be easier now. But in pyacq we also have streams that copy data over sockets (zmq), which is very useful across different machines. An important point is also to keep the memory layout flexible (transposed or not), because depending on the needs, every approach must be possible within the same framework.
For instance, with multichannel signals, (channel, time) vs (time, channel) is always a big debate between devs. In pyacq both are possible; numpy strides are very helpful for that.
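To illustrate: with numpy strides, both layouts can be views of the same memory, so the framework never has to commit to one (a tiny sketch, with a made-up channel count and sample rate).

```python
import numpy as np

# one second of 64-channel data at 20 kHz, stored as (time, channel)
sig_tc = np.zeros((20000, 64), dtype=np.float32)

# (channel, time) is just a strided view of the same buffer, no copy involved
sig_ct = sig_tc.T
assert sig_ct.base is sig_tc and not sig_ct.flags['OWNDATA']
print(sig_tc.strides, sig_ct.strides)   # (256, 4) vs (4, 256)
```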
If you are building (or about to build) a package for Python acquisition, pyacq already exists for that. I would be happy to improve/break/refactor everything if it makes other devs happy, to avoid duplicated effort for grabbing data in the Python community.
I am in France but would be happy to have a video call.
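For the socket-based flavour, a bare-bones pyzmq sketch of pushing raw frame buffers between machines (port, hostname and frame size are placeholders; a real stream also has to carry shape/dtype metadata alongside the raw bytes).

```python
import numpy as np
import zmq

# --- producer process (the acquisition machine) ---
ctx = zmq.Context()
push = ctx.socket(zmq.PUSH)
push.bind("tcp://*:5556")                      # port number is arbitrary for this sketch
frame = np.zeros((512, 512), dtype=np.uint16)
push.send(frame, copy=False)                   # ships the array's raw buffer over the socket

# --- consumer process (possibly a different machine) ---
ctx = zmq.Context()
pull = ctx.socket(zmq.PULL)
pull.connect("tcp://acquisition-host:5556")    # hostname is a placeholder
raw = pull.recv()
frame = np.frombuffer(raw, dtype=np.uint16).reshape(512, 512)
```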
I'd think that shared memory, object proxies, streaming etc. reside at a higher level. What we should be aiming for is a structure which doesn't preclude that downstream.
As to duplication of effort - that ship might have already sailed - we already have quite mature streaming support etc. in python-microscopy, but to bake that into a device driver seems a bit unnecessary.
A FrameData object with support for transformations might not be unreasonable, as long as it was lightweight, didn't perform any transformations by default, and permitted simple access to the underlying frame memory (e.g. as a numpy array) without copying. If you are potentially running at several kHz (entirely possible with an ROI on sCMOS) I'd worry a little about the construction/allocation overhead of a FrameData object.
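For the sake of discussion, a sketch of what a suitably lazy FrameData could look like: construction just stashes references (cheap), nothing is transformed unless as_array() is asked for it, and data that is already in a sane format comes back as a zero-copy view. The 12-bit packing scheme shown is just one of several conventions, and all names here are illustrative.

```python
import numpy as np

class FrameData:
    """Illustrative lazy wrapper: reformatting is deferred until the data is requested."""

    def __init__(self, raw, shape, bit_depth=16):
        # constructing this object only stores references, so per-frame overhead is small
        self._raw = raw          # bytes / memoryview / ndarray straight from the driver
        self._shape = shape
        self._bit_depth = bit_depth

    def as_array(self):
        """Return the frame as uint16, unpacking 12-bit packed data only if needed."""
        if self._bit_depth == 16:
            # zero-copy reinterpretation of the raw buffer
            return np.frombuffer(self._raw, dtype=np.uint16).reshape(self._shape)
        if self._bit_depth == 12:
            # 12-bit packed: every 3 bytes hold two pixels (one possible packing scheme);
            # assumes len(raw) is a multiple of 3
            b = np.frombuffer(self._raw, dtype=np.uint8).astype(np.uint16)
            b0, b1, b2 = b[0::3], b[1::3], b[2::3]
            out = np.empty(b0.size * 2, dtype=np.uint16)
            out[0::2] = b0 | ((b1 & 0x0F) << 8)
            out[1::2] = (b1 >> 4) | (b2 << 4)
            return out.reshape(self._shape)
        raise ValueError(f"unsupported bit depth {self._bit_depth}")
```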
It's also a little hard to know how to manage things completely without copying whilst still maintaining expected/sane behaviour for anyone using the data downstream (i.e. if you just supply a reference to a slot in the camera's circular buffer it has an implicit expiry time, after which it gets filled with new data and is no longer valid). Putting such a frame on a queue to be spooled to disk, for example, has obvious potential issues if anything causes your spooling to be temporarily slowed. I'll post a brief description of what we currently use and its strengths and weaknesses in the hope that it's useful for stimulating discussion.
Will also note that we've also played with shared memory arrays (https://github.com/python-microscopy/python-microscopy/blob/master/PYME/util/shmarray/shmarray.py). We've only used them for data analysis, not streaming, and there are a bunch of restrictions on how they can be used, especially on Windows: they are rather sensitive to how processes are forked by multiprocessing and need to be pre-allocated before anything forks. I'm not sure how useful shared memory is in a data acquisition sense though. For me, multiprocessing with shared memory is really good for compute-intensive tasks, but not that helpful when things are limited by IO and memory bandwidth (threading is usually better for IO concurrency, and there is not a lot you can do about memory bandwidth).
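For reference, the pre-allocation constraint looks like this with the standard-library multiprocessing.Array (not PYME's shmarray, whose API differs): the buffer has to exist before the child process is started and is handed over at creation time.

```python
import numpy as np
from multiprocessing import Array, Process

def worker(shared):
    # re-wrap the shared ctypes buffer as a numpy array inside the child process
    a = np.frombuffer(shared, dtype=np.float64)
    a[:100] = 42.0

if __name__ == '__main__':
    # The shared buffer must be allocated *before* the worker process is started;
    # it cannot be created afterwards and retro-fitted into a running child.
    shared = Array('d', 1_000_000, lock=False)   # 'd' = C double, no lock wrapper

    p = Process(target=worker, args=(shared,))
    p.start()
    p.join()

    print(np.frombuffer(shared, dtype=np.float64)[:5])   # the child's writes are visible here
```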
Current PYME camera API (https://github.com/python-microscopy/python-microscopy/blob/master/PYME/Acquire/Hardware/Camera.py). Forgive the horrible method names, which are a fairly gross throwback to legacy code. Anyway, the camera data handling is implemented in two methods: ExpReady(), which can be polled to see if there is data waiting, and ExtractColour(output), which copies the oldest frame from the camera buffer into a numpy array, output, provided by the user. ExtractColour() would be better named something like get_frame_data(output) (it used to do de-bayering as well as just getting data).

How this is handled under the hood varies between cameras, depending on how the underlying API is written. The AndorIxon camera class, for example, passes a pointer to the numpy array (output.ctypes.data) directly to an Andor API function which copies the data from an API-internal circular buffer into the numpy array. The APIs for the sCMOS cameras (Andor and Hamamatsu), however, offload handling of frame buffers to the calling code, so the PYME adapters for these cameras implement their own circular buffers in Python. In this case, the ExtractColour method does a memcpy (not a numpy copy - we call memcpy from the C standard library using ctypes, as it's a lot faster, if super gross) between the camera class's circular buffer and the provided output buffer. In both cases, you need to be a bit careful about how you allocate the 'output' array (it needs to be contiguous, with the right byte order and alignment).
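Put together, the consumer side of that API looks roughly like the following (illustrative only; the dtype, sleep interval and function signature are inferred from the description above, and `cam` is any object exposing the PYME-style methods).

```python
import time
import numpy as np

def grab_frames(cam, n_frames, width, height):
    """Illustrative consumer: poll ExpReady(), then copy each waiting frame out
    with ExtractColour(output)."""
    # The output array is allocated once, up front: contiguous, with the dtype,
    # byte order and alignment the driver expects (uint16 assumed here).
    output = np.empty((height, width), dtype=np.uint16)
    frames = []
    while len(frames) < n_frames:
        if cam.ExpReady():                # frame waiting in the (adapter's) circular buffer?
            cam.ExtractColour(output)     # copies the oldest frame into our buffer
            frames.append(output.copy())  # keep our own copy; `output` is reused next time
        else:
            time.sleep(0.0005)            # don't spin flat out while waiting
    return frames
```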
The good

The bad
- … (could be avoided with something like get_frame_data(), although that also has issues)

The ugly (these are mostly implementation details which are fixable)
Just a note to the above - the current PYME architecture would allow you to pass, e.g., a shared memory array to receive data if you really wanted to.
> It's also a little hard to know how to manage things completely without copying whilst still maintaining expected/sane behaviour for anyone using the data downstream (i.e. if you just supply a reference to a slot in the camera's circular buffer it has an implicit expiry time, after which it gets filled with new data and is no longer valid).
@David-Baddeley in our current prototype, the FrameData class can point to any array-like (numpy, shared memory, mmap) and would raise an exception if its data had been overwritten before access. There are a couple of tweaks that could make it better, but I think it could cover any of the cases you described above with good performance. If you have time, I'd love to hear whether you see anything problematic in that architecture.
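For what it's worth, one way such a check can work (this is a guess at the mechanism, not the prototype's actual code): each ring-buffer slot carries a generation counter that the producer bumps whenever it reuses the slot, and the FrameData remembers the value it saw at creation.

```python
import numpy as np

class FrameOverwrittenError(RuntimeError):
    """The producer recycled this buffer slot before the frame was read."""

class FrameData:
    """Illustrative sketch only: wraps one slot of a ring buffer plus a per-slot
    generation counter owned by the producer."""

    def __init__(self, slot, generations, index):
        self._slot = slot                           # any array-like: numpy, shared memory, mmap
        self._generations = generations             # per-slot write counters, bumped by the producer
        self._index = index
        self._generation = int(generations[index])  # counter value when this frame was handed out

    def data(self):
        # If the producer has written into the slot again, the counter has moved on,
        # so refuse to return the (now overwritten) data.
        if int(self._generations[self._index]) != self._generation:
            raise FrameOverwrittenError("frame was overwritten before it was accessed")
        return np.asarray(self._slot)               # zero-copy view where possible
```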
@David-Baddeley writes: