colinoflynn / pico-python

PicoScope Python Interface

Bulk Handling #23

Open hmaarrfk opened 10 years ago

hmaarrfk commented 10 years ago

I added stub functions for many things in the ps6000 programmer guide.

It would be nice if we could support Bulk or Streaming options.

Bulk handling basically requires us to be very careful with memory management. It only makes sense for Bulk to ask the user for the numpy array to use.

For this to be possible, we need to check a few of the numpy array flags, like CONTIGUOUS.

Use case where this is not true:

User creates an array
User slices array
User passes sliced array hoping to store information in a sliced fashion

We don't have to check this in our current functions because we always create an empty array ourselves.
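A minimal sketch of the contiguity check described above (the helper name `check_buffer` is hypothetical, not part of pico-python):

```python
import numpy as np

def check_buffer(arr):
    """Hypothetical helper: the driver writes raw samples straight into
    the array's memory, so the buffer must be one contiguous block."""
    if not isinstance(arr, np.ndarray):
        raise TypeError("buffer must be a numpy array")
    if not arr.flags['C_CONTIGUOUS']:
        raise ValueError("buffer must be C-contiguous; "
                         "strided slices are not supported")
    return arr

a = np.zeros(10, dtype=np.int16)
check_buffer(a)              # fine: fresh allocations are contiguous
try:
    check_buffer(a[::2])     # every-other-element slice: not contiguous
except ValueError:
    print("sliced buffer rejected")
```

Note that a plain 1-D slice like `a[2:5]` is still contiguous; it's the strided case (`a[::2]`) that would corrupt a driver write.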

I might actually have to use this soon so it might be implemented in the near future. But I just wanted to document the possible bug.

flothesof commented 8 years ago

Hi Mark,

just came across this issue you opened. Is it actually already possible, with what is implemented, to use the Rapid Block Mode described in my Pico 3404 manual? I'm using normal block mode but missing some triggers, so I was wondering if there was any straightforward way of programming the pico using pico-python that would maximize acquisition speed.

Thanks for your help. Best regards, Florian

hmaarrfk commented 8 years ago

Hey Florian.

I don't think there is anything stopping you from using rapid block mode. You will have to call the low-level functions yourself. If you import

from ctypes import byref, POINTER, create_string_buffer, c_float, \
    c_int16, c_int32, c_uint16, c_uint32, c_void_p
from ctypes import c_int32 as c_enum

you should be able to call the C library functions yourself using something like

your_device.lib.ps3000aGetUnitInfo( c_XXXX(param1), c_XXXX(param2))

I haven't had time to test it though.

You will really have to watch out for the array being contiguous. Have a look at https://github.com/colinoflynn/pico-python/blob/master/picoscope/picobase.py#L480 where I set up the numpy array, then tell the C library to write to it. To my knowledge, numpy works by allocating a chunk of memory for the array and adding a bit of overhead for useful things like size and rapid indexing. Therefore the initially allocated memory is readily compatible with C arrays.
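The numpy-to-C handoff described above can be demonstrated without the Pico driver; here `ctypes.memmove` stands in for the driver writing samples into the buffer:

```python
import ctypes
import numpy as np

# Allocate the buffer ourselves, as picobase does, so the "driver"
# (simulated below) can write straight into it.
data = np.empty(4, dtype=np.int16)

# numpy exposes the raw buffer; a pointer like this is what would be
# handed to a SetDataBuffer-style driver call in the real code.
ptr = data.ctypes.data_as(ctypes.POINTER(ctypes.c_int16))

# Simulate the C library filling the buffer with four samples.
samples = (ctypes.c_int16 * 4)(10, 20, 30, 40)
ctypes.memmove(ptr, samples, ctypes.sizeof(samples))

print(data)  # the numpy array now holds the "driver"-written samples
```

This works precisely because a freshly allocated numpy array is one contiguous block, as noted above.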

I remember there being a few flow diagrams in the documentation that describe the functions that need to be called. They should help you get started.

The reason I didn't work too hard on this problem is that RapidBlockMode is basically a guaranteed way to get a segfault. Also, the particular array used is going to be application dependent and might not be a numpy array. That is why I left it to those who really need the speed to implement it themselves.

Segfault example

Allocate array
setup rapid block mode
Delete array
rapid block mode writes to invalid memory

I think the way to implement it would be to set up flags, and automatically stop rapid block mode before the array is deleted.
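One way to sketch that idea without driver calls: the session object keeps its own reference to the buffer, so deleting the user's name for the array cannot free the memory while acquisition may still write to it (class and method names here are hypothetical):

```python
import numpy as np

class RapidBlockSession:
    """Sketch only: a real version would wrap the driver's runBlock/Stop
    calls. The point is that self._data pins the memory alive."""

    def __init__(self, nsamples):
        self._data = np.empty(nsamples, dtype=np.int16)
        self._armed = False

    def arm(self):
        # real code would call _lowLevelSetDataBuffer / run here
        self._armed = True
        return self._data  # the user gets a handle, but we keep ours

    def stop(self):
        # real code would call the driver's Stop before releasing memory
        self._armed = False

buf = RapidBlockSession(1024)
view = buf.arm()
del view                       # user "deletes" their array...
print(buf._data is not None)   # ...but the session still owns the memory
buf.stop()
```

This doesn't solve the destructor problem described above, but it removes the most common path to the segfault.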

If you get it working and have time to create a short example script, I'll include the example in the code base. If you can figure out how to change the array destructor (maybe via flags) so that rapid block mode stops first, we can include rapid block mode in PicoBase so that others can use it easily.

Good luck!

hmaarrfk commented 8 years ago

The only thing the functions at https://github.com/colinoflynn/pico-python/blob/master/picoscope/ps6000.py#L483 do is convert between Pythonic types and ctypes.

You should be able to use them in a straightforward manner. I don't know about the ps3000, but I imagine they are similar.

morgatron commented 7 years ago

I had this up and running a year or so ago (got a bit side-tracked by life in the meantime). Based on my experience, I think the segfault risk isn't actually too bad, as long as the code doing rapid-block-mode work keeps a reference to the array independent of the user's, so the Python garbage collector won't delete it even if the user does. But in any case, there's only a risk if the user is asked to supply the raw memory. If the user instead just supplies the array shape, it's easy enough to allocate it and return a view that they can't delete. Further, views of numpy arrays have a 'writeable' flag that can be unset to make them read-only, so the user can also be prevented from confusing themselves by writing to active memory.
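The read-only-view trick mentioned above looks something like this (a sketch with a plain array standing in for driver-owned memory):

```python
import numpy as np

backing = np.zeros(8, dtype=np.int16)  # memory the driver writes into

# Hand the user a read-only view: they can't free the backing array
# (we still hold a reference) and can't scribble over live memory.
user_view = backing[:]
user_view.flags.writeable = False

try:
    user_view[0] = 1
except ValueError as e:
    print("write blocked:", e)

backing[0] = 42        # the owner can still update the memory...
print(user_view[0])    # ...and the user sees the new value: 42
```

The view shares the backing array's memory, so the user sees fresh data with zero copying while staying unable to corrupt it.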

One step further still, I used to use a little pyZMQ based server whose job was to read, pre-process, and 'publish' the data so that several other programs could access the stream independently. It seemed to work quite well with a ~10 MS/s stream read by 3 or 4 processes.

I've got what I think is some mostly robust code along the lines of the above, which I could clean up if there's interest. Of course, even without the pyZMQ serving thing it's a little more processing than you might have in mind for the picopython base.

hmaarrfk commented 7 years ago

Hi morgatron,

Nice work. Sounds interesting.

How far did you go into changing the picopython files? Could you supply "example" files that augment (or whatever the pythonic word for class extension is) the picopython files with your implementation? I think it would be interesting to keep them in the examples directory we have.

If your "_lowlevel" functions do not directly allocate memory, maybe we can pull in the changes into the main branch.

I think you mentioned in #59 that the processing was actually the bottleneck in your case. I've found in other projects that quite often just copying memory back and forth a few times slowed things down; maybe I wasn't doing it as efficiently as I should have. I honestly don't mind requiring numpy, but somebody would have to make an executive decision on what version to support; my Python is a little rusty.

It might be good to keep your example for reference purposes. I always used to use the provided interface to look at data with my human eyes in real time.

colinoflynn commented 7 years ago

Sounds interesting indeed! Would be great to get support (I have no problem with numpy either... anyone using picopython is probably going to need it for further processing anyway).

If it's going to be a lot of work (or hacking/changes in the main codebase), an option might be to either integrate it under just an example, OR even reference your project within our documentation? At least so someone else who is trying to implement the same (or similar) work doesn't miss that you've already put a ton of effort into this!

Heck even without cleanup we could just point to it from the documentation to avoid the risk of further sidetracking ;-) I know myself I've got loads of partially done code, and not enough time to publish all of it in "proper" projects, but it's good enough for a quick demo which might give someone a starting point.

morgatron commented 7 years ago

Excellent point about lots of partially done code... Nonetheless, I've thought about it a bit more and I suspect it won't be too hard to do properly.

I was looking over a few code scraps last night, and I think I understand most of what 'Morgan of the past' was up to. I had a few overly grand and fancy plans, but I think the simplest, most sensible approach isn't too different from what I've already put on GitHub.

I'm afraid I made it a bit annoying because I just went in and started changing files all over the place. But the important stuff I think is quite minimal: adjust the data allocation methods to store a reference to the array on the picoscope object (as e.g. self._data).

To reiterate, the safety is there for two reasons. First, most users will be happy if the memory allocation is done behind the scenes, as long as they get a numpy array with the data in it when they call the getDataXXX functions. Second, if the user is supplying their own array, which they might do for optimization purposes, we assume they know a little of what they're doing. We'll check for contiguity and store a reference. If the user did the expected thing, that is, made a numpy array and passed it into our function, all is good: even if they delete it in Python land, the reference stored at self._data ensures the memory remains. If they try hard, they could still cause a segfault by using memory not tracked in Python land, e.g. allocating memory with an external C library and using a pointer to it to make the numpy array they pass in. Then if they free it in C land, they'll get a problem. However, I'd argue that someone doing this is knowledgeable enough to know it's dangerous.

With the simple change of storing the reference, I think it's pretty much just thinly wrapping the other lowLevel methods.

So, to get specific, here's my (perhaps over-)simple solution:

  1. Make a new function, "allocateBuffers", that wraps the _lowLevelSetBuffer-type functions, and use it everywhere in place of the lowLevelSetBuffer routines. It either returns a numpy view of the appropriate memory (given the user's desired memory segments) or takes an array from the user, makes sure it's contiguous (raising an exception if not), and sets that as the target memory. In either case, it stores a reference to the array as "self._data".

  2. Call allocateBuffers at the start of every getDataXXX call. This ensures that the user won't have their data overwritten unexpectedly on subsequent calls, unless they're explicitly naming the memory they want it stored in. It has the downside of extra overhead, as it reallocates the memory every time. Other options include returning a copy of the data, using the _lowLevelClearBuffer functions, or requiring the user to call 'allocateBuffers' separately. Not sure what is best here.
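Step 1 above could look roughly like this (all names hypothetical, and the actual _lowLevelSetDataBuffer calls are stubbed out):

```python
import numpy as np

class _Scope:
    """Stand-in for the PicoScope object; only used for this sketch."""
    pass

def allocate_buffers(scope, n_samples, n_segments=1, user_array=None):
    """Sketch of the proposed allocateBuffers.

    Either allocates (n_segments, n_samples) of int16, or accepts a
    user-supplied array after checking contiguity, and stores the
    reference on the scope object so the garbage collector cannot
    free the memory mid-acquisition.
    """
    if user_array is None:
        data = np.empty((n_segments, n_samples), dtype=np.int16)
    else:
        if not user_array.flags['C_CONTIGUOUS']:
            raise ValueError("user-supplied buffer must be contiguous")
        data = user_array
    scope._data = data  # the crucial step: keep our own reference
    # real code would now call the _lowLevelSetDataBuffer routines,
    # one per segment, pointing the driver at this memory
    return data

scope = _Scope()
data = allocate_buffers(scope, 1000, n_segments=4)
print(data.shape)  # (4, 1000)
```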

That's it; rapid block mode should be safe and usable.

Streaming does still remain a bit tricky for most people, though, because writing a callback needs some C knowledge and presumably introduces other ways to cause a segfault. But I think most users can get what they want if we write a simple one for them. So I suggest:

  1. Add a runStreamingSimple function. It takes extra optional arguments for buffer size and a file to save to. It calls allocateBuffers to set up memory as one segment, then calls runStreaming with the desired downsampling and a simple callback that just saves the indices of the latest data received to the picoscope object (i.e. self._strmStartI, self._strmEndI, etc.).
  2. Add another function, getStreamingValuesSimple(), which the user calls in a loop. If the streaming was started with runStreamingSimple, this function uses the saved indices to return the latest data (or an empty array if there is none), and will optionally append it to a file.
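A toy model of the two-function scheme above, with the driver callback simulated by a plain method call (all names here are hypothetical, not the real pico-python API):

```python
import numpy as np

class StreamingSketch:
    """Toy model of runStreamingSimple / getStreamingValuesSimple:
    the callback only records where the newest data landed in the
    ring buffer; the user polls for it from Python."""

    def __init__(self, buf_size=16):
        self._data = np.zeros(buf_size, dtype=np.int16)
        self._strm_start = 0
        self._strm_end = 0

    def _callback(self, start_index, n_samples):
        # in the real driver this runs inside the streaming callback
        self._strm_start = start_index
        self._strm_end = start_index + n_samples

    def get_streaming_values_simple(self):
        chunk = self._data[self._strm_start:self._strm_end].copy()
        self._strm_start = self._strm_end  # mark the data as consumed
        return chunk

s = StreamingSketch()
s._data[4:8] = [1, 2, 3, 4]   # pretend the driver wrote 4 samples
s._callback(4, 4)
print(s.get_streaming_values_simple())   # [1 2 3 4]
print(s.get_streaming_values_simple())   # [] (nothing new yet)
```

Returning a copy keeps the user's chunk valid even after the driver overwrites that region of the ring buffer.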

I think I have most of the above already. If you think it sounds sensible, I'll aim to collect it in a concrete and testable form.


al6675 commented 7 years ago

I am trying to perform streaming access on a PicoScope 4424, after porting the morgatron branch from the Pico 5000a to the Pico 4000 family. After successfully executing the "just_get_connected.py" example, I tried "test_stream_simplest.py". Unfortunately I have had no success so far, since the application crashes after entering the callback function for the first time.

In the (supplied) ps5000a.py file, _getSimpleCallback(self) is defined, which references "streamCallBackRes": the latter is not declared anywhere. How and where should it be declared?

Furthermore, when isolating "streamCallBackRes" in order to debug the callback code, I get an error message, that (after going through the relevant function calls) concludes with: "ValueError: Procedure probably called with not enough arguments (18789124 bytes missing)". I suppose that the missing 18+ megabytes are due to the buffer that has not been read. However it is not clear to me how this should happen.

Could you help on these? Thank you

morgatron commented 7 years ago

Hi al6675, sorry for the slow reply. Did you have any luck, or give up? There are lots of things wrong with my streaming stuff, but I can't immediately see why that example code (minus the ps5000a-specific stuff) shouldn't work for your scope.

Incidentally, that getSimpleCallback method is a bit of a red herring. I had in mind using it in an example, but never finished. It shouldn't ever be called from anywhere.

al6675 commented 7 years ago

Hi, as I was tight on time, I resorted to C, which worked without any issues. If I get the chance I will try again with Python. As far as I can tell, the problem is that after executing the callback, the code does not return to the loop it was executing before and gets lost somehow.

I have also contacted picotech support on this, and will try with their newly released SDK.