lancaster-university / codal-microbit-v2

CODAL target for the micro:bit v2.x series of devices
MIT License
43 stars 52 forks source link

Suggest adding `StreamSplitter::pullInto()` to reduce memory allocation when recording samples #395

Closed dpgeorge closed 7 months ago

dpgeorge commented 10 months ago

When recording a stream of audio from the microphone using StreamSplitter, there is potentially an unnecessary allocating of intermediate buffer space, and unnecessary copying of data.

In MicroPython, to record audio one first creates a big buffer to hold the entire sample (eg 5 seconds) and then records into that buffer. It would be convenient and more efficient if CODAL could be told "record into this buffer directly".

With the current architecture of streaming and pullRequest() methods and different consumers with different sample rates, it might be difficult to add the functionality for a zero-copy interface to record samples. So instead I suggest adding a method StreamSplitter::pullInto(uint8_t *buf, size_t len). This is the same as StreamSplitter::pull() except that it uses the given buffer instead of allocating the outData buffer and returning that. This will eliminate a memory allocation for each call to this function, and also eliminate a subsequent memcpy().

JohnVidler commented 10 months ago

I think this is a pretty good suggestion for cases where 'userland' runtimes want more direct C-alike access to memory from a stream. To this end, I plan to roll this in with the changes required to support/fix #394 and #321 and squash all three bugs at once.

Unfortunately I'm on leave for the next week, and will be back in the office on Monday 27th, so I'll pick this up then as a priority.

JohnVidler commented 10 months ago

Related: #365 and #284

JohnVidler commented 10 months ago

Note: Using this issue to track the discussion for implementation to address all the linked issues here.

JohnVidler commented 7 months ago

With the changes currently pending merge to address the sample rate change issues I've also now included a uint8_t * SplitterChannel::pullInto( uint8_t * rawBuffer, int length ) method as part of the SplitterChannel.

This lets you supply an arbitrary block of data, and have the benefits of automatic resampling along with keeping the 'contract' between stream classes to keep the microphone auto-wake functions working. The return pointer is set to the next byte in memory if you're using a large buffer with multiple samples stored sequentially, so by tracking the return between calls you can automatically wander the larger buffer.

If the supplied memory is smaller than required, it will be completely filled and the data will be truncated; if the memory is longer than required, the complete upstream buffer will be copied in and the rest of the memory will be untouched.

Thanks for everyone's patience on this issue, its hit quite a few things at once :)

dpgeorge commented 7 months ago

if the memory is longer than required, the complete upstream buffer will be copied in and the rest of the memory will be untouched.

Testing out this new pullInto() method, I'm not sure the above behaviour is implemented correctly. Passing in a large buffer, it seems to just keep on sampling the input buffer past its end, and only stops once it has filled the given output buffer. This means that you have to request exactly the number of bytes that are available, and it's not really possible to know this.

In the code you can see that the loop to copy the samples from input to output uses while( outPtr - output < length ), and if a buffer is passed in then length is just taken as the passed-in length. The following patch gets it working for me:

--- a/source/streams/StreamSplitter.cpp
+++ b/source/streams/StreamSplitter.cpp
@@ -74,6 +74,10 @@ ManagedBuffer SplitterChannel::resample( ManagedBuffer _in, uint8_t * buffer, in
     if( output == NULL ) {
         output = (uint8_t *)malloc( samplesPerOut * bytesPerSample );
         length = samplesPerOut * bytesPerSample;
+    } else {
+        if (length > samplesPerOut * bytesPerSample) {
+            length = samplesPerOut * bytesPerSample;
+        }
     }

     int oversample_offset = 0;
JohnVidler commented 7 months ago

Testing out this new pullInto() method, I'm not sure the above behaviour is implemented correctly. Passing in a large buffer, it seems to just keep on sampling the input buffer past its end, and only stops once it has filled the given output buffer.

Ah! Good catch! The pullInto() method was less completely tested (I don't have a use case for this inside CODAL itself) so I missed that one. I'll merge your proposed patch into master and can you let me know if everything then works with Python when built against it all? If so, we'll call this the next minor rev with a couple of other smaller merges.

JohnVidler commented 7 months ago

Aaand patched, pushed, tagged and released. Might as well get this out the door ASAP so you have everything you need for Python @dpgeorge 👍

For ref: https://github.com/lancaster-university/codal-microbit-v2/releases/tag/v0.2.66

dpgeorge commented 7 months ago

Thanks for the quick patch and release! I have tested it and it works. I have updated all our code to use the new v0.2.66 release.