microbit-foundation / micropython-microbit-v2

Temporary home for MicroPython for micro:bit v2 as we stabilise it before pushing upstream
MIT License

CODAL heap used for audio recording #156

Closed microbit-carlos closed 3 months ago

microbit-carlos commented 1 year ago

CODAL allocates audio buffers in chunks in the heap during audio recording from the microphone. MicroPython reserves the Python code heap with a 64 KB static array, and configures the MicroPython/CODAL stack to 8 KB.

If I remember correctly, the rest is left for CODAL heap, is that right? I assume the MicroPython core itself (not the codal app port part) doesn't use the CODAL heap, is that correct? In which case, that will mostly be used by CODAL code.

With all that in mind, that might leave us around 32 KB of heap (a complete guesstimate, we should measure this), which is not a lot for recording audio.
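To put that figure in perspective, here is a rough back-of-the-envelope sketch. The 32 KB budget is the guess from above; the 8-bit mono sample format and the 7812 Hz sample rate are assumptions made only for illustration and are not stated in this thread.

```python
def recording_seconds(heap_bytes, sample_rate_hz, bytes_per_sample=1):
    """Seconds of raw audio that fit in heap_bytes, ignoring allocator overhead."""
    return heap_bytes / (sample_rate_hz * bytes_per_sample)


HEAP_BYTES = 32 * 1024   # the guessed CODAL heap size, not a measured value
SAMPLE_RATE_HZ = 7812    # assumed sample rate, 8-bit mono samples

seconds = recording_seconds(HEAP_BYTES, SAMPLE_RATE_HZ)
print(round(seconds, 2))
```

Under those assumptions the whole heap would hold only a few seconds of audio, and less in practice, since CODAL needs part of it for other allocations.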

It's also a bit weird if audio recording doesn't consume "user" memory, as reported by the micropython machine and gc modules.

@dpgeorge @finneyj @JohnVidler what are our options here?

Will we need to somehow allocate recording buffers in the user's Python heap? Is that possible with the current codebase? Will we need to hook custom allocators somewhere?

finneyj commented 1 year ago

Thanks @microbit-carlos. I think your assumptions above are correct, but @dpgeorge is probably the best person to confirm.

I see two options here (it would be great to hear more options if anyone can think of any!)

1) We run the user space CODAL heap inside the MicroPython memory allocator. We already do this for MakeCode, so it is a known working approach. Essentially we override the new/malloc/free calls in CODAL that are made in non-interrupt context and map them to the MicroPython memory allocator equivalents. This then allows us to remove such memory fragmentation. The CODAL heap allocator is interrupt safe though, and we need that in a few places (ADC streaming being one of those), so we then also maintain a small CODAL heap to cover those cases. I can't remember how big this is for MakeCode, but it's small (a few KB).

2) We leave the memory allocators as they are, but move the recording part of the functionality we're talking about over into the micropython side of the fence. This code could then receive the buffers from the CODAL audio stream, and copy them into a buffer allocated from the micropython heap. This is simpler in some ways, and would also work to keep the CODAL heap size small, but is a point solution. We'd also need to reimplement some of the recording and playback code that's already written on the CODAL side (although, it's nothing too complicated). This also adds another copy operation with its associated CPU overhead, which is not ideal, but probably tolerable :)
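The copy step in option 2 can be sketched in plain Python as follows. This is only a model of the idea, not the CODAL or MicroPython port code: `Recorder` and `on_chunk` are hypothetical names, and the incoming chunks stand in for buffers delivered by the CODAL audio stream.

```python
class Recorder:
    """Copies incoming audio chunks into one buffer allocated up front."""

    def __init__(self, capacity):
        # One allocation, made once, standing in for a MicroPython heap buffer.
        self.buf = bytearray(capacity)
        self.view = memoryview(self.buf)
        self.length = 0

    def on_chunk(self, chunk):
        """Copy as much of chunk as fits; return the number of bytes stored."""
        space = len(self.buf) - self.length
        n = min(space, len(chunk))
        self.view[self.length:self.length + n] = chunk[:n]
        self.length += n
        return n


rec = Recorder(8)
first = rec.on_chunk(bytes([1, 2, 3, 4, 5]))    # fits entirely
second = rec.on_chunk(bytes([6, 7, 8, 9, 10]))  # only part of it fits
```

The extra copy per chunk is the CPU overhead mentioned above; in exchange, the recording never grows past the size chosen at construction time.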

Thoughts?

dpgeorge commented 1 year ago

All the assumptions above are correct. MicroPython has a fixed 64kB heap, and does not use the CODAL heap for anything except what CODAL uses "behind the scenes" when its API functions/methods are called.

For the audio buffers: allocating and freeing lots of large-ish buffers on the MicroPython heap could fragment the heap and cause out-of-memory errors even when enough total memory is still free. The way to prevent this is to either (1) have pre-allocated memory for the audio buffers, or (2) have a separate heap which is deterministic and used only for audio buffers. The second option is basically the same as using the CODAL heap.
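The fragmentation concern can be illustrated with a toy first-fit allocator. This is a deliberately simplified model with made-up block counts; `ToyHeap` is a hypothetical class and does not reflect how MicroPython's allocator is actually implemented.

```python
class ToyHeap:
    """First-fit allocator over a fixed number of equal-sized blocks."""

    def __init__(self, n_blocks):
        self.used = [False] * n_blocks

    def alloc(self, n):
        """Return the start index of n contiguous free blocks, or None."""
        run = 0
        for i, in_use in enumerate(self.used):
            run = 0 if in_use else run + 1
            if run == n:
                start = i - n + 1
                for j in range(start, i + 1):
                    self.used[j] = True
                return start
        return None

    def free(self, start, n):
        for j in range(start, start + n):
            self.used[j] = False

    def free_blocks(self):
        return self.used.count(False)


heap = ToyHeap(16)
chunks = [heap.alloc(2) for _ in range(8)]  # fill the heap with 2-block chunks
for start in chunks[::2]:
    heap.free(start, 2)                     # free every other chunk

total_free = heap.free_blocks()  # half of the heap is free again
big = heap.alloc(4)              # but no run of 4 contiguous blocks exists
```

Here half of the heap is free, yet the 4-block request fails because the free space is split into 2-block holes. Pre-allocating the audio buffer once avoids this pattern entirely.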

How much memory is needed for audio buffers? How much does it allocate each time (ie what are the sizes of the allocations)?

dpgeorge commented 3 months ago

I think this issue can be closed. The audio-recording branch is using MicroPython's GC heap to allocate memory for recording data. CODAL does not use much, just some temporary internal buffers. This was made even better with pullInto().