Open microbit-carlos opened 1 year ago
Most often the panic is in isReadOnlyInline() from incr() when refCount == 1, implying a buffer being copied has been destroyed. I have also seen a panic from decr(). https://github.com/lancaster-university/codal-core/blob/master/source/types/RefCounted.cpp#L52
It doesn’t seem to be heap corruption, but that a RefCounted buffer has been released too many times and destroyed.
I have seen a panic in CodalHeapAllocator when a ref counted buffer was being freed. https://github.com/lancaster-university/codal-core/blob/master/source/core/CodalHeapAllocator.cpp#L378
I have seen SerialStreamer::lastBuffer crash when assigning a new pull because it’s previous ref counted object has been freed.
I think one problem is that the buffer operations are not interrupt safe.
I have analysed one example, built with GNU 12.3.1, involving DataStream::nextBuffer, which seems to show DataStream::pull() interrupted at the wrong instant, for a visit to DataStream::pullRequest(), which leads to a double release of DataStream::nextBuffer. https://github.com/lancaster-university/codal-core/blob/master/source/streams/DataStream.cpp#L160 https://github.com/lancaster-university/codal-core/blob/master/source/streams/DataStream.cpp#L193
A panic occurs as DataStream::pull() tries to return tmp, because its ptr has been released and destroyed.
Line numbers above may be different because of added trace code: https://github.com/martinwork/microbit-v2-samples/tree/test-serialstreamer https://github.com/martinwork/codal-core/tree/test-serialstreamer https://github.com/martinwork/codal-microbit-v2/tree/test-refcounted
Without the trace code additions, it can take a lot longer to panic, but it does eventually.
See the log below. As I understand it...
@JohnVidler I didn't have much luck catching it in the debugger, but I added some DMESG to the relevant functions and captured repeated traces like below.
SerialStreamer::pullRequest seems indirectly to pull the nextBuffer from the same DataStream (at 0x200024C8) that demux writes to,
NRF52ADCChannel::demux calls the ADC channel's non-blocking output DataStream::pullRequest to update its nextBuffer.
NRF52ADCChannel::demux 0x200024B0 output 0x200024C8
DataStream::pullRequest 0x200024C8 DataSink 0x20006118 DataSource 0x200024B0 blocking 0
DataStream::pullRequest 0x200024C8 setting nextBuffer
NRF52ADCChannel::demux EXIT
Next StreamSplitter::pullRequest routes down through SplitterChannel and StreamNormalizer to SerialStreamer::pullRequest.
StreamSplitter::pullRequest
SplitterChannel::pullRequest 0x2000616C output 0x2000618C
StreamNormalizer::pullRequest output 0x200061B0
DataStream::pullRequest 0x200061B0 DataSink 0x20006FC8 DataSource 0x20006190 blocking 1
SerialStreamer::pullRequest
SerialStreamer::pullRequest calls pull on StreamNormalizer::output (a blocking DataStream) which routes via StreamNormalizer, SplitterChannel, and StreamSplitter::getBuffer to the ADC channel's output DataStream:pull, which returns the nextBuffer.
DataStream::pull 0x200061B0 DataSink 0x20006FC8 DataSource 0x20006190 blocking 1
DataStream::pull 0x200061B0 calling upStream->pull()
StreamNormalizer::pull upstream 0x2000616C
SplitterChannel::pull 0x2000616C
StreamSplitter::getBuffer upstream 0x200024C8
DataStream::pull 0x200024C8 DataSink 0x20006118 DataSource 0x200024B0 blocking 0
DataStream::pull 0x200024C8 returning nextBuffer
SplitterChannel::pullRequest 0x200061D8 output 0x00000000
StreamSplitter::pullRequest EXIT
This programme runs for about 1 min 30 s and then we get a MICROBIT_HEAP_ERROR 030 error.