igorski / MWEngine

Audio engine and DSP library for Android, written in C++ providing low latency performance within a musical context, while providing a Java/Kotlin API. Supports both OpenSL and AAudio.
MIT License

A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) #142

Closed scar20 closed 2 years ago

scar20 commented 2 years ago

It happens mostly when I stop playing a vector of SampleEvent <32>. I don't know if you have looked at the app link I sent you, but it shows up either when you stop the metronome or when you stop one-shot events (after starting many of them).

Here I assumed that there is no harm in calling event.start() repeatedly, since it only resets the pointers to the start. I also assumed that there is no harm in calling event.stop() on an event that has not been started, since it does nothing in that case. I may be wrong.

In any case, I've built another test app to see whether the metronome (a SingleThreadScheduledExecutor calling itself recursively) was at fault, or something else. That's the purpose of the one shot and one shot stop buttons: they manually duplicate what the metronome does. Once in a while, stopping the events causes a crash. You have to be patient (and do a lot of finger tapping) before it happens, but sure enough, it will, unfortunately... (after 20 min of tapping I thought it was OK, but then boom). As a last resort, I tried a different "for" loop: I was using for(SampleEvent ev : sampleEvents) ev.stop(); and changed it to a standard indexed for loop. It gave me the impression of being more stable, but it still crashes, so it must not be that either. I've looked up the assert error on Google and it looks frightening...

And it can do worse than that: it can totally freeze the device instead of crashing. The device becomes completely unresponsive and I have to hold the on/off button for 10-20 sec to force a restart, but this happens more rarely.

I've put the test app on Github. It is minimal: a selection of two samples, set to one vector of samples set to one SampleInstrument that can be started and stopped either by the metronome or from the one shot button. There is a counter that increments at each event.get(count).start(), modulo the vector size. There is also a reverb switch that attaches or detaches a reverb on the processing chain of the SampleInstrument (because it can induce something that I noticed with another device, but I'll keep that for another thread, for which I'll build another test). Simply ignore it and leave the switch off.

I've collected the end of the stack trace in different situations (stopped by metro off or by manual tapping, from the model or from the test app) but the result always looks almost the same:

E/libc++abi: terminating with uncaught exception of type std::out_of_range: vector
A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 11251 (AAudio_2), pid 11203 (e.mwenginetest5)
D/MWENGINE: MWEngineActivity::onWindowFocusChanged, has focus: false
V/MWENGINE: STOPPING engine
STOPPED engine
D/AAudio: AAudioStream_requestStop(s#3) called


tapping:
E/libc++abi: terminating with uncaught exception of type std::out_of_range: vector
A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 9721 (Thread-3), pid 9661 (mwfonofonemodel)


metro on 1000bpm - let run - then metro stop:
E/libc++abi: terminating with uncaught exception of type std::out_of_range: vector
A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 18245 (AAudio_3), pid 18105 (e.mwenginetest5)
D/MWENGINE: MWEngineActivity::onWindowFocusChanged, has focus: false
V/MWENGINE: STOPPING engine
V/MWENGINE: STOPPED engine
D/AAudio: AAudioStream_requestStop(s#5) called


metro stop:
E/libc++abi: terminating with uncaught exception of type std::out_of_range: vector
A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 26249 (Thread-2), pid 26210 (mwfonofonemodel)


metro stop fori loop ev.stop:
E/libc++abi: terminating with uncaught exception of type std::out_of_range: vector
A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 28846 (AAudio_2), pid 28795 (e.mwenginetest5)


tapping & stop foreach loop ev.stop:
E/libc++abi: terminating with uncaught exception of type std::out_of_range: vector
A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 29744 (Thread-2), pid 29671 (e.mwenginetest5)


Google Pixel3a tapping & stop fori loop ev.stop:
E/libc++abi: terminating with uncaught exception of type std::out_of_range: vector
A/libc: Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 6381 (Thread-2), pid 6324 (e.mwenginetest5)

scar20 commented 2 years ago

I've managed to get more information about the crash from the tombstone. I'm new to those commands, so I don't really know what to look for. Just for the sake of it, I put "i <= maxSampleCount" in the for loop of the stop button to provoke a crash and looked at the output; it was different, very verbose, with plenty of clickable Java method references. So I figured there is nothing wrong with the vector or the counter per se.

Now I wonder, given the info below, whether the fact that I did not call AudioChannel.addLiveEvent(ev); or AudioChannel.setHaveLiveEvent(bool); and AudioChannel.setLiveEvents(<arrayIDontKnowHowToObtain>); has something to do with it. I thought they were designed to work in conjunction with the sequencer, but perhaps they need to be set anyway for things to work properly even without the sequencer.

With the tombstone, we have more information on where the crash occurs. At first glance, it looks like MWEngine is doing something with an array of BaseAudioEvent (line #8) just before hitting out_of_range. Here's the topmost part of the file; I can send the full bugreport.zip if needed. It crashed while switching the metronome on/off:

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'google/bonito/bonito:11/RP1A.201005.004/6782484:user/release-keys'
Revision: 'MP1.0'
ABI: 'arm64'
Timestamp: 2022-01-09 01:40:41-0500
pid: 8956, tid: 8990, name: Thread-2  >>> com.scarette.mwenginetest5 <<<
uid: 10222
signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
Abort message: 'terminating with uncaught exception of type std::out_of_range: vector'
    x0  0000000000000000  x1  000000000000231e  x2  0000000000000006  x3  000000743b932fa0
    x4  fefeff716e736264  x5  fefeff716e736264  x6  fefeff716e736264  x7  7f7f7f7f7f7f7f7f
    x8  00000000000000f0  x9  caeb761a4b114812  x10 0000000000000000  x11 ffffffc0fffffbdf
    x12 0000000000000001  x13 0000000000000018  x14 0000000817031c48  x15 0025e63edb38cf82
    x16 000000773dfd4c80  x17 000000773dfb63b0  x18 000000743b0e4000  x19 00000000000022fc
    x20 000000000000231e  x21 00000000ffffffff  x22 ffffff80ffffffc8  x23 000000743b9331f0
    x24 000000743b9330d0  x25 000000743b933110  x26 00000074a27eccd0  x27 00000000000fc000
    x28 000000743b83b000  x29 000000743b933020
    lr  000000773df69e20  sp  000000743b932f80  pc  000000773df69e4c  pst 0000000000000000

backtrace:
      #00 pc 000000000004de4c  /apex/com.android.runtime/lib64/bionic/libc.so (abort+164) (BuildId: 03452a4a418e14ff93948f26561eace6)
      #01 pc 0000000000167c74  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #02 pc 0000000000167dcc  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #03 pc 0000000000164cbc  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #04 pc 00000000001642e8  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #05 pc 0000000000164244  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (__cxa_throw+112) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #06 pc 00000000000c3c10  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #07 pc 00000000000c3bcc  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (std::__ndk1::__vector_base_common<true>::__throw_out_of_range() const+28) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #08 pc 00000000000ce4d8  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (std::__ndk1::vector<MWEngine::BaseAudioEvent*, std::__ndk1::allocator<MWEngine::BaseAudioEvent*> >::at(unsigned long)+64) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #09 pc 00000000000ce280  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (MWEngine::Sequencer::collectLiveEvents(MWEngine::BaseInstrument*)+128) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #10 pc 00000000000cde50  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (MWEngine::Sequencer::getAudioEvents(std::__ndk1::vector<MWEngine::AudioChannel*, std::__ndk1::allocator<MWEngine::AudioChannel*> >*, int, int, bool, bool)+480) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #11 pc 00000000000c1fa8  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (MWEngine::AudioEngine::render(int)+460) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #12 pc 00000000000d23f4  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (MWEngine::AAudio_IO::dataCallback(MWEngine::AAudio::AAudioStreamStruct*, void*, int)+704) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #13 pc 00000000000d2124  /data/app/~~bAcHqXE1CVyMX6JoQb8sFg==/com.scarette.mwenginetest5-6n3Q4-trsiGqEsQnIIHFiA==/lib/arm64/libmwengine_wrapped.so (MWEngine::dataCallback(MWEngine::AAudio::AAudioStreamStruct*, void*, void*, int)+128) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #14 pc 0000000000022364  /system/lib64/libaaudio_internal.so (aaudio::AudioStream::maybeCallDataCallback(void*, int)+192) (BuildId: 572fc676500947b5f7300082d299c641)
      #15 pc 000000000003476c  /system/lib64/libaaudio_internal.so (aaudio::AudioStreamInternalPlay::callbackLoop()+360) (BuildId: 572fc676500947b5f7300082d299c641)
      #16 pc 000000000001fcf8  /system/lib64/libaaudio_internal.so (aaudio::AudioStream_internalThreadProc(void*) (.cfi)+204) (BuildId: 572fc676500947b5f7300082d299c641)
      #17 pc 00000000000af888  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+64) (BuildId: 03452a4a418e14ff93948f26561eace6)
      #18 pc 000000000004fe08  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: 03452a4a418e14ff93948f26561eace6)
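
For readers following the backtrace: frames #08 and #07 show `std::vector::at()` throwing `std::out_of_range`, and since nothing catches it on the audio thread, the runtime ends up in `abort()` (frame #00), which is what raises SIGABRT. A minimal C++17 sketch of that behaviour, using a plain `std::vector<int>` rather than MWEngine's types (the helper name `readAt` is hypothetical and only exists for this illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of MWEngine: reads values.at(index) and
// reports an out-of-range access as std::nullopt instead of letting the
// exception escape.
std::optional<int> readAt(const std::vector<int>& values, std::size_t index)
{
    try {
        // at() performs a bounds check; operator[] would not.
        return values.at(index);
    } catch (const std::out_of_range&) {
        // Without a handler somewhere up the call stack, the exception
        // reaches std::terminate(), whose default handler calls abort().
        return std::nullopt;
    }
}
```

Catching the exception is only shown here to make the failure visible; it does not address why the index went past the end of the vector in the first place.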
scar20 commented 2 years ago

I just tried with channel.setLiveEvents(_sampler.getEvents()); and channel.setHaveLiveEvent(bool); set. Same result. The tombstone is identical. I've looked at the others as well; it's always at the same place. This is from switching the metro on/off (many, many, many, many times).

Oh, and I hadn't noticed, but the tombstones are not in order, so the previous extract I sent was not the latest. It is still the same app with the same crash, though. Below is the latest one, from the crash I just triggered.

*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'google/bonito/bonito:11/RP1A.201005.004/6782484:user/release-keys'
Revision: 'MP1.0'
ABI: 'arm64'
Timestamp: 2022-01-11 06:17:08-0500
pid: 6580, tid: 6619, name: Thread-2  >>> com.scarette.mwenginetest5 <<<
uid: 10222
signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
Abort message: 'terminating with uncaught exception of type std::out_of_range: vector'
    x0  0000000000000000  x1  00000000000019db  x2  0000000000000006  x3  000000747032cfa0
    x4  fefeff716e736264  x5  fefeff716e736264  x6  fefeff716e736264  x7  7f7f7f7f7f7f7f7f
    x8  00000000000000f0  x9  22cc0fdc350e31d8  x10 0000000000000000  x11 ffffffc0fffffbdf
    x12 0000000000000001  x13 0000000000000018  x14 0000031c17985410  x15 0035925a1cc18fa5
    x16 0000007772b22c80  x17 0000007772b043b0  x18 000000747313a000  x19 00000000000019b4
    x20 00000000000019db  x21 00000000ffffffff  x22 ffffff80ffffffc8  x23 000000747032d1f0
    x24 000000747032d0d0  x25 000000747032d110  x26 00000074d6210cd0  x27 00000000000fc000
    x28 0000007470235000  x29 000000747032d020
    lr  0000007772ab7e20  sp  000000747032cf80  pc  0000007772ab7e4c  pst 0000000000000000

backtrace:
      #00 pc 000000000004de4c  /apex/com.android.runtime/lib64/bionic/libc.so (abort+164) (BuildId: 03452a4a418e14ff93948f26561eace6)
      #01 pc 0000000000167c74  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #02 pc 0000000000167dcc  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #03 pc 0000000000164cbc  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #04 pc 00000000001642e8  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #05 pc 0000000000164244  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (__cxa_throw+112) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #06 pc 00000000000c3c10  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #07 pc 00000000000c3bcc  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (std::__ndk1::__vector_base_common<true>::__throw_out_of_range() const+28) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #08 pc 00000000000ce4d8  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (std::__ndk1::vector<MWEngine::BaseAudioEvent*, std::__ndk1::allocator<MWEngine::BaseAudioEvent*> >::at(unsigned long)+64) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #09 pc 00000000000ce280  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (MWEngine::Sequencer::collectLiveEvents(MWEngine::BaseInstrument*)+128) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #10 pc 00000000000cde50  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (MWEngine::Sequencer::getAudioEvents(std::__ndk1::vector<MWEngine::AudioChannel*, std::__ndk1::allocator<MWEngine::AudioChannel*> >*, int, int, bool, bool)+480) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #11 pc 00000000000c1fa8  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (MWEngine::AudioEngine::render(int)+460) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #12 pc 00000000000d23f4  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (MWEngine::AAudio_IO::dataCallback(MWEngine::AAudio::AAudioStreamStruct*, void*, int)+704) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #13 pc 00000000000d2124  /data/app/~~Dxde2Me9RO5I84k_OVqKBg==/com.scarette.mwenginetest5-PZASxGZAwqpCisnOCPZTww==/lib/arm64/libmwengine_wrapped.so (MWEngine::dataCallback(MWEngine::AAudio::AAudioStreamStruct*, void*, void*, int)+128) (BuildId: 7165ca51eb53022122249fed46197d888ebbb735)
      #14 pc 0000000000022364  /system/lib64/libaaudio_internal.so (aaudio::AudioStream::maybeCallDataCallback(void*, int)+192) (BuildId: 572fc676500947b5f7300082d299c641)
      #15 pc 000000000003476c  /system/lib64/libaaudio_internal.so (aaudio::AudioStreamInternalPlay::callbackLoop()+360) (BuildId: 572fc676500947b5f7300082d299c641)
      #16 pc 000000000001fcf8  /system/lib64/libaaudio_internal.so (aaudio::AudioStream_internalThreadProc(void*) (.cfi)+204) (BuildId: 572fc676500947b5f7300082d299c641)
      #17 pc 00000000000af888  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+64) (BuildId: 03452a4a418e14ff93948f26561eace6)
      #18 pc 000000000004fe08  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: 03452a4a418e14ff93948f26561eace6)
scar20 commented 2 years ago

It looks like it happens in

void Sequencer::collectLiveEvents( BaseInstrument* instrument )
{
    AudioChannel* channel = instrument->audioChannel;

    instrument->toggleReadLock( true ); // lock the events vector while sequencing
    std::vector<BaseAudioEvent*>* liveEvents = instrument->getLiveEvents();

It seems that at some point instrument->getLiveEvents(); gets screwed up. What I would like to do is keep an eye on that vector's size when the metro switches off or the stop button is pressed, which on the Java side only calls event.stop() on all events in the (Java) vector. I don't know how to do that :(

Also, I'm a bit confused about who owns what here: AudioChannel and BaseInstrument both have a liveEvent list. Which one needs to be set (if necessary)? At first I didn't set any. Now I set it in the channel. Should I set it in the instrument instead, or in both?

igorski commented 2 years ago

Your tombstones helped a lot!

Can you pull and merge the latest code, as I committed a possible fix in 9451c4cd50942a520414f1757e4253282588f3c5 ?

It's an intermediate step as you raised a good point: why are there two vectors between the instrument and audio channel ? The reason is that the instrument maintains events that should be played by the instrument (makes sense) and that the audio channel maintains a list of currently audible events that should render their output in the render iteration of the engine (makes sense... but maybe not in the right place...)

scar20 commented 2 years ago

Yes!

I just spent one hour trying to make it crash, tapping in a frenzy on top of the metronome and stopping randomly. It stopped every time. No crash at all! I think we can now assume it's rock solid. I saw that you now check liveEvents->size() inside the loop instead of setting it before the loop, to catch an event.stop() that could occur while the loop is running. That seems to have done the trick. I will have to study this (important) part of the engine to get a better understanding of the render loop. I didn't set any liveEvent this time and just let the engine do its own thing with them, as it seems to work well that way. But let me know if I should.
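
The difference described above (reading `liveEvents->size()` on every pass rather than once before the loop) can be illustrated with a single-threaded sketch. The types and names below (`Event`, `collectWithCachedSize`, `collectWithLiveSize`, and the `onVisit` hook that stands in for a concurrent `stop()`) are assumptions made for illustration, not MWEngine's actual code.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

struct Event { int id; };

using Events = std::vector<Event*>;
// Hook invoked after each visited event; it stands in for another thread
// removing an event (e.g. through stop()) while the loop is running.
using Hook = std::function<void(Events&)>;

// The size is read once, so the bound goes stale if the vector shrinks.
// Returns false when at() throws std::out_of_range.
bool collectWithCachedSize(Events& live, Events& out, const Hook& onVisit)
{
    const std::size_t total = live.size();
    try {
        for (std::size_t i = 0; i < total; ++i) {
            out.push_back(live.at(i));
            onVisit(live);
        }
    } catch (const std::out_of_range&) {
        return false;
    }
    return true;
}

// The size is re-read on every pass, so a shrinking vector ends the loop early.
bool collectWithLiveSize(Events& live, Events& out, const Hook& onVisit)
{
    for (std::size_t i = 0; i < live.size(); ++i) {
        out.push_back(live.at(i));
        onVisit(live);
    }
    return true;
}
```

With a hook that pops one element per visit and three events in the vector, the cached-size loop runs past the end on its third pass, while the live-size loop simply stops after two. Note that re-checking the size only narrows the window for this failure; on its own it does not make concurrent access to a `std::vector` safe.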

Thanks, I wish Google were as fast as you... Bug solved; I'll let you close this thread if there is nothing more to add.

igorski commented 2 years ago

Cool!

I made an additional change though, which I've committed to the repository in commit c0ed3c0f2bb4d29b5ce52a6ce4f67e354d5c6be2

Previously a mutex lock was used to prevent mutations to the event vectors while the sequencer was reading during a render iteration. You've spotted that this was broken for live events. The engine uses the isDeletable()-flag for audio events to indicate that they should be removed from the instrument after the sequencer finishes an iteration.

I made the live and sequenced events disposal routines equal and managed to ditch the thread locking mechanism. You can continue to use stop() or removeFromSequencer() and the events should be cleaned up after the Sequencer has finished a read operation, causing no conflicts.
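
The deferred disposal described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: `Event`, `Instrument`, `markDeletable()`, `render()` and `sweep()` are hypothetical names, and only the idea of an `isDeletable()`-style flag is taken from the comment.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <vector>

struct Event {
    int id = 0;
    std::atomic<bool> deletable{false};

    // Called from the UI thread: only flags the event, never touches the vector.
    void markDeletable() { deletable.store(true); }
    bool isDeletable() const { return deletable.load(); }
};

struct Instrument {
    std::vector<Event*> liveEvents;

    // Audio thread, step 1: read pass. Flagged events are skipped, but the
    // vector itself is not modified while it is being iterated.
    int render() const
    {
        int rendered = 0;
        for (const Event* e : liveEvents)
            if (!e->isDeletable()) ++rendered;
        return rendered;
    }

    // Audio thread, step 2: after the read pass, drop the flagged events.
    void sweep()
    {
        liveEvents.erase(
            std::remove_if(liveEvents.begin(), liveEvents.end(),
                           [](const Event* e) { return e->isDeletable(); }),
            liveEvents.end());
    }
};
```

In this sketch the vector is only ever mutated by the thread that also reads it, which is why no lock is needed for removal. Adding events from another thread would still need its own synchronisation, which the sketch leaves out.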

scar20 commented 2 years ago

I have applied the changes. Tested, but not for an hour... It stops as it should. But... at one point, it froze solid, which is a bit unsettling. I had added more samples to the vector: 48 instead of 32. I have put it back to 32 now to be safe. I always build from clean.

Circumstances: messing with the first sample, metro on/off at random speeds but mostly at max, stop, load the other sample, start the metro (still at max), and it became unresponsive. I'm now letting it play while writing this and it seems OK.

Now, I really do not know how to trace a freeze. I collected a bugreport and searched for "am_anr" in a text editor with no result. So here is the zip file in case you find it useful. Let me know what I could do or try to get more information. bugreport-bonito-RP1A.201005.004-2022-01-14-03-55-10.zip

igorski commented 2 years ago

Uh, crap. I just had to ruin it. 🙈

Just to be clear: you had added more samples. Is that something you did previously as well (as in: prior to the first fix posted here?). I'm asking to find out whether this issue is new and a result of the latest changes, or has perhaps been dormant all along.

By freeze, do you mean that the application hangs, or does the entire device hang ? As in: you can't close the app and return to the OS / homescreen ?

scar20 commented 2 years ago

Yes, by freeze I mean that it freezes the device: you have to wait until the device restarts by itself, or hold the on button for 10-20 sec until it restarts.

As for the number of samples: when I tested for one hour I had set it to 48; normally it is set to 32. I tried 64, but at that point there can be distortion, since there is no strategy other than a limiter to handle the cumulative effect of the samples. Anyhow, the freeze happened even before you made the change, but it was rarer. When I tested for one hour, I thought it was gone; I had even forgotten that it had happened. So it is not a new thing. It is very rare, but still... Now, how do you debug a freeze??? I tend to suspect the SingleThreadScheduledExecutor that runs the metronome in a recursive configuration, even though it looks safe. I may be wrong. Perhaps the escape condition is not strong enough.

igorski commented 2 years ago

Alright, found another edge case that could cause a live event enqueued for removal (through stop()) to be re-added to the playback vector (through play()), causing a double addition to the playback list, where only one instance would be removed from the list (by the pending removal triggered by the initial stop(), leading the event instance itself to reset its state). Addressed this in 950473c8c102ef1d7a8bfa1d3d17bfae78ccde81

It would be an extreme coincidence if the above happens fast enough (e.g. within a single render iteration of the engine), but it's certainly not impossible with repeated invocation of stop/play in rapid succession.
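
A rough sketch of the edge case and one way to guard against it. All names (`Event`, `Playback`, `play`, `stop`, `flushRemovals`, and the `guarded` switch) are hypothetical; the actual change is in the commit referenced above and may well differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Event {
    bool queuedForRemoval = false;
    bool inPlaybackList   = false;
};

struct Playback {
    std::vector<Event*> list;
    bool guarded; // false reproduces the double addition

    explicit Playback(bool guard) : guarded(guard) {}

    void play(Event& e)
    {
        if (guarded && e.inPlaybackList) {
            // Already listed: cancel the pending removal instead of adding again.
            e.queuedForRemoval = false;
            return;
        }
        list.push_back(&e);
        e.inPlaybackList = true;
    }

    // stop() only enqueues; the removal happens at the end of a render pass.
    void stop(Event& e) { e.queuedForRemoval = true; }

    // End of a render pass: remove each entry whose event is queued for
    // removal and reset that event's state.
    void flushRemovals()
    {
        for (std::size_t i = 0; i < list.size(); ) {
            Event* e = list[i];
            if (e->queuedForRemoval) {
                e->queuedForRemoval = false;
                e->inPlaybackList   = false;
                list.erase(list.begin() + static_cast<std::ptrdiff_t>(i));
            } else {
                ++i;
            }
        }
    }
};
```

Without the guard, a play/stop/play sequence inside one render pass leaves two entries for the same event; the flush removes one and resets the event's state, so a stale entry stays behind. With the guard, the second `play()` just cancels the pending removal and the list keeps a single entry.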

scar20 commented 2 years ago

I have rebuilt with the latest changes. Wow, that was subtle, so subtle that I don't know if I really understand what you have done. I will have to study that enqueueRemoval( true ); thing in more detail. But thanks a lot for the brain juice you provided!

Now, it's hard to tell whether it will really solve the freeze, since that was so rare, but given your explanations I'm confident. I just revamped my MWEngineTest5 to include the changes and added an "auto metro" switch, which acts like a human switching on and off every 700ms + random*1000, each time varying the speed between 900 and 1000bpm, to try to get "in the cracks". It is running at the moment and I will let it run for a day (a bit annoying, but less so than switching manually) while I try to compile libsndfile for Android. I use the SampleEventRange version, but let me know if you want me to check with the "standard" one. The code is at https://github.com/scar20/MWEngineTest5
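
The timing of such an "auto metro" stress test can be sketched as below. This is not the code from the MWEngineTest5 repository; `nextToggleDelayMs`, `nextTempoBpm` and `msPerBeat` are hypothetical helpers that only reproduce the numbers quoted above (700 ms plus up to 1000 ms of random delay, tempo between 900 and 1000 bpm).

```cpp
#include <cassert>
#include <random>

// Delay before the next on/off toggle: 700 ms + random * 1000 ms.
double nextToggleDelayMs(std::mt19937& rng)
{
    std::uniform_real_distribution<double> unit(0.0, 1.0);
    return 700.0 + unit(rng) * 1000.0;
}

// Tempo for the next run, drawn uniformly from [900, 1000] bpm.
int nextTempoBpm(std::mt19937& rng)
{
    std::uniform_int_distribution<int> bpm(900, 1000);
    return bpm(rng);
}

// Interval between metronome ticks for a given tempo.
double msPerBeat(int bpm)
{
    assert(bpm > 0);
    return 60000.0 / bpm;
}
```

At 1000 bpm a tick falls every 60 ms, so each on-period of 700 to 1700 ms covers roughly a dozen to a couple of dozen start/stop calls, which is what gives the test a chance of landing inside a single render iteration.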

scar20 commented 2 years ago

I let run MWEngineTest5 for 24h now and it still run with no crash or freeze. Great! I don't know if it can be called a success, but it looks close. Let me know if I should try to make the test more stringent: going over 1000bpm, not leaving 10ms between stop and start, or other conditions to push it closer to the edge. I'll end the test now since I'm getting tired of hearing it buzzing in my ears.

The only thing I noticed is that mysterious "screech" distortion (it sounds like ring-mod) that happens once in a while for about a minute and then disappears. I didn't mention it before since I can't figure out the exact conditions in which it appears. This is a new case. The cases in which it appears are:

- the app is a debug app - in release everything is fine, it never showed up.
- if running a MWEngine app (debug) while there are other MWEngine apps (debug) stopped in the background. Not the case here, no other app in the background.
- if loading a second sample after the initial default - mostly with SampleEventRange.
- if loading a second sample after the initial default and there are effects in the chain (reverbSm).
- and this new case: every couple of hours it appears for a minute or so.

This deserves its own thread once I can gather a more precise picture, but something interferes with the engine.

igorski commented 2 years ago

> I let run MWEngineTest5 for 24h now and it still run with no crash or freeze

Awesome! And thanks for the thorough test. "There was this person that dedicated a full day to verify something, how cool is that?" 🏆

The screech sounds interesting. Maybe it is the reverb, as it could potentially self-oscillate since it does use comb filters, but that wouldn't explain how it resolves itself. Maybe there is something happening in the background that interferes with the AAudio/audio mixer at the OS level ? During the screeching, is the sound you are supposed to hear still audible ? (as in: not chopped up or glitchy, but just very loud as you put it?).
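
On the comb-filter remark: a feedback comb filter computes y[n] = x[n] + g * y[n - D], and its output only decays when |g| < 1. The sketch below is a generic textbook filter, not MWEngine's reverb; `FeedbackComb` and `tailPeak` are hypothetical names. It feeds in a single impulse and measures the peak of the late tail.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Generic feedback comb filter: y[n] = x[n] + g * y[n - delay].
class FeedbackComb {
public:
    FeedbackComb(std::size_t delay, double feedback)
        : buffer_(delay, 0.0), feedback_(feedback), pos_(0)
    {
        assert(delay > 0);
    }

    double process(double input)
    {
        const double output = input + feedback_ * buffer_[pos_];
        buffer_[pos_] = output;              // store y[n] for use at n + delay
        pos_ = (pos_ + 1) % buffer_.size();
        return output;
    }

private:
    std::vector<double> buffer_;
    double feedback_;
    std::size_t pos_;
};

// Feeds one impulse followed by silence and returns the largest absolute
// output seen in the last `delay` samples.
double tailPeak(double feedback, std::size_t delay, std::size_t samples)
{
    FeedbackComb comb(delay, feedback);
    double peak = 0.0;
    for (std::size_t n = 0; n < samples; ++n) {
        const double y = comb.process(n == 0 ? 1.0 : 0.0);
        if (n + delay >= samples) peak = std::max(peak, std::fabs(y));
    }
    return peak;
}
```

With a feedback of 0.5 the tail has effectively vanished after 1000 samples, with 1.0 it rings forever at the same level, and with 1.1 it grows. Whether the reverb's feedback gains can ever reach that range is not stated in this thread, so this only shows what self-oscillation would look like, not that it is the cause.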

In any case, I agree we can close this issue and when you have more information / another occurrence of the screeching we can treat that separately.