Picovoice / cheetah

On-device streaming speech-to-text engine powered by deep learning
https://picovoice.ai/
Apache License 2.0

Cheetah Issue: Stack overflow crash in Cheetah during processing #191

Closed KieranAC closed 1 year ago

KieranAC commented 1 year ago

I have been integrating and evaluating Cheetah in a macOS desktop application (Cheetah C library, arm64) and have been hitting this issue, which is a blocker for further development. I have been unable to debug it much further without understanding what could cause this behaviour inside the Cheetah library.

Expected behaviour

Transcription should be output

Actual behaviour

The app crashes with a very large backtrace inside the Cheetah library.

Steps to reproduce the behaviour

Run the transcription for some time (10-15 minutes). The app works correctly during that time and then crashes.

macOS 13.0.1
Cheetah C library (latest version)
Target is macOS

As far as I can tell, the buffer passed to pv_cheetah_process is valid, has content, and is accessible. The app crashes with EXC_BAD_ACCESS, and it appears to be a stack overflow.

The app and transcription work perfectly fine for some time before this happens. The crash consistently occurs after around 10 minutes of runtime.

I am using AVAudioEngine to tap an input microphone. The buffer provided is converted and then fed into picovoice/cheetah.
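The backtrace below shows the tap delivering 1486 samples per callback, while pv_cheetah_process consumes fixed-size frames of pv_cheetah_frame_length() samples, so some rebuffering step has to sit in between. A minimal sketch of such a step in C follows. It is not the reporter's code: the names `frame_accumulator_t`, `accumulator_push`, and `count_frame` are hypothetical, the frame length of 512 is an assumption, and the input is assumed to be already converted to 16-bit mono PCM at the engine's sample rate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 512 samples per frame. Real code should use the value
   returned by pv_cheetah_frame_length() instead of a constant. */
#define FRAME_LENGTH 512

/* Hypothetical callback type: receives one complete frame of
   FRAME_LENGTH samples, e.g. to hand it to the speech engine. */
typedef void (*frame_callback_t)(const int16_t *frame, void *user_data);

typedef struct {
    int16_t buffer[FRAME_LENGTH];
    size_t filled; /* number of valid samples currently in buffer */
} frame_accumulator_t;

/* Appends num_samples samples to the accumulator and invokes on_frame
   once for every complete frame. Leftover samples stay buffered for the
   next call. Returns the number of frames emitted by this call. */
static size_t accumulator_push(frame_accumulator_t *acc,
                               const int16_t *samples,
                               size_t num_samples,
                               frame_callback_t on_frame,
                               void *user_data) {
    assert(acc != NULL && samples != NULL && on_frame != NULL);
    size_t frames = 0;
    for (size_t i = 0; i < num_samples; i++) {
        acc->buffer[acc->filled++] = samples[i];
        if (acc->filled == FRAME_LENGTH) {
            on_frame(acc->buffer, user_data);
            acc->filled = 0;
            frames++;
        }
    }
    return frames;
}

/* Example callback that only counts frames; user_data points to a size_t. */
static void count_frame(const int16_t *frame, void *user_data) {
    (void) frame;
    (*(size_t *) user_data)++;
}
```

With a 1486-sample chunk and the assumed frame length of 512, the first push emits two frames and leaves 462 samples buffered for the next callback.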

Backtrace

(lldb) bt
* thread #88, queue = 'RealtimeMessenger.mServiceQueue', stop reason = EXC_BAD_ACCESS (code=2, address=0x8b7a4bfe0)
    frame #0: 0x0000000102aa86f4 libpv_cheetah.dylib ___lldb_unnamed_symbol137
    frame #1: 0x0000000102aae508 libpv_cheetah.dylib ___lldb_unnamed_symbol196 + 60
    frame #2: 0x0000000102aae520 libpv_cheetah.dylib ___lldb_unnamed_symbol196 + 84
    frame #3: 0x0000000102aae520 libpv_cheetah.dylib ___lldb_unnamed_symbol196 + 84

... < Many stack frames omitted > ...

    frame #16614: 0x0000000102aae520 libpv_cheetah.dylib ___lldb_unnamed_symbol196 + 84
    frame #16615: 0x0000000102aae520 libpv_cheetah.dylib ___lldb_unnamed_symbol196 + 84
    frame #16616: 0x0000000102aae1d8 libpv_cheetah.dylib ___lldb_unnamed_symbol194 + 48
    frame #16617: 0x0000000102aa98d8 libpv_cheetah.dylib ___lldb_unnamed_symbol142 + 3300
    frame #16618: 0x0000000102aa3a30 libpv_cheetah.dylib pv_cheetah_process + 1116

... < Application code below > ...

  * frame #16619: 0x0000000100cd1f9c Transcriber -[Transcriber processPCMBuffer:](self=0x00000003d142fb40, _cmd="processPCMBuffer:", pcm=0x000000016900a200) at Transcriber.m:160:26
    frame #16620: 0x0000000100cd2348 Transcriber -[Transcriber processPCMBuffer:numSamples:](self=0x00000003d142fb40, _cmd="processPCMBuffer:numSamples:", pcm=0x00000009367db460, numSamples=1486) at Transcriber.m:249:9
    frame #16621: 0x000000010025b5b4 Aircover Transcriber.microphoneData(channelData=0x9367db460, numSamples=1486, self=0x00000003b3cce010) at TranscriberManager.swift:202:37
    frame #16622: 0x000000010025c5e8 Aircover protocol witness for MicrophoneListener.microphoneData(channelData:numSamples:) in conformance TranscriberManager at <compiler-generated>:0
    frame #16623: 0x0000000100212234 Aircover closure #1 in Microphone.setup(tapBuffer=0x000000092a412dd0, time=0x0000000911bcacc0, pcmBuffer=0x00000003d8476060, self=0x00000003f192ffc0) at Microphone.swift:171:32
    frame #16624: 0x00000001002125a4 Aircover thunk for @escaping @callee_guaranteed (@guaranteed AVAudioPCMBuffer, @guaranteed AVAudioTime) -> () at <compiler-generated>:0
    frame #16625: 0x00000001f35c8c68 AVFAudio AVAudioNodeTap::TapMessage::RealtimeMessenger_Perform() + 1304
    frame #16626: 0x00000001f35bfacc AVFAudio CADeprecated::RealtimeMessenger::_PerformPendingMessages() + 96
    frame #16627: 0x00000001f35bfa40 AVFAudio invocation function for block in CADeprecated::RealtimeMessenger::RealtimeMessenger(applesauce::dispatch::v1::queue) + 104
    frame #16628: 0x0000000102f269d4 libdispatch.dylib _dispatch_client_callout + 20
    frame #16629: 0x0000000102f2a2b4 libdispatch.dylib _dispatch_continuation_pop + 816
    frame #16630: 0x0000000102f455f0 libdispatch.dylib _dispatch_source_invoke + 1732
    frame #16631: 0x0000000102f2fb14 libdispatch.dylib _dispatch_lane_serial_drain + 376
    frame #16632: 0x0000000102f30e00 libdispatch.dylib _dispatch_lane_invoke + 484
    frame #16633: 0x0000000102f328c8 libdispatch.dylib _dispatch_workloop_invoke + 2876
    frame #16634: 0x0000000102f40990 libdispatch.dylib _dispatch_workloop_worker_thread + 1064
    frame #16635: 0x000000010313fd28 libsystem_pthread.dylib _pthread_wqthread + 288

ErisMik commented 1 year ago

So far I have been unable to reproduce the error given a few scenarios.

If you are able to record and share the audio input that reliably causes the crash, that would greatly help in finding a resolution for this error. If you don't want to share the audio publicly, you can email it to hello@picovoice.ai instead.

What hardware platform are you using (M1/M2/etc.)?

KieranAC commented 1 year ago

I have created a crash report that I will email to you, which might help if you can get the symbols.

Example stack frame:
52 libpv_cheetah.dylib 0x103e02520 0x103df4000 + 58656

Loaded binary address:
0x103df4000 - 0x103e23fff libpv_cheetah.dylib (*) <98b4c526-1e16-3e8e-8bc2-c3e627e0a858> /Applications/Aircover.app/Contents/Frameworks/Transcriber.framework/Versions/A/Frameworks/arm64/libpv_cheetah.dylib
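The stack frame line above lists the frame's runtime address, the load address of the library image, and a decimal offset. Since the load address can differ from run to run, the offset into the image is the part that can be compared across runs or resolved against symbols. A small worked example of that arithmetic in C follows; the helper name `image_offset` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the offset of a runtime address within its image, given the
   address at which the image was loaded. */
static uint64_t image_offset(uint64_t frame_address, uint64_t load_address) {
    assert(frame_address >= load_address);
    return frame_address - load_address;
}
```

Here `image_offset(0x103e02520, 0x103df4000)` is 58656 (0xE520), matching the crash report line. The repeated frame in the lldb backtrace, 0x0000000102aae520, would have the same offset if that run had loaded the library at 0x102aa0000; that load address is an assumption, since the backtrace does not show it.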

I am using Cheetah + recorder to tap a virtual audio source output. It is not fed any specific audio in particular; it would be live audio from a VoIP app.

> What hardware platform are you using (M1/M2/etc.)?

2020 MBP 13" M1, 16 GB, with Ventura 13.0.1

KieranAC commented 1 year ago

I was able to reproduce this inside a simple test app.

I see the same stack overflow crash by simply passing Cheetah a zeroed-out buffer of int16_t values with pv_cheetah_frame_length() entries.

Am I fundamentally misunderstanding something, or is there some sort of bug here? I wouldn't expect the library to crash with such input. I sent an email with the sample app.

Following is a boiled-down version:

`
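A minimal sketch of the reproduction described above, in C. This is not the reporter's test app. The `process_frame_fn` callback is a hypothetical stand-in for a wrapper around pv_cheetah_process, so that the loop stays self-contained; `feed_silence` and `check_silent` are likewise illustrative names, and the frame length would come from pv_cheetah_frame_length() in real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a wrapper around pv_cheetah_process().
   Returns true if the frame was processed successfully. */
typedef bool (*process_frame_fn)(const int16_t *pcm, int32_t frame_length, void *context);

/* Feeds num_frames all-zero frames to process, stopping at the first
   failure. Returns the number of frames processed successfully, or -1
   if the frame buffer could not be allocated. */
static int64_t feed_silence(process_frame_fn process, void *context,
                            int32_t frame_length, int64_t num_frames) {
    assert(process != NULL && frame_length > 0 && num_frames >= 0);
    /* calloc zero-initializes, giving a frame of digital silence. */
    int16_t *pcm = calloc((size_t) frame_length, sizeof(int16_t));
    if (pcm == NULL) {
        return -1;
    }
    int64_t done = 0;
    while (done < num_frames && process(pcm, frame_length, context)) {
        done++;
    }
    free(pcm);
    return done;
}

/* Example stand-in: checks that the frame is all zeros and counts calls.
   context points to an int64_t call counter. */
static bool check_silent(const int16_t *pcm, int32_t frame_length, void *context) {
    for (int32_t i = 0; i < frame_length; i++) {
        if (pcm[i] != 0) {
            return false;
        }
    }
    (*(int64_t *) context)++;
    return true;
}
```

Assuming a 16 kHz sample rate and 512-sample frames, ten minutes of audio corresponds to 600 × 16000 / 512 = 18,750 frames, which gives a rough value for `num_frames` when trying to reach the reported crash window.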

ErisMik commented 1 year ago

> Am I fundamentally misunderstanding something, or is there some sort of bug here? I wouldn't expect the library to crash with such input. I sent an email with the sample app.

No, this is a supported case and should not cause a crash. I was able to reproduce the crash given the test app you sent. Investigating the cause now!

ErisMik commented 1 year ago

@KieranAC Thank you for the report and the help with debugging. I've pushed new libs that contain fixes for your issue; they are currently available on the v1.1-patches branch and should be in master soon. Please verify that these new libs fix the crash you are seeing.

ErisMik commented 1 year ago

The updated libs are now in master (https://github.com/Picovoice/cheetah/pull/192). Closing issue as complete. Please feel free to re-open or create a new issue if you continue to run into problems.