sandrohanea / whisper.net

Whisper.net. Speech to text made simple using Whisper Models
MIT License

Process terminated. A callback was made on a garbage collected delegate of type 'Whisper.net!Whisper.net.Native.WhisperNewSegmentCallback::Invoke' #12

Closed. GewoonJaap closed this issue 1 year ago.

GewoonJaap commented 1 year ago

[image]

Process terminated. A callback was made on a garbage collected delegate of type 'Whisper.net!Whisper.net.Native.WhisperNewSegmentCallback::Invoke'. Repeat 2 times:

at Whisper.net.Native.NativeMethods.whisper_full(IntPtr, Whisper.net.Native.WhisperFullParams, IntPtr, Int32)
at Whisper.net.WhisperProcessor.Process(System.IO.Stream)
at WhisperAI.AudioProcessor+d1.MoveNext()
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Canon ByRef)
at WhisperAI.AudioProcessor.ProcessAudio(System.String)
at Program+<$>d0.MoveNext()
at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[System.Canon, System.Private.CoreLib, Version=7.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](System.Canon ByRef)
at Program.$(System.String[])
at Program.(System.String[])

This error happens with large, long .WAV files (200 MB and 40 minutes long). It occurs seemingly at random.

sandrohanea commented 1 year ago

Interesting issue :)

I thought I had prevented this by storing the delegates (https://github.com/sandrohanea/whisper.net/blob/main/Whisper.net/WhisperProcessor.cs#L318), but the objects can still be moved during garbage collection, so the function pointer no longer points to a valid delegate.

I'll fix this by using a GCHandle to ensure that the memory used by the delegate is pinned.
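
A minimal sketch of the GCHandle idea, for illustration only and not the actual Whisper.net fix. The delegate type `NewSegmentCallback` and the class `CallbackRegistration` are hypothetical names. The sketch assumes a normal (non-pinned) handle: as far as I know, `GCHandle.Alloc` with `GCHandleType.Pinned` rejects delegates because they are not blittable, and a normal handle is enough to stop the delegate from being collected while native code still holds its function pointer.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical callback signature, for illustration only; it is not
// Whisper.net's actual WhisperNewSegmentCallback.
[UnmanagedFunctionPointer(CallingConvention.Cdecl)]
public delegate void NewSegmentCallback(IntPtr context, IntPtr userData);

// Keeps a delegate alive for as long as native code may call it.
public sealed class CallbackRegistration : IDisposable
{
    // Not readonly: GCHandle is a struct and Free() mutates it.
    private GCHandle handle;

    // Pointer that can be handed to the native library.
    public IntPtr FunctionPointer { get; }

    public CallbackRegistration(NewSegmentCallback callback)
    {
        // A normal handle acts as a GC root, so the delegate cannot be
        // collected until Free() is called.
        handle = GCHandle.Alloc(callback);
        FunctionPointer = Marshal.GetFunctionPointerForDelegate(callback);
    }

    public void Dispose()
    {
        // Release the root only after the native call has finished.
        if (handle.IsAllocated)
        {
            handle.Free();
        }
    }
}
```

The registration would be created before the native call and disposed only after that call returns, for example with a `using` statement around it.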

Thanks for reporting this!

GewoonJaap commented 1 year ago

Thank you!! Do you know whether there is a parameter for generating accurate subtitles? When there is silence, Whisper does not cut off the text, so it keeps displaying text even when nothing is being said.

sandrohanea commented 1 year ago

Including the probability on SegmentData is implemented but not yet released: https://github.com/sandrohanea/whisper.net/pull/5/commits/98aed0b100c71270912dce091b74942ba528a41f

This way you can ignore anything that is below a confidence level you define for your needs.

In the future, you will be able to use WithNoSpeechThreshold (https://github.com/sandrohanea/whisper.net/blob/main/Whisper.net/WhisperProcessorBuilder.cs#L464), but that is not yet implemented in the underlying whisper.cpp.
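
A rough sketch of that filtering, assuming each segment carries a probability between 0 and 1. The `Segment` record and `SegmentFilter.KeepConfident` are hypothetical stand-ins written for this example; the real SegmentData type may name or shape these members differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a transcribed segment; not the library's type.
public sealed record Segment(TimeSpan Start, TimeSpan End, string Text, float Probability);

public static class SegmentFilter
{
    // Returns only the segments whose probability is at least minProbability.
    public static List<Segment> KeepConfident(IEnumerable<Segment> segments, float minProbability)
    {
        return segments.Where(s => s.Probability >= minProbability).ToList();
    }
}
```

The threshold is application-specific: a higher value drops more text produced during silence but may also drop correct low-confidence speech.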

GewoonJaap commented 1 year ago

Ah, thanks! Thanks for this amazing package :) You should set up a sponsor page on this GitHub repo 👍

GewoonJaap commented 1 year ago

I just tested your fix from the commit ^^. It seems to work: I could transcribe the entire .WAV file without any problems! One thing I noticed, though it is probably a problem in the .cpp library: sometimes text is duplicated that isn't present in the original audio file. [image]

GewoonJaap commented 1 year ago

Do you know when you will release this version? Or is there a way to use this pre-release in my application? I am using the NuGet package manager.

adamnova commented 1 year ago

> I just tested your fix from the commit ^^. It seems to work: I could transcribe the entire .WAV file without any problems! One thing I noticed, though it is probably a problem in the .cpp library: sometimes text is duplicated that isn't present in the original audio file. [image]

Yep, this seems to be an issue in the cpp library; see Issue 508 and Issue 471. There is a pre-release version of 1.2.1, but I don't think it contains a fix for your problem yet.

sandrohanea commented 1 year ago

I'm curious how high the confidence is on those repetitions (now that the confidence level is also exposed on SegmentData). Maybe they can be removed directly based on the confidence. In the meantime, we released 1.2.1 on NuGet, which contains the fix for the main issue reported here (the GC delegates): https://www.nuget.org/packages/Whisper.net/1.2.1