sandrohanea / whisper.net

Whisper.net. Speech to text made simple using Whisper Models
MIT License

Issue with Wave Memorystream obtained through ffmpegcore #74

Closed Kanda closed 1 year ago

Kanda commented 1 year ago

I seem to have run into an issue with a MemoryStream created by letting ffmpegcore download a video and strip the audio from it. As soon as execution reaches the ProcessAsync call, the application uses almost 10 GB of memory. Calling Process instead of the async variant fails with "unable to read beyond the end of the stream".

I debugged part of it already and saw that dataChunkSize is massive compared to the MemoryStream's length (888910 bytes); see https://github.com/sandrohanea/whisper.net/blob/main/Whisper.net/Wave/WaveParser.cs#L356. When I hardcoded dataChunkSize to the stream's length, it read the stream fine and gave me the expected output.
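The mismatch described above can be checked outside the library. The following is a minimal sketch in Python (not the library's C#, and not its actual parser): it walks the RIFF chunks of a WAV byte buffer and reports the size declared in the `data` chunk header next to the number of bytes actually present. The helper names `find_data_chunk` and `make_streamed_wav` are hypothetical, and the test buffer is synthetic, built on the assumption that a piped WAV carries `0xFFFFFFFF` as its size fields.

```python
import struct

def find_data_chunk(wav: bytes):
    """Return (declared_size, available_bytes, data_offset) for the 'data' chunk."""
    if wav[0:4] != b"RIFF" or wav[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    pos = 12  # first chunk header follows the 12-byte RIFF/WAVE preamble
    while pos + 8 <= len(wav):
        chunk_id = wav[pos:pos + 4]
        (size,) = struct.unpack_from("<I", wav, pos + 4)
        body = pos + 8
        if chunk_id == b"data":
            return size, len(wav) - body, body
        pos = body + size + (size & 1)  # skip chunk body plus optional pad byte
    raise ValueError("no data chunk found")

def make_streamed_wav(samples: bytes) -> bytes:
    """Build a synthetic 16 kHz mono PCM WAV whose sizes were never patched."""
    fmt = struct.pack("<HHIIHH", 1, 1, 16000, 32000, 2, 16)
    return (b"RIFF" + struct.pack("<I", 0xFFFFFFFF) + b"WAVE"
            + b"fmt " + struct.pack("<I", len(fmt)) + fmt
            + b"data" + struct.pack("<I", 0xFFFFFFFF) + samples)

if __name__ == "__main__":
    declared, available, offset = find_data_chunk(make_streamed_wav(b"\x00" * 8))
    print(f"declared={declared} available={available} offset={offset}")
```

A declared size far larger than the available bytes is the condition reported here; a reader that allocates or loops based on the declared value alone will over-allocate or read past the end of the stream.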

I wondered if you could tell me what the issue might be here: either the settings used for the wave (an unsupported codec or something else), or something going wrong when reading the created MemoryStream. I added an example project to this post. (You might need to get the required ffmpeg binaries from https://ffbinaries.com/downloads)

WhisperIssueExample project

sandrohanea commented 1 year ago

Hello @Kanda, Thanks for reporting the issue and also posting the repro.

It seems it is indeed a bug caused by Streamed wave files.

Because ffmpeg is using a stream pipe as its destination, it just writes uint.MaxValue as the data chunk size and then writes all the data. Normally, when writing to a file, ffmpeg would seek back and update the data chunk size, but it cannot do that with pipes, so the loaded MemoryStream looks like a streaming wave.
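To illustrate the mechanism described above, here is a small Python simulation (a sketch only, neither ffmpeg's nor Whisper.net's code). The hypothetical `write_wav` writes a placeholder size first and patches it only when the destination allows seeking; `usable_data_size` then shows one possible reader-side guard, clamping the declared size to the bytes actually available. Using the placeholder for the RIFF-level size as well is an assumption, as are the fixed 44-byte header layout and all function names.

```python
import io
import struct

PLACEHOLDER = 0xFFFFFFFF  # uint.MaxValue, the data chunk size reported in this thread

def write_wav(out, samples: bytes, can_seek: bool) -> None:
    """Write a 16 kHz mono PCM WAV; patch the size fields only if seeking is allowed."""
    fmt = struct.pack("<HHIIHH", 1, 1, 16000, 32000, 2, 16)
    header = (b"RIFF" + struct.pack("<I", PLACEHOLDER) + b"WAVE"
              + b"fmt " + struct.pack("<I", len(fmt)) + fmt
              + b"data" + struct.pack("<I", PLACEHOLDER))
    out.write(header)
    out.write(samples)
    if can_seek:  # a file: go back and fill in the real sizes
        total = len(header) + len(samples)
        out.seek(4)
        out.write(struct.pack("<I", total - 8))
        out.seek(len(header) - 4)
        out.write(struct.pack("<I", len(samples)))
        out.seek(total)
    # a pipe: nothing more can be done, the placeholders stay in the output

def declared_data_size(wav: bytes) -> int:
    # Valid only for the fixed 44-byte header layout produced by write_wav.
    return struct.unpack_from("<I", wav, 40)[0]

def usable_data_size(wav: bytes) -> int:
    """Reader-side guard: never trust a size larger than what is present."""
    return min(declared_data_size(wav), len(wav) - 44)

if __name__ == "__main__":
    as_file, as_pipe = io.BytesIO(), io.BytesIO()
    write_wav(as_file, b"\x01\x02" * 4, can_seek=True)
    write_wav(as_pipe, b"\x01\x02" * 4, can_seek=False)
    print(declared_data_size(as_file.getvalue()), declared_data_size(as_pipe.getvalue()))
```

Clamping to the remaining length is only one way to handle the placeholder; a reader could equally treat uint.MaxValue as "read until the stream ends".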

Will prepare a fix, asap.

Kanda commented 1 year ago

Hello @sandrohanea,

Thank you for addressing this issue so promptly. I came across the updated NuGet package that, according to the changelog, includes the fix. However, with the package I hit the same issue: it still consumed 10 GB of memory. On the other hand, when I referenced the repository directly and debugged the code, it ran smoothly without any issues.

I have tried to identify the root cause of this anomaly but to no avail.

sandrohanea commented 1 year ago

Hello @Kanda, just to confirm:

  1. You installed Whisper.net 1.4.4 and Whisper.net.Runtime 1.4.4?
  2. Have you tried to clean/rebuild after the upgrade?

Kanda commented 1 year ago

Hello @sandrohanea,

I have indeed updated both of them to 1.4.4, and I have done both a clean/rebuild and a full delete of the bin and obj folders to force a clean build.

I do know it has the new version because the downloader now includes the QuantizationType, so I am not sure at this point why it isn't working with the NuGet package.

sandrohanea commented 1 year ago

Sorry for the confusion. I decompiled Whisper.net 1.4.4 and it indeed does not contain the fix: (image)

I don't know how I did that, but for some reason the fix is not included in the build, while other fixes (even ones made after it) are present: (image)

Kanda commented 1 year ago

No problem at all, things happen sometimes. Thank you for getting back so quickly on this. It is surprising that the fixes before and after it did come through just fine.

sandrohanea commented 1 year ago

I created the package from the branch before merging: https://github.com/sandrohanea/whisper.net/pull/76. That branch was created before the one with the fix and contains all the features added after that fix; only this one got "lost". I will release again ASAP, but I have some additional things to add (in 1.4.5).

Kanda commented 1 year ago

There is no rush for my project currently, so take your time.

sandrohanea commented 1 year ago

Released 1.4.5 now. I specifically tested this fix, and it works as expected with the NuGet version.