GewoonJaap / WinWhisper

Create subtitles with ease, using Whisper AI for Windows

"Stream was too long" while processing video files larger than 2 GB #25

Open JohnstonJ opened 1 month ago

JohnstonJ commented 1 month ago

Describe the bug

Attempting to process a large video immediately results in a crash: "IOException: Stream was too long".

To Reproduce

Steps to reproduce the behavior:

  1. Process a large video file. Larger than 2 GB should be enough to trigger it.

Expected behavior

WinWhisper should process the video without crashing.

Screenshots

Welcome to WinWhisper (1.3.2.0). Generate subtitles with ease using WhisperAI.
Enter the path where you want the subtitles to be saved...
Leave empty to save the subtitles in the ./Subtitles folder

Enter the video path or the folder path that contains the videos you want to process...
C:\Users\JOHNST~1\AppData\Local\Temp\mpout\Projects\myfile.mkv
In which language code (en,nl etc) is the audio for video: myfile.mkv? Leave empty to auto detect
en
Do you want to translate the subtitles to English? (yes/no) Default: no

Processing video: Johnston #4 - 2004 Forest Lake Academy, Big Bend.mkv
Extracting audio from video file located at: C:\Users\JOHNST~1\AppData\Local\Temp\mpout\Projects\myfile.mkv
This might take a while depending on the file size and drive speed...
An error occured. Please report the following on our GitHub page: https://github.com/GewoonJaap/WinWhisper/issues/new?assignees=&labels=&projects=&template=bug_report.md&title=
Error details:
========== Start Of Error ==========
Error name:
IOException
Error message:
Stream was too long.
Error stacktrace:
   at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count)
   at System.IO.Stream.CopyTo(Stream destination, Int32 bufferSize)
   at AudioExtractor.Extractor.ExtractAudioFromVideoFile(String videoFilePath)
   at Program.<>c__DisplayClass0_0.<<Main>b__1>d.MoveNext()
--- End of stack trace from previous location ---
   at Utility.LoopUtil.ForEachAsync[T](List`1 list, Func`2 func)
   at Program.Main(String[] args)
Error inner exception:

Error inner exception stacktrace:

========== End Of Error ==========
Press any key to exit...

Desktop (please complete the following information):

Additional context

At https://github.com/GewoonJaap/WinWhisper/blob/5a77de230f93e5abab8b54e33f2a7dd0206ce895/AudioExtractor/Extractor.cs#L16C36-L16C48 it looks like you are copying the entire input file into a MemoryStream. MemoryStream has a documented limit of about 2 GB, per https://learn.microsoft.com/en-us/dotnet/api/system.io.memorystream.setlength?view=net-8.0. The stream is backed by a single byte array, and 2 GB is the maximum length of an array in .NET.
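As a quick illustration of that limit, here is a minimal C# sketch (the class and method names are hypothetical and not taken from WinWhisper). It asks a MemoryStream to grow one byte past Int32.MaxValue. SetLength should reject the request with an ArgumentOutOfRangeException, so the check does not require 2 GB of memory. The Write path hits the same ceiling and reports it as the "Stream was too long" IOException shown in the log above.

```csharp
using System;
using System.IO;

public static class MemoryStreamLimitDemo
{
    // Hypothetical helper: returns true if MemoryStream refuses a length
    // one byte larger than Int32.MaxValue (roughly 2 GB).
    public static bool RejectsLengthOver2GB()
    {
        using var stream = new MemoryStream();
        try
        {
            stream.SetLength((long)int.MaxValue + 1);
            return false; // Not expected: the length was accepted.
        }
        catch (ArgumentOutOfRangeException)
        {
            return true; // The documented limit was enforced.
        }
    }

    public static void Main()
    {
        Console.WriteLine(RejectsLengthOver2GB());
    }
}
```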

Even if MemoryStream did not have this limit, copying the entire file into memory is probably still not a scalable approach, since the size of the video file might exceed physical memory.
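One possible direction, sketched below under assumptions, is to stream the data to a temporary file instead of buffering it in memory. The helper name CopyToTempFile is hypothetical and I have not checked how it would fit into Extractor.cs; it only uses standard System.IO calls. Stream.CopyTo moves the data in fixed-size chunks, so memory use stays bounded by the buffer size regardless of how large the input is.

```csharp
using System;
using System.IO;

public static class StreamingCopy
{
    // Hypothetical helper: copies a source stream into a new temporary file
    // chunk by chunk and returns the path of that file.
    public static string CopyToTempFile(Stream source, int bufferSize = 81920)
    {
        // Creates an empty, uniquely named file in the user's temp folder.
        string tempPath = Path.GetTempFileName();

        using (var destination = new FileStream(
            tempPath, FileMode.Create, FileAccess.Write, FileShare.None, bufferSize))
        {
            // Only bufferSize bytes are held in memory at any one time.
            source.CopyTo(destination, bufferSize);
        }

        return tempPath;
    }
}
```

The resulting file could then be reopened with File.OpenRead, which returns a FileStream whose Length is a long and is therefore not subject to the 2 GB array limit. Whether the downstream Whisper library accepts such a stream is an assumption I have not verified, and the caller would be responsible for deleting the temporary file afterwards.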

GewoonJaap commented 1 month ago

Hi, thanks for reporting this bug. I will try to get this fixed :) However, I noticed that Whisper expects a stream containing a valid WAV file, so I will have to see whether I can either use a different stream type or split the WAV file into multiple parts.