Vaibhavs10 / insanely-fast-whisper

Apache License 2.0

Out Of memory & attempt to get argmin of an empty sequence #199

Open learner0333 opened 6 months ago

learner0333 commented 6 months ago

I am calling the Replicate model https://replicate.com/vaibhavs10/incredibly-fast-whisper through the API to get a transcript with diarization.

Initially I used batch_size = 64, and a 1 hour 13 minute video worked fine.

I then tried another video, 1 hour 24 minutes long, and got an out-of-memory error. The exact error is:

"Not enough memory available to process your request. Try reducing the size or number of any file inputs or outputs"

After this error I tried changing batch_size to 4, 16, 24 and 32, but then I got a different error: "attempt to get argmin of an empty sequence"

Could you please guide me on fixing these two problems? I want to use Replicate in production, so I need to resolve them before deploying to the production environment.

I have also tried two more videos, one 1 hour 46 minutes long and the other 2 hours 7 minutes long. For these I got the following error:

"unsupported operand type(s) for -: 'NoneType' and 'float'"

The following line is copied from the Replicate log:

Whisper did not predict an ending timestamp, which can happen if audio is cut off in the middle of a word. Also make sure WhisperTimeStampLogitsProcessor was used during generation.
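For anyone hitting the same thing, here is a minimal sketch of what the last error and the log line seem to describe. It assumes the transcript chunks look like the Hugging Face ASR pipeline output, i.e. `{"text": ..., "timestamp": (start, end)}`, and that the final chunk can come back with `end` set to `None`. The helper `fill_missing_end_timestamps` and the `audio_duration` argument are hypothetical names for illustration, not part of this repo or of the Replicate model. The "argmin of an empty sequence" message is the one NumPy raises when `argmin` is called on an empty array, so it presumably shows up when no usable timestamps are left to align against, but that is a guess from the message alone.

```python
def fill_missing_end_timestamps(chunks, audio_duration):
    """Return a copy of `chunks` in which every missing end timestamp is filled in.

    Assumed chunk layout: {"text": str, "timestamp": (start, end)}.
    A missing end falls back to the next chunk's start, or to the
    total audio duration for the last chunk.
    """
    fixed = []
    for i, chunk in enumerate(chunks):
        start, end = chunk["timestamp"]
        if end is None:
            has_next = i + 1 < len(chunks)
            next_start = chunks[i + 1]["timestamp"][0] if has_next else None
            end = next_start if next_start is not None else audio_duration
        # Build a new dict so the caller's data is left untouched.
        fixed.append({**chunk, "timestamp": (start, end)})
    return fixed


# Illustrative data: the last chunk has no end timestamp.
chunks = [
    {"text": " Hello there.", "timestamp": (0.0, 2.5)},
    {"text": " This sentence is cut", "timestamp": (2.5, None)},
]

# Any arithmetic on the missing end reproduces the reported TypeError.
try:
    chunks[-1]["timestamp"][1] - 1.0
except TypeError as exc:
    print(exc)  # unsupported operand type(s) for -: 'NoneType' and 'float'

# Hypothetical total length of the audio in seconds.
repaired = fill_missing_end_timestamps(chunks, audio_duration=10.0)
print(repaired[-1]["timestamp"])  # (2.5, 10.0)
```

This only helps if you can post-process the chunks yourself before diarization alignment; when the failure happens inside the hosted model, the fix has to be in the deployed code.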

themire commented 6 months ago

I'm getting this too!

Iamam305 commented 5 months ago

facing the same issue

guppy57 commented 1 month ago

Same here

NickNaskida commented 3 weeks ago

Same :/ I can't get rid of the "attempt to get argmin of an empty sequence" error. @Vaibhavs10, any chance you could take a look?

NickNaskida commented 3 weeks ago

Guys, I noticed that the version of the model deployed on Replicate was not the latest one, which is why some features were unsupported and most of these bugs were still present.

So I found the code for the model deployed on Replicate and merged in the latest fixes from this repo. I also added support for num_speakers, min_speakers, and max_speakers for diarization on Replicate.

Deployed public model on Replicate: https://replicate.com/nicknaskida/incredibly-fast-whisper

GitHub repo with fixes and latest features: https://github.com/NickNaskida/insanely-fast-whisper
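A rough sketch of how the speaker options could be passed when building the input for the model. Only `num_speakers`, `min_speakers`, and `max_speakers` come from the description above. The `audio` and `batch_size` keys, the helper name `build_diarization_input`, and the rule that an exact speaker count should not be combined with a min/max range are assumptions, so check the model's input schema on Replicate before relying on them.

```python
def build_diarization_input(audio_url, batch_size=24, num_speakers=None,
                            min_speakers=None, max_speakers=None):
    """Build an input dict for the model (key names are assumptions)."""
    has_range = min_speakers is not None or max_speakers is not None
    if num_speakers is not None and has_range:
        # Assumed rule: an exact count and a min/max range contradict each other.
        raise ValueError("use either num_speakers or min_speakers/max_speakers")
    if (min_speakers is not None and max_speakers is not None
            and min_speakers > max_speakers):
        raise ValueError("min_speakers must not exceed max_speakers")

    payload = {"audio": audio_url, "batch_size": batch_size}
    options = {
        "num_speakers": num_speakers,
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
    }
    # Only send the options that were actually set.
    payload.update({k: v for k, v in options.items() if v is not None})
    return payload


# Placeholder URL, for illustration only.
payload = build_diarization_input(
    "https://example.com/meeting.mp3", batch_size=16,
    min_speakers=2, max_speakers=4,
)
print(payload)
```

The resulting dict would then be passed as the `input` argument of `replicate.run(...)` from the Replicate Python client, together with the model reference and version shown on the model page.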

Enjoy, hope it will be useful 🤗