Closed: nielka98 closed this issue 1 year ago.
Do attach the file whisper_log.txt
... you probably need a gfx card with more memory.
Which model are you using? You could try a smaller one.
Could you please write it down? I don't understand what is written there.
It's "written down" there. :) What is your GFX video card and how much video memory does it have?
I have no idea; probably 8 GB, and the VRAM is 6144 MB.
Then who knows? :) Anyway, from your log I can see that it ran the large model on some files without error; you can reduce memory usage by adjusting some parameters. Try the -bo=1 parameter; with it the program will use a bit less memory. If you still get the error, then use --temperature_increment_on_fallback=None. These parameters can reduce accuracy a little bit.
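For reference, the two memory-reducing parameters named above would be entered together on one line (this is just a sketch combining the flags exactly as given in this thread; exact spellings may vary between Whisper-Faster releases):

```
-bo=1 --temperature_increment_on_fallback=None
```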
I should use the commands you wrote for me in the program, right?
Press the "Advanced" button and write the commands there. For example, like in the screenshot below:
OK, that's what I meant, but I think I did something wrong, because I switched from GPU to CPU and I don't know how to reverse it.
Make a screenshot of SE when you select "Audio to Text".
But with the command you told me, it probably worked because it continues to generate the text
Looks OK; if it used the GPU before, then it's using the GPU now too.
Btw, you can force the device with the --device=cpu or --device=cuda parameters.
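So a device override in SE's Advanced box is just one of these two flags (taken verbatim from the thread; --device=cuda assumes a working CUDA setup):

```
--device=cpu
--device=cuda
```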
ok
When the program is running you can press F2, and at the top you can see which device is in use.
ok
Let me use this thread to ask something about Whisper-Faster r160. I've been using r153 just fine and decided to update to the r160 version, but now it just doesn't do anything; it gets stuck on "Starting transcription on...". I downloaded the r160 version directly from the Purfview repo. I had to downgrade to r153 and it's working again. Thoughts?
@JDTR75 What does --verbose show?
What argument should I use for --verbose?
True
Here you go: whisper_log.txt
Did the program stop by itself, or did you cancel it?
What if you use the medium model?
It stopped by itself. Same thing with medium model. whisper_log.txt
This works fine with r153 and large model, just fyi.
That's strange. Try to run it in console. And check what is reported in the Event Viewer.
I'm getting this from console now 🙃
Could not load library cudnn_cnn_infer64_8.dll. Error code 126 Please make sure cudnn_cnn_infer64_8.dll is in your library path!
And I have the libraries where they should be
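A quick way to spot which runtime libraries are actually absent is to check the faster-whisper folder directly. The sketch below is a plain shell loop, not an official tool; it only checks the two DLLs mentioned in this thread, and assumes you run it inside your Whisper-Faster folder:

```shell
# Report required runtime DLLs that are not present in the current folder.
# Only the two DLLs discussed in this thread are listed; extend as needed.
for dll in cudnn_cnn_infer64_8.dll zlibwapi.dll; do
  if [ ! -f "$dll" ]; then
    echo "missing: $dll"
  fi
done
```

If it prints nothing, the files are at least present; error code 126 can then still mean a DLL's own dependencies are missing.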
Make a screenshot of the console and the folder with faster-whisper.
Console:
Faster-Whisper folder:
There is one dll missing.
You're right, zlibwapi.dll was missing. It worked from console with medium.
Another question: I'm running these tests on a separate SE instance - not the usual one I use. The only difference I can think of between the two SEs is the ffmpeg each one has. Could this be the reason? The new and clean instance where I'm running the tests has ffmpeg 6.0-essentials_build-www.gyan.dev, and the one on my usual SE instance is ffmpeg 5.1.2-full_build-www.gyan.dev.
Why was it missing? Didn't SE download those libs automatically?
> Another question, I'm running these tests on a separate SE instance - not the usual one I use. The only difference I can think of between the two SEs is the ffmpeg each one has. Could this be the reason?
"The reason" of what?
Because I copied them from the ones I already have in my other instance.
The reason for not running. Could it be the different versions of ffmpeg?
I was further testing with the clean setup, and r160 still can't run the large model on this audio, but r153 did. Here's what's happening, just FYI, and I can happily live with r153 - just curious as to why this is happening.
https://github.com/SubtitleEdit/subtitleedit/assets/1891941/06964778-0c8c-4e4b-87bd-0117b1dc491f
ffmpeg version can have influence -> https://github.com/SubtitleEdit/subtitleedit/issues/6946
And the default compute type differs between the r153 and r160 versions. In r153 the default is int8_float32; in r160 it's auto, which on your gfx card defaults to int8_float16. You can try float16 and check which type is fastest for you. [Use --temperature_increment_on_fallback=None in benchmarks.]
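So a like-for-like benchmark would pin the compute type explicitly and disable the fallback on each run, e.g. (parameter names taken from this thread; one line per run):

```
--compute_type=float16 --temperature_increment_on_fallback=None
--compute_type=int8_float16 --temperature_increment_on_fallback=None
```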
Btw, you are not the first to report issues with values over the thresholds with int8_float16.
Yes, I read that thread and that's why I asked you about ffmpeg. Will try with --compute_type now.
Having said this, would you recommend sticking to ffmpeg 5 or 6?
Imo, stick to 5.
@JDTR75 Could you share that audio where you had the issue with values over the thresholds with int8_float16?
Hey @Purfview ! Unfortunately, I can't - confidentiality reasons
Could you then make a few tests on that file?
Sure!
Got it. You want me to test it with int8_float16 exclusively?
Use --compute_type=auto --verbose=true
Here you go. Both worked fine. I removed the actual transcription on each one.
Thanks, could you make the same "txt" with the old libs?
Looks like some bug in the older libs; I'll upload updated libs. Thanks for the tests.
Btw, all 3 users who reported this issue have a "GTX 1650". :)
You're welcome! Thank you for double-checking. I have a quick question: is there a way (via command line or something) to get the voices/speakers "marked"? I mean, like having a chevron >> or a hyphen - when it detects different voices/speakers.
Not in my project.
Got it. Thx for confirming
How to fix this: RuntimeError: CUDA failed with error out of memory