SubtitleEdit / subtitleedit

the subtitle editor :)
http://www.nikse.dk/SubtitleEdit/Help
GNU General Public License v3.0

Extra parameters for Whisper Command Line for OpenAI #8907

Closed rdr44b closed 1 month ago

rdr44b commented 1 month ago

Hi, guys,

I finally got OpenAI Whisper and WhisperX to work within Subtitle Edit. At first look, I believe it shows slightly better accuracy compared to Purfview Faster Whisper. However, I can't seem to change the PyTorch device from CPU to GPU using the Whisper command line, and the CPU is definitely taking far too much time.

In the Whisper Audio-to-text menu, under the Advanced button, there is a command-line argument to change from CPU to GPU:

--device DEVICE    device to use for PyTorch inference (default: cpu)

However, I've tried --device GPU, --device GPU0, --device gpu, --device gpu0, and --device CUDA, and none of them works. It always gives me an error message saying "No text found! Note that you have a custom argument: --device XXX".
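For reference, this is roughly the kind of command I assume Subtitle Edit ends up running under the hood (the audio path, model, and language here are just placeholders):

    whisper "C:\path\to\audio.wav" --model medium --language en --device cuda --output_format srt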

Can anyone help?

[Screenshot attachment: Clipboard_10-13-2024_01]

niksedk commented 1 month ago

--device GPU should be fine...

Check your whisper_log.txt file... perhaps you're out of memory?
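You could also check GPU memory usage from a command prompt (assuming an NVIDIA card; I can't test this myself on an Intel GPU):

    nvidia-smi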

rdr44b commented 1 month ago

Thanks, Niksedk,

It didn't work.

However, I found an issue with my Windows 11 setup. I have installed CUDA 12.4 properly, and it shows up correctly in my command prompt.

However, when I check CUDA availability in PyTorch, it says CUDA is not available.
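This is roughly the check I ran (nothing Subtitle Edit specific, just plain PyTorch):

    import torch

    print(torch.__version__)          # a CPU-only pip build ends in +cpu rather than +cu124
    print(torch.version.cuda)         # prints None on a CPU-only build
    print(torch.cuda.is_available())  # prints False in my case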

Any idea what I did wrong during the installation? Do I need to install a CUDA-enabled build of PyTorch in addition to the CUDA toolkit on Windows 11?

niksedk commented 1 month ago

Again, check your whisper_log.txt file! That's where you will hopefully learn what's wrong...

Do try "Purfview Faster Whisper" or "Whisper CPP".

rdr44b commented 1 month ago

I looked into the log file and couldn't identify any particular problem. Please see the attached txt if you are interested.

However, I found a way around it. I forced PyTorch's default device to be the GPU instead of using the OpenAI Whisper command-line argument, and it seems to be working now.

Python:

    import torch

    if torch.cuda.is_available():
        torch.set_default_device('cuda')
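For anyone else trying this, here is a minimal sketch of calling Whisper directly from Python with an explicit device instead (assumes the openai-whisper package; the model size and audio path are just placeholders):

    import torch
    import whisper  # the openai-whisper package

    # Pick the GPU if PyTorch can see it, otherwise fall back to CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    model = whisper.load_model("medium", device=device)
    result = model.transcribe(r"C:\path\to\audio.wav")
    print(result["text"])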

Also, I find Purfview Faster Whisper still works best in terms of speed and accuracy. However, I am just trying to experiment with various transcription tools.

Thanks again. And I really appreciate your effort.

whisper_log.txt

niksedk commented 1 month ago

@rdr44b: It's possible the command-line argument for GPU/CUDA should be --device cuda - but I only have an Intel GPU... It could also have something to do with CUDA 11 vs 12. Python dependencies are really hard!
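If PyTorch got installed as a CPU-only build, reinstalling it from the CUDA 12.4 wheel index might be worth a try; something like this, I believe (untested on my side, the cu124 index is just my guess for CUDA 12.4):

    pip install --force-reinstall torch --index-url https://download.pytorch.org/whl/cu124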

Also, the latest SE beta now uses Whisper CPP 1.7.1: https://github.com/SubtitleEdit/subtitleedit/releases/download/4.0.8/SubtitleEditBeta.zip

I mostly use Purfview's Faster Whisper, as it has best timestamps IMO.

Nice that you got WhisperX working... I've given up on it in later versions.

rdr44b commented 1 month ago

Neither --device cuda nor --device CUDA worked.

FYI, I didn't get WhisperX & CTranslate2 to work within Subtitle Edit. They both have the same "text not found" issue. But it is a known issue:

https://github.com/m-bain/whisperX/issues/285#issuecomment-1569049792

At this point, Purfview's Faster Whisper is still the king of Subtitle Edit transcribers, no matter how you slice and dice it. I managed to get OpenAI Whisper to work after spending nearly a day installing and configuring Python/PyTorch/CUDA, with nearly 50 GB of additional disk space, and I still get poorer speed and accuracy.

Oh well, I just had to try it to see for myself.