Installed this to use an existing WhisperX Conda installation (which uses my Nvidia GPU) and chose the GPU option, but it only uses the CPU

cleverestx commented 7 months ago

The GUI won't let me switch to GPU. Did I miss something? The instance of WhisperX from the command line IS using my Nvidia GPU.

I'm using Windows 11.

Pikurrot commented 7 months ago

Hi, if you confirmed GPU use when the program asked, then this behavior is not expected. However, you should be able to solve it as follows: check the config.json file inside the configs folder and make sure it contains a field named "gpu_support" that is set to true, like this:

{
    "env_name": "<YOUR_ENVIRONMENT_NAME>",
    "gpu_support": true,
    "auto_update": true
}

Rerun the GUI and you should be able to select the "cuda" option for GPU.
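If you prefer not to edit the file by hand, the check can be sketched in a few lines of Python. This is only an illustration: it assumes the file sits at configs/config.json relative to the folder you run it from, and the helper name ensure_gpu_support is hypothetical, not part of whisper-gui.

```python
import json
from pathlib import Path


def ensure_gpu_support(config_path):
    """Set "gpu_support" to true in the config file; return True if the file changed."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if config.get("gpu_support") is True:
        return False  # already enabled, nothing to do
    config["gpu_support"] = True
    path.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return True


if __name__ == "__main__":
    # Assumed location: the configs folder next to whisper-gui.bat.
    config_file = Path("configs") / "config.json"
    if config_file.exists():
        changed = ensure_gpu_support(config_file)
        print("gpu_support enabled" if changed else "gpu_support was already true")
    else:
        print(f"{config_file} not found")
```

The other fields ("env_name", "auto_update") are left untouched; only the one key is added or flipped.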

Note that this only works if your environment was set up correctly and PyTorch for GPU is installed. If that doesn't work, I suggest creating a new environment, which is the recommended way. To do that, first delete the config.json file, then run whisper-gui.bat as if it were the first time.
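To see whether the PyTorch build in the environment can actually reach the GPU, a minimal check such as the following can be run inside the activated Conda environment. The function name cuda_status is made up for this sketch; torch.cuda.is_available() is the standard PyTorch call.

```python
import importlib.util


def cuda_status():
    """Return (torch_installed, cuda_available) for the current environment."""
    if importlib.util.find_spec("torch") is None:
        return (False, False)
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except (ImportError, OSError):
        return (False, False)  # present on disk but not loadable
    return (True, bool(torch.cuda.is_available()))


if __name__ == "__main__":
    installed, available = cuda_status()
    if not installed:
        print("PyTorch is not installed (or cannot be loaded) in this environment")
    elif available:
        print("PyTorch can see a CUDA device")
    else:
        print("PyTorch is installed, but CUDA is not available (CPU-only build or driver issue)")
```

If this reports that CUDA is not available, changing "gpu_support" alone is unlikely to help, and recreating the environment is the safer route.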

Let me know if you have more trouble and, if so, attach the logs that appear in the terminal when you set up the program for the first time, so that I can know exactly when the bug appears.

I hope this helps.

cleverestx commented 7 months ago

Thank you for the response, but strangely enough I don't appear to have a config.json file in that folder or anywhere else...I only have these two files in the config folder:

Because I lack it entirely as if I "deleted" it (as per your instructions above), I ran the whisper-gui.bat

and got this error/result:

??

Pikurrot commented 7 months ago

This seems to be an error that occurs when installing additional dependencies. I will look into it and try to reproduce it.

cleverestx commented 7 months ago

Thanks. Let me know if you need anything else from me.

Pikurrot commented 7 months ago

Hi, I was able to reproduce your error on Windows; it should be fixed now. I suggest you delete your config.json file if you had one, and run whisper-gui.bat again, choosing to create a new environment.

However, I realized some errors may arise due to the latest releases of some packages:

RuntimeError: Unsupported model binary version: this is easy to fix. Just delete the downloaded Whisper model (inside models/whisperx/) and let the GUI install the latest version of that model.
RuntimeError: Library cublas64_12.dll is not found or cannot be loaded: unfortunately, this error seems to be a problem with the latest major release of CTranslate2, which added support for CUDA 12 but now seems to have problems with CUDA 11. I hope they fix it soon; I will stay tuned.

For now, this last error only happens on Windows for me; I had no problems running the program on Linux. Maybe you could find a workaround with WSL.
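To find out which cuBLAS version is visible on a Windows machine, one rough approach is to list the cublas64_*.dll files in the directories on PATH. This is a sketch with a hypothetical helper name (find_cublas_dlls), and PATH is only one of the places Windows searches for DLLs, so an empty result is a hint rather than proof.

```python
import os
from pathlib import Path


def find_cublas_dlls(search_dirs):
    """Return sorted file names matching cublas64_*.dll in the given directories."""
    found = set()
    for directory in search_dirs:
        folder = Path(directory)
        if folder.is_dir():
            found.update(p.name.lower() for p in folder.glob("cublas64_*.dll"))
    return sorted(found)


if __name__ == "__main__":
    path_dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    names = find_cublas_dlls(path_dirs)
    if names:
        print("cuBLAS libraries found on PATH:", ", ".join(names))
    else:
        print("No cublas64_*.dll found on PATH")
```

Seeing only cublas64_11.dll would be consistent with the "cublas64_12.dll is not found" error above.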

cleverestx commented 7 months ago

Thanks for the details! I guess I'm stuck waiting for CTranslate2 to be fixed.... :-(

I can try WSL...I have Ubuntu installed, but I have WhisperX already working via command line in Windows. I guess I can reinstall everything there for now.

Same issue with my Ubuntu WSL installation....odd, I guess I'll just wait.

...also I don't have a models\whisperx\ folder to delete a model from in the Windows version (to fix the first thing you listed), but I'll re-run the command anyway, so I guess I'm just waiting in either case.

Pikurrot commented 7 months ago

I've posted an issue in the CTranslate2 repo: OpenNMT/CTranslate2#1630 Let's hope they fix it soon :)

cleverestx commented 7 months ago

I saw this elsewhere.....if it helps...

[Screenshot: SmartSelect_20240227_125709_Chrome]

Pikurrot commented 7 months ago

Yeah, so basically, there appear to be 2 solutions:

1. CTranslate2 only supports CUDA 12.x now, so install the latest version of Nvidia CUDA. That alone should make it work, even with pytorch-cuda=11.8.
2. Rename the file cublas64_11.dll to cublas64_12.dll in "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin". You can literally just do that, with no need to install a newer CUDA or pytorch-cuda. However, I think this is a temporary fix that won't be useful in the future, but for now it's an option. Also, I don't know whether it has other undesired consequences.
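The second option can be sketched in Python as follows. Note one deliberate difference: this sketch copies the DLL instead of renaming it, so the original cublas64_11.dll stays in place for anything that still expects it. The helper name add_cublas12_alias is hypothetical, and writing to Program Files normally requires an elevated (administrator) prompt.

```python
import shutil
from pathlib import Path


def add_cublas12_alias(bin_dir):
    """Copy cublas64_11.dll to cublas64_12.dll in bin_dir; return the new path or None."""
    source = Path(bin_dir) / "cublas64_11.dll"
    target = Path(bin_dir) / "cublas64_12.dll"
    if not source.is_file() or target.exists():
        return None  # nothing to copy, or a cublas64_12.dll is already there
    shutil.copy2(source, target)
    return target


if __name__ == "__main__":
    # Path taken from the comment above; adjust it if CUDA is installed elsewhere.
    cuda_bin = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin"
    try:
        created = add_cublas12_alias(cuda_bin)
    except PermissionError:
        print("Permission denied: run this from an administrator prompt")
    else:
        print(f"Created {created}" if created else "Nothing to do")
```

To undo the workaround later, delete the copied cublas64_12.dll.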

Let me know if that solved your problem.

cleverestx commented 7 months ago

I think I've already tried number 1, and it still had a problem...but I'm gonna double check that. I will let you know!

cleverestx commented 7 months ago

Hmm, well I just ran whisper-gui.bat and it asked me to update, and enable auto-updates and then started up. I can select CUDA now, so I guess it's all working now! :-)

One more question: how do I use --task translate through this GUI for a given video/audio file? I use this flag for most of the videos I process with WhisperX.

Thank you for all the help and info.

Pikurrot commented 7 months ago

I'm glad it's working for you now! As for --task translate, it's not yet an available option in the GUI, but I will take it into account for a future update. Thanks for pointing it out!
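In the meantime, the command-line route still works. As a rough sketch, the call could be wrapped in Python like this, assuming the whisperx executable is on PATH in the active environment; the helper names and the file name video.mp4 are placeholders, not part of whisper-gui.

```python
import subprocess


def build_translate_command(media_path, extra_args=()):
    """Build the whisperx command line that translates the given file."""
    return ["whisperx", str(media_path), "--task", "translate", *extra_args]


def run_translate(media_path, extra_args=()):
    """Run whisperx with --task translate; raises CalledProcessError on failure."""
    return subprocess.run(build_translate_command(media_path, extra_args), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_translate() to actually execute it.
    print(" ".join(build_translate_command("video.mp4")))
```

Any further flags you already use on the command line can be passed through extra_args unchanged.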

I'm closing this issue as it's been solved. Feel free to open a new issue for a new suggestion for this project or if you encounter any more bugs. Thanks.

chrisangel666 commented 3 months ago

Hello, your code is very good. I would like to ask how to add localization options. I want to translate the interface into Chinese.

chrisangel666 commented 3 months ago

Also, when the subtitles and audio files are forced to align, if the wav2vec2 model of the corresponding language is selected for processing, will the effect be better than the current general model?

chrisangel666 commented 3 months ago

Every time I run the .bat file, I get this message: "torchvision is not available - cannot save figures". What is going on? Is there anything I need to do?

Pikurrot commented 3 months ago

> Hello, your code is very good. I would like to ask how to add localization options. I want to translate the interface into Chinese.

Hi, thank you! I also wanted to add an option to change the language. We can work on that together if you want; just tell me and we can open a discussion page in this repo to talk about it.

> Also, when the subtitles and audio files are forced to align, if the wav2vec2 model of the corresponding language is selected for processing, will the effect be better than the current general model?

Yes, I would say the transcription should be more accurate.

> Every time I run the .bat file, I get this message: "torchvision is not available - cannot save figures". What is going on? Is there anything I need to do?

This is fine, not a problem. It's just a warning, which you can silence by installing torchvision or safely ignore.

chrisangel666 commented 3 months ago

Nice. How can I add the corresponding wav2vec2 model to the options? Does that mean modifying the corresponding item in the code? If so, how?

Pikurrot commented 3 months ago

Wav2vec2 is an alternative, different model to Whisper. If we were to use it, we would need to add an option to choose between both, and modify some functions like _transcribe() and add a function transcribe_wav2vec2().

Are you interested in contributing?

chrisangel666 commented 3 months ago

How?

Pikurrot commented 3 months ago

To contribute, you can fork this repo, make your changes, and then submit them to this repo via a pull request. Or we can simply discuss ideas in a discussion page.