Pikurrot / whisper-gui

A simple GUI to use Whisper.
MIT License
89 stars, 6 forks

Installed this to use an existing WhisperX Conda installation (that uses my Nvidia GPU) and chose the GPU option, but it only uses CPU #15

Closed cleverestx closed 2 months ago

cleverestx commented 7 months ago

The GUI won't let me switch from CPU to GPU. Did I miss something? The instance of WhisperX from the command line IS using my Nvidia GPU.

I'm using Windows 11.

image

Pikurrot commented 7 months ago

Hi, if you confirmed GPU use when the program asked, then this behavior is not expected. However, you should be able to solve it as follows: check the config.json file inside the configs folder and make sure it actually contains a field named "gpu_support" that is set to true, like this:

{
    "env_name": "<YOUR_ENVIRONMENT_NAME>",
    "gpu_support": true,
    "auto_update": true
}

Rerun the GUI and you should be able to select the "cuda" option for GPU.
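If you prefer to check the file from a script, here is a minimal Python sketch of that check. It assumes the file sits at configs/config.json relative to the repository folder, as described above; the helper name gpu_support_enabled is made up for this example and is not part of whisper-gui.

```python
import json
from pathlib import Path


def gpu_support_enabled(config_path):
    """Return True only if the config file exists and has "gpu_support": true."""
    path = Path(config_path)
    if not path.is_file():
        # A missing file means the first-time setup has not written a config yet.
        return False
    with path.open(encoding="utf-8") as fh:
        config = json.load(fh)
    # Require a JSON object holding the literal value true, not any truthy value.
    return isinstance(config, dict) and config.get("gpu_support") is True


if __name__ == "__main__":
    # Assumed location: configs/config.json inside the repository folder.
    print(gpu_support_enabled(Path("configs") / "config.json"))
```

Run it from the repository folder. It prints False both when the field is missing or false and when the file does not exist at all.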

Note that this will only work if your environment was set up correctly and PyTorch for GPU is installed. If that doesn't work, I suggest you choose to create a new environment, as that's the recommended way. To do that, first delete the config.json file, then run the whisper-gui.bat file as if it were the first time.
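To see whether the PyTorch in a given environment can actually reach the GPU, a small sketch like the one below may help. It only uses standard PyTorch calls (torch.cuda.is_available, torch.cuda.get_device_name, torch.version.cuda); the function name cuda_status is just for this example. Run it with the Python of the environment named in "env_name".

```python
def cuda_status():
    """Describe whether the active environment's PyTorch can see a CUDA GPU."""
    try:
        import torch  # imported lazily so the script still runs without PyTorch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        return "CUDA available: " + torch.cuda.get_device_name(0)
    # torch.version.cuda is None for CPU-only builds of PyTorch.
    return "CUDA not available (PyTorch built for CUDA: {})".format(torch.version.cuda)


if __name__ == "__main__":
    print(cuda_status())
```

If this reports a CPU-only build, the "cuda" option cannot work in that environment regardless of what config.json says.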

Let me know if you have more trouble and, if so, attach the logs that appear in the terminal when you set up the program for the first time, so that I can know exactly when the bug appears.

I hope this helps.

cleverestx commented 7 months ago

Thank you for the response, but strangely enough I don't appear to have a config.json file in that folder or anywhere else...I only have these two files in the config folder:

image

Since the file is missing entirely, as if I had already "deleted" it (per your instructions above), I ran whisper-gui.bat and got this error/result:

image

??

Pikurrot commented 7 months ago

This seems to be an error that occurs when installing additional dependencies. I will look into it and try to reproduce it.

cleverestx commented 7 months ago

Thanks. Let me know if you need anything else from me.

Pikurrot commented 7 months ago

Hi, I was able to replicate your error on Windows; it should be solved now. I suggest you delete your config.json file if you have one, and run whisper-gui.bat again, selecting the option to create a new environment.

However, I realized some errors may arise due to the latest releases of some packages:

For now, this last error only happens on Windows for me; I had no problems running the program on Linux. Maybe you could find a workaround with WSL.

cleverestx commented 7 months ago

Thanks for the details! I guess I'm stuck waiting for CTranslate2 to be fixed.... :-(

I can try WSL...I have Ubuntu installed, but I have WhisperX already working via command line in Windows. I guess I can reinstall everything there for now.

Same issue with my Ubuntu WSL installation....odd, I guess I'll just wait.

...also, I don't have a models\whisperx\ folder to delete a model from in the Windows version (to fix the first thing you listed), but I'll re-run the command anyway, so I guess I'm just waiting in either case.

Pikurrot commented 7 months ago

I've posted an issue in the CTranslate2 repo: OpenNMT/CTranslate2#1630 Let's hope they fix it soon :)

cleverestx commented 7 months ago

I saw this elsewhere.....if it helps...

SmartSelect_20240227_125709_Chrome

Pikurrot commented 7 months ago

Yeah, so basically, there appear to be 2 solutions:

  1. CTranslate2 only supports CUDA 12.x now, so install the latest version of Nvidia CUDA. That alone should make it work, even with pytorch-cuda=11.8.
  2. Rename the file cublas64_11.dll to cublas64_12.dll in "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin". You can literally just do that, with no need to install a newer CUDA or pytorch-cuda. However, I think this is a temporary fix that won't be useful in the future, but for now it's an option. Also, I'm not aware of whether this has other undesired consequences.
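
Before trying either option, it can help to see which cuBLAS DLLs are actually present. This is a small Python sketch, not something the GUI does itself; the function names are made up for the example, and the folder is the CUDA 11.8 path quoted above, which may differ on your machine.

```python
from pathlib import Path


def list_cublas_dlls(cuda_bin_dir):
    """Return the sorted names of cublas64_*.dll files found in the folder."""
    folder = Path(cuda_bin_dir)
    if not folder.is_dir():
        return []
    return sorted(p.name for p in folder.glob("cublas64_*.dll"))


def has_cuda12_cublas(cuda_bin_dir):
    """True if the file name that was reported missing is present in the folder."""
    return "cublas64_12.dll" in list_cublas_dlls(cuda_bin_dir)


if __name__ == "__main__":
    # Path taken from option 2; adjust it to your own CUDA installation.
    bin_dir = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin"
    print(list_cublas_dlls(bin_dir))
    print(has_cuda12_cublas(bin_dir))
```

If only cublas64_11.dll shows up, that matches the situation the two options are meant to address.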

Let me know if that solved your problem.

cleverestx commented 7 months ago

I think I've already tried number 1, and it still had a problem...but I'm gonna double check that. I will let you know!

cleverestx commented 7 months ago

Hmm, well I just ran whisper-gui.bat and it asked me to update, and enable auto-updates and then started up. I can select CUDA now, so I guess it's all working now! :-)

One more question..how do I set --task translate to work via this GUI for a given video/audio file? I use this command flag often for most of the videos I process through Whisper-X.

Thank you for all the help and info.

Pikurrot commented 7 months ago

I'm glad it's working for you now! As for --task translate, it's not yet an available option in the GUI, but I will take it into account for a future update. Thanks for pointing it out!

I'm closing this issue as it's been solved. Feel free to open a new issue for a new suggestion for this project or if you encounter any more bugs. Thanks.

chrisangel666 commented 3 months ago

Hello, your code is very good. I would like to ask how to add localization options. I want to translate the interface into Chinese.

chrisangel666 commented 3 months ago

Also, when the subtitles and audio files are forced to align, if the wav2vec2 model of the corresponding language is selected for processing, will the effect be better than the current general model?

chrisangel666 commented 3 months ago

Every time I run the .bat file, I get this message: "torchvision is not available - cannot save figures". What is going on? Is there anything I need to do?

Pikurrot commented 3 months ago

> Hello, your code is very good. I would like to ask how to add localization options. I want to translate the interface into Chinese.

Hi, thank you! I also wanted to add an option to change the language. We can work on that together, if you want, just tell me and we open a discussion page in this repo and talk about it.

> Also, when the subtitles and audio files are forced to align, if the wav2vec2 model of the corresponding language is selected for processing, will the effect be better than the current general model?

Yes, I would say the transcription should be more accurate.

> Every time I run the .bat file, I get this message: "torchvision is not available - cannot save figures". What is going on? Is there anything I need to do?

This is fine, not a problem. Just a warning you can avoid by installing torchvision, but you can safely ignore it.

chrisangel666 commented 3 months ago

Nice. How can I add the corresponding wav2vec2 model to the options? Does that mean modifying the corresponding item in the code? If so, how?

Pikurrot commented 3 months ago

Wav2vec2 is a different model, an alternative to Whisper. To use it, we would need to add an option to choose between the two, modify some functions like _transcribe(), and add a function transcribe_wav2vec2().
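Very roughly, the "option to choose between both" could look like the Python sketch below. This is only an illustration of the idea: the two function bodies are placeholders, their signatures are assumptions, transcribe_wav2vec2() does not exist yet, and the BACKENDS table and transcribe() wrapper are hypothetical names.

```python
def _transcribe(audio_path):
    # Placeholder for the existing Whisper-based path; the real function in
    # whisper-gui has its own signature and body.
    return "whisper:" + audio_path


def transcribe_wav2vec2(audio_path):
    # Placeholder for the proposed function; nothing like it exists yet.
    return "wav2vec2:" + audio_path


# Hypothetical lookup table mapping the user's choice to a function.
BACKENDS = {"whisper": _transcribe, "wav2vec2": transcribe_wav2vec2}


def transcribe(audio_path, backend="whisper"):
    """Dispatch to the function registered for the selected backend."""
    try:
        handler = BACKENDS[backend]
    except KeyError:
        raise ValueError("unknown backend: " + repr(backend)) from None
    return handler(audio_path)
```

A lookup table keeps the GUI option and the code in sync: adding another model would mean one new function and one new entry.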

Are you interested in contributing?

chrisangel666 commented 3 months ago

How?

Pikurrot commented 3 months ago

To contribute, you can fork this repo, make your own changes, and then open a pull request to merge them into mine. Or we can simply discuss ideas in a discussion page.

chrisangel666 commented 2 months ago

Hi, master. I have seen that you updated the whisper-gui code. How do we update it and what command is used?

Pikurrot commented 2 months ago

The software should be updated automatically when you run whisper-gui.bat. With the latest update, if you see a new tab in the interface called "Settings", then it's updated.

chrisangel666 commented 2 months ago

Every time I start it, it says it can't be updated. What's going on? [image: prompt] Can I update it manually using commands?

Pikurrot commented 2 months ago

The command to update it is git pull origin master. The script tries to run git fetch to look for updates, but for some reason that seems to fail for you. Try running it manually and tell me whether it works.
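For reference, here is a minimal Python sketch of a "fetch, then see if there is anything new" check. It is not the actual update script: counting commits with git rev-list --count is my own choice for the example, the function name commits_behind is hypothetical, and it assumes git is on your PATH and the remote is called origin.

```python
import subprocess


def commits_behind(repo_dir, branch="master", run=subprocess.run):
    """Fetch from origin and return how many commits the local HEAD is behind."""
    # check=True raises CalledProcessError if git itself fails (e.g. no network).
    run(["git", "fetch", "origin"], cwd=repo_dir, check=True,
        capture_output=True, text=True)
    # Count the commits reachable from origin/<branch> but not from HEAD.
    result = run(["git", "rev-list", "--count", "HEAD..origin/" + branch],
                 cwd=repo_dir, check=True, capture_output=True, text=True)
    return int(result.stdout.strip())
```

Calling commits_behind with the path to your repository returns 0 when you are up to date; any larger number means git pull origin master has something to bring in. If the call raises an error instead, git's own message usually says why the fetch failed.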

chrisangel666 commented 2 months ago

OK, I will try

chrisangel666 commented 2 months ago

[image: try] It doesn't work

Pikurrot commented 2 months ago

OK, you first need to go to the path where your whisper-gui repository is. You can do that with cd path\to\your\repository, or by opening the folder directly, right-clicking, and opening the Terminal there. Once the path appears as C:\....\whisper-gui, you can run git pull, or run whisper-gui.bat directly; it should work now.
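If you are not sure whether the terminal is really inside the repository, a short Python sketch like this one can tell you. It simply walks up the folders looking for the .git entry that every clone has; the function name find_repo_root is made up for the example.

```python
from pathlib import Path


def find_repo_root(start):
    """Walk up from start and return the first folder containing .git, else None."""
    current = Path(start).resolve()
    for folder in [current, *current.parents]:
        if (folder / ".git").exists():
            return folder
    return None


if __name__ == "__main__":
    # "." is the folder the terminal is currently in.
    print(find_repo_root("."))
```

If it prints None, git pull will fail from that location because the folder is not part of any cloned repository.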

chrisangel666 commented 2 months ago

git pull https://github.com/Pikurrot/whisper-gui Should I go to the program root directory and run this command using cmd?

Pikurrot commented 2 months ago

The URL is not necessary, but yes, from the directory where your program is

chrisangel666 commented 2 months ago

[image: try-2] It doesn't work

Pikurrot commented 2 months ago

Your repository seems to have the wrong name. I don't know how you have it organized, so I suggest you clone the repo again with git clone https://github.com/Pikurrot/whisper-gui.git in the directory you want. Then it should work fine. If you do this, copy your config.json file from the old repository to the new one manually. Also, if you had any saved outputs or downloaded models, bring them too (they are in the outputs and models folders). I think this is the easiest way, as I can't tell exactly what is going wrong on your side.
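If you would rather script the copy than do it by hand, here is a minimal Python sketch. It assumes config.json lives in the configs folder and that outputs and models sit at the top level of the repository; the function name migrate_user_files is hypothetical, and files with the same name in the new clone are overwritten.

```python
import shutil
from pathlib import Path


def migrate_user_files(old_repo, new_repo):
    """Copy config.json, outputs/ and models/ from the old clone into the new one."""
    old_repo, new_repo = Path(old_repo), Path(new_repo)
    copied = []
    config = old_repo / "configs" / "config.json"
    if config.is_file():
        target = new_repo / "configs" / "config.json"
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(config, target)
        copied.append("configs/config.json")
    for name in ("outputs", "models"):
        source = old_repo / name
        if source.is_dir():
            # dirs_exist_ok (Python 3.8+) merges into a folder that already exists.
            shutil.copytree(source, new_repo / name, dirs_exist_ok=True)
            copied.append(name)
    return copied
```

The returned list shows what was found and copied, so an empty list means the old folder had nothing to bring over.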

chrisangel666 commented 2 months ago

OK, I'll try

chrisangel666 commented 2 months ago

[image: try-3]

chrisangel666 commented 2 months ago

I don't find a "Settings" option in the new web interface

Pikurrot commented 2 months ago

[image] You should see it there. Make sure you ran the program from the repository you just cloned, and not another one. You cloned the repo in its latest version, and you are able to run it, so I don't know what issue you are having. Sorry I can't be of much help :(

chrisangel666 commented 2 months ago

Okay, after I deleted all the files in the original folder, I ran this command again and it succeeded. Thank you.

Pikurrot commented 2 months ago

Nice, glad you found the way! I'm closing this issue now, open a new one if you ever have a different problem. Thank you! :smiley:

chrisangel666 commented 2 months ago

It would be perfect if I could select the alignment model for the corresponding language of the audio content.

chrisangel666 commented 2 months ago

Wav2vec2

Pikurrot commented 2 months ago

An alignment model for the same language as the transcription is already loaded automatically. This is also the language you select in "Advanced Options". If you mean an option to use a custom alignment model, like Wav2vec2, I will take note of it for a future update.

chrisangel666 commented 2 months ago

Nice, can't wait to try it.

chrisangel666 commented 2 months ago

a custom alignment model, like Wav2vec2, that is pretty good.