Closed cleverestx closed 2 months ago
Hi,
if you chose to use the GPU when the program asked, then this behavior is not expected. However, you should be able to solve it in the following way: check your config.json file inside the configs folder and make sure there is actually a field named "gpu_support" and that it is set to true, like this:
{
    "env_name": "<YOUR_ENVIRONMENT_NAME>",
    "gpu_support": true,
    "auto_update": true
}
Rerun the GUI and you should be able to select the "cuda" option for GPU.
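If you prefer not to edit the file by hand, the check described above can be sketched in Python. This is only a minimal sketch: the function name ensure_gpu_support is made up for illustration and is not part of whisper-gui, and the path assumes config.json sits inside the configs folder as described above.

```python
import json
from pathlib import Path


def ensure_gpu_support(config_path):
    """Set "gpu_support" to true if it is missing or not true.

    Returns True if the file was modified, False if it was already correct.
    (Hypothetical helper, not part of whisper-gui.)
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if config.get("gpu_support") is True:
        return False
    config["gpu_support"] = True
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    # Assumed location: config.json inside the configs folder.
    target = Path("configs") / "config.json"
    if target.is_file():
        print("updated" if ensure_gpu_support(target) else "already enabled")
    else:
        print(f"{target} not found")
```

Run it from the whisper-gui folder, then start the GUI again.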
Note that this will only work if your environment was set up correctly and PyTorch for GPU is installed. So, if that didn't work, I suggest you choose to create a new environment, as that's the recommended way. To do that, first delete the config.json file, then run the whisper-gui.bat file as if it were the first time.
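To see whether PyTorch for GPU is actually installed, something like the following can be run inside the same environment the GUI uses. It is a rough sketch: gpu_status is a made-up helper name, and the only PyTorch call it relies on is torch.cuda.is_available().

```python
def gpu_status():
    """Return a short label describing whether PyTorch can see a CUDA device."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment at all.
        return "torch-not-installed"
    if torch.cuda.is_available():
        return "cuda-available"
    # PyTorch is installed, but it is a CPU-only build or no usable GPU was found.
    return "cpu-only"


if __name__ == "__main__":
    print(gpu_status())
```

If this does not print "cuda-available", the "cuda" option is unlikely to work, and recreating the environment as described above is probably the simpler route.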
Let me know if you have more trouble and, if so, attach the logs that appear in the terminal when you set up the program for the first time, so that I can know exactly when the bug appears.
I hope this helps.
Thank you for the response, but strangely enough I don't appear to have a config.json file in that folder or anywhere else...I only have these two files in the config folder:
Because I lack it entirely, as if I had "deleted" it (as per your instructions above), I ran whisper-gui.bat and got this error/result:
??
This seems to be an error when trying to install additional dependencies. I will look into it and try to reproduce it.
Thanks. Let me know if you need anything else from me.
Hi, I could replicate your error on Windows; it should be solved now. I suggest you delete your config.json file if you had one, and run whisper-gui.bat again, selecting to create a new environment.
However, I realized some errors may arise due to the latest releases of some packages:
- RuntimeError: Unsupported model binary version: this is easy to fix, just delete the downloaded whisper model (inside models/whisperx/) and let the GUI install the latest version of that model.
- RuntimeError: Library cublas64_12.dll is not found or cannot be loaded: Unfortunately, this error seems to be a problem with the latest major release of CTranslate2, which added support for cuda 12 but now seems to have problems with cuda 11. I hope they fix it soon, I will stay tuned.
For now, this last error only happens on Windows for me; I had no problems running the program on Linux. Maybe you could find a workaround with WSL.
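To see which cuBLAS DLLs your system can currently find, a small check like this may help. It is only a sketch: find_on_path is a made-up helper, and it looks only at the directories listed in PATH, which is one of several places Windows searches for DLLs, so a "not found" here is a hint rather than proof.

```python
import os
from pathlib import Path


def find_on_path(filename, path_value=None):
    """Return every directory listed in PATH that contains `filename`."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(os.pathsep):
        # Skip empty entries produced by stray separators.
        if entry and (Path(entry) / filename).is_file():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    for dll in ("cublas64_11.dll", "cublas64_12.dll"):
        found = find_on_path(dll)
        print(dll, "->", found if found else "not found on PATH")
```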
Thanks for the details! I guess I'm stuck waiting for CTranslate2 to be fixed.... :-(
I can try WSL...I have Ubuntu installed, but I have WhisperX already working via command line in Windows. I guess I can reinstall everything there for now.
Same issue with my Ubuntu WSL installation....odd, I guess I'll just wait.
...also I don't have a models\whisperx\ folder to delete a model from in the Windows version (to fix the first thing you listed), but I'll re-run the command anyway, so I guess I'm just waiting in either case.
I've posted an issue in the CTranslate2 repo: OpenNMT/CTranslate2#1630 Let's hope they fix it soon :)
I saw this elsewhere.....if it helps...
Yeah, so basically, there appear to be 2 solutions:
1. pytorch-cuda=11.8.
2. cublas64_11.dll to cublas64_12.dll in "C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8\bin". You can literally just do that, and there is no need to install any newer cuda or pytorch-cuda. However, I think this is a temporary fix and won't be useful in the future, but for now it's an option. Also, I'm not aware of whether this has other undesired consequences.
Let me know if that solved your problem.
I think I've already tried number 1, and it still had a problem...but I'm gonna double check that. I will let you know!
Hmm, well I just ran whisper-gui.bat and it asked me to update, and enable auto-updates and then started up. I can select CUDA now, so I guess it's all working now! :-)
One more question..how do I set --task translate to work via this GUI for a given video/audio file? I use this command flag often for most of the videos I process through Whisper-X.
Thank you for all the help and info.
I'm glad it's working for you now!
About --task translate, it's still not an available option in the GUI, but I will take it into account for a future update, thanks for pointing it out!
I'm closing this issue as it's been solved. Feel free to open a new issue for a new suggestion for this project or if you encounter any more bugs. Thanks.
Hello, your code is very good. I would like to ask how to add localization options. I want to translate the interface into Chinese.
Also, when the subtitles and audio files are forced to align, if the wav2vec2 model of the corresponding language is selected for processing, will the effect be better than the current general model?
Every time I run the .bat file, I get this message: "torchvision is not available - cannot save figures". What is going on? Is there anything I need to do?
> Hello, your code is very good. I would like to ask how to add localization options. I want to translate the interface into Chinese.
Hi, thank you! I also wanted to add an option to change the language. We can work on that together, if you want, just tell me and we open a discussion page in this repo and talk about it.
> Also, when the subtitles and audio files are forced to align, if the wav2vec2 model of the corresponding language is selected for processing, will the effect be better than the current general model?
Yes, I would say the transcription should be more accurate.
> Every time I run the .bat file, I get this message: "torchvision is not available - cannot save figures". What is going on? Is there anything I need to do?
This is fine, not a problem. Just a warning you can avoid by installing torchvision, but you can safely ignore it.
Nice. How to add the corresponding wav2vec2 model to the options? Does it mean to modify the corresponding item in the code? If so, how?
Wav2vec2 is an alternative, different model to Whisper. If we were to use it, we would need to add an option to choose between both, and modify some functions like _transcribe() and add a function transcribe_wav2vec2().
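As a rough illustration of what "an option to choose between both" could look like, here is a sketch of a small dispatcher. Everything in it is hypothetical: the function bodies are placeholders, the signatures are guesses (the real _transcribe() is not shown in this thread), and the BACKENDS table and transcribe() wrapper are names invented for this example.

```python
from typing import Callable, Dict


def _transcribe(audio_path: str) -> str:
    # Placeholder standing in for the existing Whisper path; the real
    # function's signature and return type are assumptions.
    return f"[whisper] {audio_path}"


def transcribe_wav2vec2(audio_path: str) -> str:
    # Placeholder for the proposed Wav2vec2 path (hypothetical body).
    return f"[wav2vec2] {audio_path}"


# Maps the value of a (hypothetical) GUI option to the function that handles it.
BACKENDS: Dict[str, Callable[[str], str]] = {
    "whisper": _transcribe,
    "wav2vec2": transcribe_wav2vec2,
}


def transcribe(audio_path: str, backend: str = "whisper") -> str:
    """Route the request to the selected backend."""
    try:
        handler = BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown backend: {backend!r}") from None
    return handler(audio_path)
```

A lookup table keeps the GUI code unchanged when another backend is added later; only the table grows.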
Are you interested in contributing?
How?
To contribute, you can fork this repo, make your own changes and then merge it with mine via pull request. Or we can simply discuss ideas in a discussion page.
Hi, master. I have seen that you updated the whisper-gui code. How do we update it and what command is used?
The software should be updated automatically when you run whisper-gui.bat. With the latest update, if you see a new tab in the interface called "Settings", then it's updated.
Every time I turn it on, it says it can't be updated. What's going on? Can I update it manually using commands?
The command to update it is git pull origin master.
The script is trying to run git fetch to look for updates, but for some reason it seems to fail for you. Try it manually and tell me whether it works.
OK, I will try
It doesn't work
OK, you first need to go to the path where your whisper-gui repository is. You can do it with cd path\to\your\repository, or by going directly to the folder where you have it, right-clicking and opening the Terminal there.
Once you see that the path appears as C:\....\whisper-gui, you can run git pull, or directly run whisper-gui.bat; now it should work.
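The same manual update can be sketched as a small Python script, in case running it from a fixed location is easier than changing directories by hand. The helper names are made up for this example; the git commands themselves (git fetch, git pull origin master) are the ones mentioned in this thread, and git is assumed to be installed and on PATH.

```python
import subprocess
from pathlib import Path


def is_git_repo(repo_dir):
    """True if `repo_dir` looks like the root of a git working tree."""
    return (Path(repo_dir) / ".git").exists()


def update_commands(remote="origin", branch="master"):
    """The commands from this thread: fetch first, then pull the branch."""
    return [["git", "fetch"], ["git", "pull", remote, branch]]


def update_repo(repo_dir):
    """Run the update commands inside `repo_dir`, stopping at the first failure."""
    if not is_git_repo(repo_dir):
        raise FileNotFoundError(f"{repo_dir} is not a git repository root")
    for cmd in update_commands():
        # cwd= replaces the manual `cd` step; check=True raises if git fails.
        subprocess.run(cmd, cwd=repo_dir, check=True)
```

Calling update_repo with the path to your whisper-gui folder raises an error right away if that folder is not a git repository, which is one way to notice that the program is being run from the wrong directory.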
git pull https://github.com/Pikurrot/whisper-gui Should I go to the program root directory and run this command using cmd?
The URL is not necessary, but yes, from the directory where your program is
It doesn't work
Your repository seems to have the wrong name. I don't know how you have it organized, so I suggest you clone the repo again with git clone https://github.com/Pikurrot/whisper-gui.git in the directory you want. Then it should work fine.
If you do this, bring your config.json file from the old repository to the new one by manually copying and pasting it. Also, if you had any saved outputs or downloaded models, bring them too (they are in the folders outputs and models).
This is the easiest way, I think, as I don't know exactly what your problem is.
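If copying by hand is inconvenient, the move can be sketched in Python. This is a minimal sketch with a made-up function name; it assumes config.json lives in the configs folder (as mentioned earlier in this thread) and that outputs and models sit at the top level of the repository.

```python
import shutil
from pathlib import Path


def migrate_user_data(old_repo, new_repo):
    """Copy config.json, outputs/ and models/ from the old clone to the new one.

    Returns the list of relative paths that were copied.
    """
    old_repo, new_repo = Path(old_repo), Path(new_repo)
    copied = []

    # Assumed location: config.json inside the configs folder.
    config_rel = Path("configs") / "config.json"
    if (old_repo / config_rel).is_file():
        (new_repo / config_rel).parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(old_repo / config_rel, new_repo / config_rel)
        copied.append(config_rel.as_posix())

    for folder in ("outputs", "models"):
        if (old_repo / folder).is_dir():
            # dirs_exist_ok (Python 3.8+) merges into a folder that already exists.
            shutil.copytree(old_repo / folder, new_repo / folder, dirs_exist_ok=True)
            copied.append(folder)
    return copied
```

Nothing is deleted from the old folder, so it can be removed manually once the new clone runs correctly.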
OK, I'll try
I don't find a "Settings" option in the new web interface
You should see it there. Make sure you executed the program from the repository you just cloned, and not another one. You cloned the repo in its latest version, and you are able to execute it, so I don't know what the issue is. Sorry I can't be of much help :(
Okay, after I deleted all the files in the original folder, I ran this command again and it succeeded. Thank you.
Nice, glad you found the way! I'm closing this issue now, open a new one if you ever have a different problem. Thank you! :smiley:
It would be perfect if I could select the alignment model for the corresponding language of the audio content.
Wav2vec2
An alignment model of the same language of the transcription is already loaded automatically. It is also the language you select in "Advanced Options". If you mean an option to use a custom alignment model, like Wav2vec2, I will take note for a future update.
Nice, can't wait to try it.
a custom alignment model, like Wav2vec2, that is pretty good.
The GUI won't let me switch to CPU. Did I miss something? The instance of WhisperX from the command line IS using my Nvidia GPU.
I'm using Windows 11.