abdeladim-s / subsai

🎞️ Subtitles generation tool (Web-UI + CLI + Python package) powered by OpenAI's Whisper and its variants 🎞️
https://abdeladim-s.github.io/subsai/
GNU General Public License v3.0

ROCm isn't supported #67

Open MidnightKittenCat opened 10 months ago

MidnightKittenCat commented 10 months ago

Is it possible to add ROCm support for AMD GPUs?

abdeladim-s commented 10 months ago

I think PyTorch supports ROCm, so it should basically work out of the box, but I haven't tested it, so I can't say for sure. You can give it a try and let me know how it goes:

  1. Create a new virtual environment.
  2. Install a ROCm-compatible build of PyTorch:
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
  3. Then install subsai (as described in the README).

MidnightKittenCat commented 10 months ago

Doesn't seem like it; it complains about CUDA even though I used the ROCm build.

It seems to be hardcoded, and there isn't a ROCm option for the device type.

MidnightKittenCat commented 10 months ago

PyTorch does support ROCm, but this just complains about CUDA libraries.

MidnightKittenCat commented 10 months ago

RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version

abdeladim-s commented 10 months ago

Doesn't seem like it; it complains about CUDA even though I used the ROCm build.

It seems to be hardcoded, and there isn't a ROCm option for the device type.

AFAIK, PyTorch uses the cuda device name even for ROCm devices, so this is not the problem.
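One way to check this on a given install is sketched below. It assumes only that torch.version.hip is set on ROCm builds of PyTorch and is None otherwise; the helper name describe_torch_backend is made up for illustration and is not part of subsai.

```python
import importlib.util


def describe_torch_backend() -> str:
    """Report which GPU backend the installed PyTorch build targets."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"

    import torch  # imported lazily so the helper also runs without torch

    # ROCm builds set torch.version.hip; CUDA builds set torch.version.cuda.
    hip = getattr(torch.version, "hip", None)
    cuda = getattr(torch.version, "cuda", None)
    build = f"ROCm/HIP {hip}" if hip else f"CUDA {cuda}" if cuda else "CPU-only"

    # On a working ROCm setup this is True, and tensors still go to "cuda".
    usable = torch.cuda.is_available()
    return f"build: {build}; GPU usable via the 'cuda' device: {usable}"


print(describe_torch_backend())
```

If the build line reports ROCm/HIP but the GPU is not usable, the problem is more likely in the driver or runtime setup than in the device name.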

abdeladim-s commented 10 months ago

PyTorch does support ROCm, but this just complains about CUDA libraries.

If it complains about CUDA libraries, then maybe your GPU chip is not supported by PyTorch. Please first check the ROCm docs to make sure your GPU is supported, then run some tests to find out whether it is working properly!

MidnightKittenCat commented 10 months ago

It is indeed supported by ROCm. Like I said, it's not PyTorch that's complaining about it.

MidnightKittenCat commented 10 months ago

I tried a conda env, which didn't work; installing it globally didn't work either.

My GPU chip is supported, because I can use it with S.D. Next (Stable Diffusion web UI) just fine.

MidnightKittenCat commented 10 months ago

Could you try adding ROCm to the device list? I can test it out for you.

abdeladim-s commented 10 months ago

Weird! It should work with the same cuda device name as well. See this comment here as well. So you think this is a bug in the package? Do you have any idea where it is coming from? Which model are you trying to use? What is the full error you are getting?

MidnightKittenCat commented 10 months ago

I don’t think it’s an issue with PyTorch itself. It seems to me that the code could be hardcoding cuda or something similar, as the same version of PyTorch on ROCm works completely fine for me in other projects.

The full error I was getting was mainly what I’ve sent you; however, I’ll go ahead and get it for you.

Perhaps take a look at the code and see if cuda is hardcoded.

abdeladim-s commented 10 months ago

Yes, cuda is hardcoded if torch.cuda.is_available() is True. What value should I change it to so that ROCm devices are supported?
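For reference, the pattern being discussed usually looks something like the sketch below. This is an assumption about the general shape of such code, not a copy of subsai's source, and pick_device is a hypothetical helper. Because ROCm builds of PyTorch also report a usable GPU through torch.cuda.is_available() and accept cuda as the device name, the same branch should serve both vendors without a separate ROCm value.

```python
def pick_device(gpu_available: bool, index: int = 0) -> str:
    """Return a PyTorch-style device string.

    `gpu_available` is meant to be the result of torch.cuda.is_available(),
    which is also how ROCm builds of PyTorch report a usable AMD GPU.
    """
    return f"cuda:{index}" if gpu_available else "cpu"


# Typical call site (requires torch):
#   import torch
#   device = pick_device(torch.cuda.is_available())
print(pick_device(True))   # cuda:0
print(pick_device(False))  # cpu
```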

MidnightKittenCat commented 10 months ago

I believe I've found the issue: "export HSA_OVERRIDE_GFX_VERSION=10.3.0" seems to have fixed it for me, so in future cases I'd point people to this fix and see if it works for them.

Unfortunately it's just AMD making it hard to identify their own hardware.

Sorry for the troubles <3
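For anyone who would rather set the variable from Python than from the shell, a minimal sketch follows. The value 10.3.0 is the one reported in this thread and may not suit other cards; apply_gfx_override is a hypothetical helper, and the assumption is that the variable has to be in the environment before PyTorch is first imported.

```python
import os
from typing import MutableMapping


def apply_gfx_override(env: MutableMapping[str, str] = os.environ,
                       version: str = "10.3.0") -> str:
    """Set HSA_OVERRIDE_GFX_VERSION unless a value is already present."""
    return env.setdefault("HSA_OVERRIDE_GFX_VERSION", version)


# Call this before the first `import torch`, so the variable is already
# in place when the ROCm runtime starts up.
apply_gfx_override()
```

Using setdefault keeps any value already exported in the shell, so the script does not silently override a user's own setting.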

MidnightKittenCat commented 10 months ago

I would however like a feature that embeds the subtitles into the video, allowing us to download it afterwards (hopefully to get around the 200mb limit)

Or even just making a flag to disable it. Unless there is one?

abdeladim-s commented 10 months ago

I believe I've found the issue: "export HSA_OVERRIDE_GFX_VERSION=10.3.0" seems to have fixed it for me, so in future cases I'd point people to this fix and see if it works for them.

Unfortunately it's just AMD making it hard to identify their own hardware.

Sorry for the troubles <3

Great to hear that you found the source of the issue, and thanks for posting the solution. It will certainly help other AMD users facing similar issues. I will add a reference to this issue in the README file as well.

abdeladim-s commented 10 months ago

I would however like a feature that embeds the subtitles into the video, allowing us to download it afterwards (hopefully to get around the 200mb limit)

Or even just making a flag to disable it. Unless there is one?

You mean merging the video with the subtitles directly, rather than just exporting the srt file?

abdeladim-s commented 10 months ago

Yeah, the 200mb limit is imposed by the video component of Streamlit. I tried to look for alternatives that support subtitles, but unfortunately I couldn't find any! I will see if there is any other solution. You can always use the CLI; it is easy and there is no limit!

MidnightKittenCat commented 10 months ago

Sounds great! Also is it possible to have multiple files in a queue?

abdeladim-s commented 10 months ago

Yes, using the CLI. You can provide a text file containing the absolute paths of the files, one per line, and it will run them one by one.
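Building such a list file can be scripted. The sketch below only writes the text file; write_media_list and the file names are hypothetical, and the one-path-per-line layout is an assumption about what the CLI expects.

```python
from pathlib import Path
from typing import Iterable


def write_media_list(files: Iterable[str], list_path: str) -> Path:
    """Write one absolute media path per line and return the list file path."""
    out = Path(list_path)
    lines = [str(Path(f).resolve()) for f in files]
    out.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out


if __name__ == "__main__":
    # Placeholder file names; replace with real media files.
    print(write_media_list(["episode1.mp4", "episode2.mkv"], "media.txt"))
```

The resulting media.txt would then be passed to the CLI in place of a single media file; check subsai --help for the exact arguments on your version.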

MidnightKittenCat commented 10 months ago

A little update on this: it seems only "whisper" by OpenAI and "whisper timestamped" detect ROCm (cuda:0); the rest do not.

abdeladim-s commented 10 months ago

I've re-checked the device attribute of all models and fixed WhisperX, so hopefully it should be working now. Please give it a try! As for faster-whisper, I am not sure if it supports ROCm, because the implementation is in C++ and they are not using PyTorch, AFAIK.

MidnightKittenCat commented 10 months ago

For WhisperX, I'm getting this error:

ValueError: unsupported device cuda:0
Traceback:
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/subsai/webui.py", line 523, in <module>
    run()
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/subsai/webui.py", line 516, in run
    webui()
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/subsai/webui.py", line 316, in webui
    subs = _transcribe(file_path, stt_model_name, model_config)
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 194, in wrapper
    return cached_func(*args, **kwargs)
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 223, in __call__
    return self._get_or_create_cached_value(args, kwargs)
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 248, in _get_or_create_cached_value
    return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/streamlit/runtime/caching/cache_utils.py", line 302, in _handle_cache_miss
    computed_value = self._info.func(*func_args, **func_kwargs)
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/subsai/webui.py", line 189, in _transcribe
    model = subs_ai.create_model(model_name, model_config=model_config)
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/subsai/main.py", line 95, in create_model
    return AVAILABLE_MODELS[model_name]['class'](model_config)
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/subsai/models/whisperX_model.py", line 123, in __init__
    self.model = whisperx.load_model(self.model_type,
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/whisperx/asr.py", line 50, in load_model
    model = WhisperModel(whisper_arch,
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 120, in __init__
    self.model = ctranslate2.models.Whisper(

However it does say cuda:0 (which usually indicates that the gpu is detected) so something is wrong here.
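One possible reading of this error, offered as an assumption and not a confirmed diagnosis: CTranslate2, the library at the bottom of the traceback, accepts cpu, cuda or auto as its device and takes the index as a separate device_index argument, so a PyTorch-style string such as cuda:0 would be rejected regardless of GPU vendor. The hypothetical helper below shows how such a string could be split before being handed to that kind of API.

```python
from typing import Tuple


def split_torch_device(device: str) -> Tuple[str, int]:
    """Split a PyTorch-style device string into (device, device_index)."""
    name, _, index = device.partition(":")
    return name, int(index) if index else 0


print(split_torch_device("cuda:0"))  # ('cuda', 0)
print(split_torch_device("cpu"))     # ('cpu', 0)
```

Splitting the string would only address the ValueError itself; it would not make a CUDA-only backend run on a ROCm GPU.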

abdeladim-s commented 10 months ago

Oh yeah, WhisperX uses faster-whisper as its backend, so I doubt it will work in your case!

MidnightKittenCat commented 10 months ago

Ah that’s very unfortunate, thank you for trying though!

I would however like a feature that embeds the subtitles into the video, allowing us to download it afterwards (hopefully to get around the 200mb limit) Or even just making a flag to disable it. Unless there is one?

You mean merging the video with the subtitles directly, rather than just exporting the srt file?

And yes, I do mean this.

abdeladim-s commented 10 months ago

OK, I've added this feature. If you are using the web UI, you can find it in the export section. Please give it a try and let me know if you find any issues.

MidnightKittenCat commented 9 months ago

Very sorry, I was very busy.

I'm now getting this error:

Error: ffmpeg error (see stderr output for detail)
Traceback:
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/subsai/webui.py", line 535, in <module>
    run()
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/subsai/webui.py", line 528, in run
    webui()
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/subsai/webui.py", line 518, in webui
    exported_file_path = tools.merge_subs_with_video({subs_lang: subs}, str(media_file.resolve()), exported_video_filename)
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/subsai/main.py", line 299, in merge_subs_with_video
    ffmpeg.run(output_ffmpeg)
  File "/home/midnight/miniconda3/envs/caption/lib/python3.10/site-packages/ffmpeg/_run.py", line 325, in run
    raise Error('ffmpeg', out, err)

MidnightKittenCat commented 9 months ago

[image] Hopefully this helps.

MidnightKittenCat commented 9 months ago

Could we also perhaps look into this? https://huggingface.co/facebook/nllb-200-3.3B

abdeladim-s commented 9 months ago

[image] Hopefully this helps.

It seems FFmpeg cannot infer the encoding format of your video file. Are you perhaps applying the function to an audio file? If not, could you please share a sample so I can test on my end?
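A quick way to check whether a file actually contains a video stream before merging is sketched below. It assumes the ffmpeg-python package (the one shown in the traceback) and the ffprobe binary are available; has_video_stream and file_has_video are hypothetical helpers, not subsai functions.

```python
from typing import Any, Dict


def has_video_stream(probe: Dict[str, Any]) -> bool:
    """True if an ffprobe-style result lists at least one video stream."""
    return any(s.get("codec_type") == "video" for s in probe.get("streams", []))


def file_has_video(path: str) -> bool:
    """Probe a media file; requires ffmpeg-python and the ffprobe binary."""
    import ffmpeg  # lazy import: only needed when probing a real file

    return has_video_stream(ffmpeg.probe(path))


# Shape of the data being checked (illustrative values, not real output):
print(has_video_stream({"streams": [{"codec_type": "audio"}]}))  # False
```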

MidnightKittenCat commented 9 months ago

Very sorry, something came up again. I seem to have gotten it to work by using an mp4 this time.

When it comes to the 200mb limit, is this just for displaying the video itself? If so, can we have an option so that if the video is >200mb, it isn't embedded and you only get the merge video with subtitles/download video button? Thanks!

abdeladim-s commented 9 months ago

@MidnightKittenCat, I think that is already the case: if the video exceeds 200mb you just can't view it, but you can still transcribe and merge.

MidnightKittenCat commented 9 months ago

Finally got around to trying this for myself; this doesn't seem to be the case.

@MidnightKittenCat, it is already the case I think, if the video exceeds 200mb you just can't view it, but you can transcribe and merge as well.

I get "File must be 200.0MB or smaller." when inputting a file over 200mb. I was hoping it could be transcribed/merged without showing the video if it is >200mb.

matusnovak commented 2 months ago

I think Pytorch supports ROCm, so it should be supported out of the box basically, but I didn't test it so I couldn't tell. You can give it a try, and let me know how it goes :

1. create a new virtual environment.

2. install Pytorch ROCm compatible
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.4.2
3. Then install `subsai` (as described in the README)

Thank you!

I had to use rocm5.7 instead of rocm5.4.2, and that did the trick for me on Manjaro Linux 6.2.16 with an AMD Radeon 6800XT.
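For trying different ROCm versions, the install command can be generated as in the sketch below. It assumes the wheel index keeps the same URL pattern as the rocm5.4.2 one quoted above; not every ROCm version necessarily has a matching index, and rocm_install_command is a hypothetical helper.

```python
def rocm_install_command(rocm_version: str) -> str:
    """Build the pip command for a given ROCm wheel index, e.g. '5.7'."""
    index_url = f"https://download.pytorch.org/whl/rocm{rocm_version}"
    return f"pip install torch torchvision torchaudio --index-url {index_url}"


print(rocm_install_command("5.7"))
```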

insberr commented 1 month ago

I am getting "RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version". I installed PyTorch with rocm6.0 and I have an RX 6800. I am also using the m-bain/whisperX model. Any ideas on how to fix this?

abdeladim-s commented 1 month ago

I am getting RuntimeError: CUDA failed with error CUDA driver version is insufficient for CUDA runtime version. I installed pytorch with rocm6.0 and I have a RX 6800. I am also using the m-bain/whisperX model. Any ideas on how to fix this?

@insberr, maybe you should downgrade ROCm to a lower version, as described in the earlier comment. rocm5.7 worked for @matusnovak, so maybe try that one.