jhj0517 / Whisper-WebUI

A Web UI for easy subtitle generation using the Whisper model.
Apache License 2.0
1.19k stars · 172 forks

Issue with trying this with YouTube video, Docker, and Nvidia #311

Closed NightHawkATL closed 2 weeks ago

NightHawkATL commented 2 weeks ago

Which OS are you using?

I have cloned the repo, built the image, and launched the UI correctly as far as I can tell. It's on my AI VM with an Nvidia M40 12GB passed through; the UI does say it detected CUDA, and I can see it loading the model when testing YouTube transcription. It errors out and stops the container once it tries to create the file. I saw in another issue that someone was having a similar problem and you suggested they change the compute type, but I only have "float32" as an option for my setup. It's on the same VM as my Ollama and InvokeAI setups; those have access to the GPUs and are currently not in use. Here is my compose:


```yaml
services:
  app:
    # build: .
    image: jhj0517/whisper-webui:latest
    container_name: whisper_webui
    volumes:
      # Update paths to mount models and output paths to your custom paths:
      - /portainer/Files/AppData/Config/Whisper-WebUI/models:/Whisper-WebUI/models
      - /portainer/Files/AppData/Config/Whisper-WebUI/outputs:/Whisper-WebUI/outputs
      - /portainer/Files/AppData/Config/Whisper-WebUI/configs:/Whisper-WebUI/configs

    ports:
      - "7860:7860"

    stdin_open: true
    tty: true

    entrypoint: ["python", "app.py", "--server_port", "7860", "--server_name", "0.0.0.0"]

    # If you're not using an Nvidia GPU, update the device entry to match yours.
    # See more info at: https://docs.docker.com/compose/compose-file/deploy/#driver
    # GPU support
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['0']
              capabilities:
                - compute
                - utility
                - gpu
```
NightHawkATL commented 2 weeks ago

(screenshots attached)

jhj0517 commented 2 weeks ago

Hi. I don't know why ctranslate2.get_supported_compute_types("cuda") returns only float32 for your environment. I think it's probably a bug in ctranslate2.

For now, I have just allowed custom values in #312, as the error message says.

But you may have to manually enter float16 in the dropdown, as you did.
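A minimal sketch of how such a fallback could work (the helper below is hypothetical, not Whisper-WebUI's actual code): prefer the fastest compute type the backend reports, but honor a user-typed override, as #312's custom-value dropdown allows.

```python
# Hypothetical helper: pick the best compute type from what the backend
# reports (e.g. ctranslate2.get_supported_compute_types("cuda")), but let
# a manually entered value win, mirroring the custom-value dropdown.
PREFERENCE = ["float16", "int8_float16", "int8", "float32"]

def choose_compute_type(supported, user_override=None):
    """Return user_override if given, else the most preferred supported type."""
    if user_override:
        return user_override  # trust the user's manual entry
    for ct in PREFERENCE:
        if ct in supported:
            return ct
    return "float32"  # safe default

# An environment that, like the reporter's, only advertises float32:
print(choose_compute_type({"float32"}))             # float32
print(choose_compute_type({"float32"}, "float16"))  # float16 (manual entry)
```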

NightHawkATL commented 2 weeks ago

Loaded the new image and I am still getting an error. (screenshot attached)

NightHawkATL commented 2 weeks ago

The model seems to load 98 MiB into the GPU and just stops. (screenshot attached)

dng-nguyn commented 2 weeks ago

float16 requires the Maxwell architecture with Compute Capability 5.3 or above, while the Tesla M40 only supports 5.2.
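That cutoff can be expressed as a simple compute-capability check (the helper name below is illustrative; on a real system you would get the version pair from `torch.cuda.get_device_capability()`):

```python
def supports_fp16(major: int, minor: int) -> bool:
    """True if the GPU's compute capability is >= 5.3 (fp16 arithmetic)."""
    return (major, minor) >= (5, 3)

# The Tesla M40 reports compute capability 5.0-generation Maxwell at 5.2,
# so fp16 kernels are unavailable on it.
print(supports_fp16(5, 2))  # False (Tesla M40)
print(supports_fp16(5, 3))  # True
```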

jhj0517 commented 2 weeks ago

faster-whisper now needs a CUDA version of at least 12.1.

You can see CUDA version compatibility with your GPU here:

As @dng-nguyn said, it seems that the M40 only supports a really old version of CUDA, and it may take some struggling to set up.

Using just the CPU may be a better choice, although it's slower.

NightHawkATL commented 2 weeks ago

I am running CUDA 12.6 as shown in my nvidia-smi screenshot from earlier. I guess if your container can't support my card, I will just not use it until I can afford a better card.

dng-nguyn commented 2 weeks ago

The card is supported in CUDA 12, though (Maxwell microarchitecture); it just doesn't support float16.

Have you tried running it bare-metal? Or changing the whisper model to openai's?

Since you're manually building the image, this may help you.

jhj0517 commented 2 weeks ago

@NightHawkATL Ah, sorry. I misread the table. Tesla M40 supports CUDA 12.x.

> Could not load library `libcudnn_ops_infer.so.8`

I didn't notice that. Lately GitHub doesn't let me open images in a new tab, which makes small images difficult to read. It seems the same as #271. If your OS were Windows, I would recommend Purfview's solution, but since yours is Linux, the comment @dng-nguyn pointed out should help!

```shell
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb
dpkg -i cuda-keyring_1.0-1_all.deb
apt update && apt upgrade
apt install libcudnn8 libcudnn8-dev
```

This should manually install the missing .so files.

And if it's still problematic, you can try using openai's whisper implementation by editing docker-compose.yaml to change the entrypoint:

https://github.com/jhj0517/Whisper-WebUI/blob/f3f351e561fda6292cf6c31645df2e3e8a37c729/docker-compose.yaml#L19

to

```yaml
entrypoint: ["python", "app.py", "--server_port", "7860", "--server_name", "0.0.0.0", "--whisper_type", "whisper"]
```
jhj0517 commented 2 weeks ago

I investigated the issue.

> Could not load library `libcudnn_ops_infer.so.8`

Since this was a version incompatibility between faster-whisper (CTranslate2) and torch >= 2.4.0, I downgraded torch in #318.
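A sketch of the version guard behind that pin (pure string parsing, helper name mine; it treats anything below 2.4.0 as compatible, per the downgrade above):

```python
def torch_ok_for_ct2(version: str) -> bool:
    """True if the torch version is below 2.4.0, the pin applied in #318."""
    # Local build tags like "2.4.0+cu121" still parse: only major.minor matter.
    major, minor = (int(x) for x in version.split(".")[:2])
    return (major, minor) < (2, 4)

print(torch_ok_for_ct2("2.3.1"))        # True  -> compatible
print(torch_ok_for_ct2("2.4.0+cu121"))  # False -> triggers the libcudnn error
```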

So the new image works fine now! @NightHawkATL

If you still face the same bug, please feel free to re-open!