JeffWDH / tarball.ca-comments

posts/home-assistant-wyoming-whisper-cuda-gpu-support/ #3

Closed utterances-bot closed 2 weeks ago

utterances-bot commented 1 year ago

Home Assistant - Enabling CUDA GPU support for Wyoming Whisper Docker container | tarball.ca

How to enable CUDA GPU support for the Wyoming Whisper Docker container using Docker Compose.

https://www.tarball.ca/posts/home-assistant-wyoming-whisper-cuda-gpu-support/

ab-tools commented 1 year ago

Hello,

First, thanks a lot for providing this guide - I'm also trying to run Whisper with CUDA support.

But after following your guide carefully, I get the following error message from Whisper in the logs:

whisper  | Traceback (most recent call last):
whisper  |   File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
whisper  |     return _run_code(code, main_globals, None,
whisper  |   File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
whisper  |     exec(code, run_globals)
whisper  |   File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/__main__.py", line 136, in <module>
whisper  |     asyncio.run(main())
whisper  |   File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
whisper  |     return loop.run_until_complete(main)
whisper  |   File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
whisper  |     return future.result()
whisper  |   File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/__main__.py", line 112, in main
whisper  |     whisper_model = WhisperModel(
whisper  |   File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/transcribe.py", line 58, in __init__
whisper  |     self.model = ctranslate2.models.Whisper(
whisper  | ValueError: This CTranslate2 package was not compiled with CUDA support
whisper exited with code 0

Am I doing something wrong here?

Best regards and thanks for any tip on this, Andreas

JeffWDH commented 1 year ago

Hi Andreas, are you using the same docker image? (image: rhasspy/wyoming-whisper:latest)

ab-tools commented 1 year ago

Hello Jeff,

I appreciate your super quick reply, thanks a lot! :-)

And yes, I do use image: rhasspy/wyoming-whisper:latest, too.

Not sure if the CPU architecture matters? It's aarch64/arm64.

Best regards Andreas

JeffWDH commented 1 year ago

Unfortunately it looks like the precompiled ctranslate2 package for aarch64 does not include CUDA support: https://github.com/OpenNMT/CTranslate2/issues/1306

You may be able to compile ctranslate2 on your host system and add the resulting libraries to the Whisper container, similar to how the libcu* libraries are being added.
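
For reference, building CTranslate2 with CUDA support from source generally looks like the following (a sketch only, not verified on aarch64; WITH_CUDA, WITH_CUDNN, and WITH_MKL are CTranslate2's documented CMake options, and the CUDA toolkit plus cuDNN must already be installed on the host):

git clone --recursive https://github.com/OpenNMT/CTranslate2.git
cd CTranslate2
mkdir build && cd build
# enable the CUDA backend; MKL is x86-only, so it is disabled here
cmake .. -DWITH_CUDA=ON -DWITH_CUDNN=ON -DWITH_MKL=OFF
make -j"$(nproc)"
sudo make install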

ab-tools commented 1 year ago

Thanks for the tip, Jeff - to my surprise, compiling CTranslate2 worked on the first try on my host system! :-)

I also did a make install to see where it places the libraries, and they end up here locally:

-- Installing: /usr/local/lib/libctranslate2.so.3.18.0
-- Installing: /usr/local/lib/libctranslate2.so.3
-- Installing: /usr/local/lib/libctranslate2.so

Now the question is just how to get them into the container. I've tried mounting them as volumes as you suggested, but even after adding all the mappings (I think) as follows, I still get the same error message that CUDA is not supported:

      - /usr/local/lib/libctranslate2.so.3.18.0:/usr/local/lib/libctranslate2.so.3.18.0
      - /usr/local/lib/libctranslate2.so.3:/usr/local/lib/libctranslate2.so.3
      - /usr/local/lib/libctranslate2.so:/usr/local/lib/libctranslate2.so
      - /usr/local/lib/libctranslate2.so.3.18.0:/usr/lib/aarch64-linux-gnu/libctranslate2.so.3.18.0
      - /usr/local/lib/libctranslate2.so.3:/usr/lib/aarch64-linux-gnu/libctranslate2.so.3
      - /usr/local/lib/libctranslate2.so:/usr/lib/aarch64-linux-gnu/libctranslate2.so

So it seems that it just ignores the newly compiled library and uses the "wrong" one.

Any idea how I can get the library "in the right place" inside the Docker container? How did you find out where these libraries need to go within the container file system?

Best regards and thanks again for your efforts to help, Andreas

JeffWDH commented 1 year ago

# docker exec -it whisper find / -name "*ctranslate*"
/usr/local/lib/python3.9/dist-packages/ctranslate2
/usr/local/lib/python3.9/dist-packages/ctranslate2.libs
/usr/local/lib/python3.9/dist-packages/ctranslate2.libs/libctranslate2-d86e33f3.so.3.17.1
/usr/local/lib/python3.9/dist-packages/ctranslate2-3.17.1.dist-info

Looks like the bundled library is located at /usr/local/lib/python3.9/dist-packages/ctranslate2.libs/libctranslate2-d86e33f3.so.3.17.1... I suppose if you did something like this it would be overridden with your compiled version:

- /usr/local/lib/libctranslate2.so.3:/usr/local/lib/python3.9/dist-packages/ctranslate2.libs/libctranslate2-d86e33f3.so.3.17.1:ro

There's probably a tidier/better way to do all of this but I would think this would work - at least until the image gets updated with a newer version of the library.

ab-tools commented 1 year ago

Thanks a lot, Jeff!

Using the find command I found a few more and just replaced them all:

      - /usr/local/lib/libctranslate2.so.3.18.0:/usr/local/lib/python3.9/dist-packages/ctranslate2.libs/libctranslate2-22d91c88.so.3.17.1:ro
      - /usr/local/lib/libctranslate2.so.3.18.0:/usr/local/lib/python3.9/dist-packages/ctranslate2.libs/libctranslate2-d86e33f3.so.3.17.1:ro
      - /usr/local/lib/libctranslate2.so.3.18.0:/data/CTranslate2/libctranslate2.so.3:ro
      - /usr/local/lib/libctranslate2.so.3.18.0:/data/CTranslate2/libctranslate2.so:ro
      - /usr/local/lib/libctranslate2.so.3.18.0:/data/CTranslate2/libctranslate2.so.3.18.0:ro

And indeed it seems I'm a step further now, as I get a different error message, which is a start: :-)

whisper  | Traceback (most recent call last):
whisper  |   File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
whisper  |     return _run_code(code, main_globals, None,
whisper  |   File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
whisper  |     exec(code, run_globals)
whisper  |   File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/__main__.py", line 14, in <module>
whisper  |     from .faster_whisper import WhisperModel
whisper  |   File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/__init__.py", line 1, in <module>
whisper  |     from .transcribe import WhisperModel
whisper  |   File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/transcribe.py", line 5, in <module>
whisper  |     import ctranslate2
whisper  |   File "/usr/local/lib/python3.9/dist-packages/ctranslate2/__init__.py", line 21, in <module>
whisper  |     from ctranslate2._ext import (
whisper  | ImportError: libomp.so.5: cannot open shared object file: No such file or directory
whisper exited with code 0

I wonder if this could be due to the newer CTranslate2 version? I'm now compiling CTranslate2 a second time, using the tag of version 3.17.1, which is the same one installed in that Docker container...

ab-tools commented 1 year ago

Unfortunately, also with version 3.17.1 I still get the same error: ImportError: libomp.so.5: cannot open shared object file: No such file or directory

When searching the internet for this error, people say to just run sudo apt-get install libomp-dev. Is this also possible within an existing Docker container, or does the container need to be re-created from scratch for this?

Honestly, I'm not so experienced with Docker containers so far (besides "just using" them).

JeffWDH commented 1 year ago

You could try: docker exec -it whisper apt-get install libomp-dev

ab-tools commented 1 year ago

I thought about this as well, and generally I can install it like this (after an apt-get update), but when I restart the container, the installation is gone.

And I do need to restart it at least once and change the docker-compose.yaml to add the volumes for libctranslate2.so: if I add them BEFORE installing this library, I get the error above, and as a result the container ends up in an endless restart loop that doesn't let me execute any commands, because it is restarting all the time...
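
One possible workaround (not from the thread, just a sketch assuming the container is named whisper, as in the guide's compose file) is to bake the in-container install into a new image with docker commit so it survives restarts:

# run while the container still starts, i.e. before adding the libctranslate2 volumes
docker exec -it whisper apt-get update
docker exec -it whisper apt-get install -y libomp-dev
# snapshot the modified container as a new image
docker commit whisper wyoming-whisper:libomp
# then point docker-compose.yaml at "image: wyoming-whisper:libomp" and add the volumes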

ab-tools commented 1 year ago

OK, this was harder than I anticipated, but I got it to work! :-)

I've now created my own CUDA-enabled Docker image: https://hub.docker.com/r/abtools/wyoming-whisper-cuda

Since it includes the NVIDIA CUDA packages itself (which makes it pretty big, though), the additional volumes are no longer needed.

JeffWDH commented 1 year ago

Great work! I've updated the post to point people towards your image.

sl33pydog commented 1 year ago

Can you expand more on your steps? I've been trying to get Docker on Ubuntu 20.04 on an Intel i5 with an NVIDIA T1000 GPU to work, but I keep running into libcublas and libcublasLt issues. I installed CUDA toolkit 12.3, cuDNN for 12.x, driver version 535.113.01, and libcudnn8. I basically followed https://developer.download.nvidia.com/compute/cuda/11.3.0/local_installers/cuda-repo-ubuntu2004-11-3-local_11.3.0-465.19.01-1_amd64.deb. I'm piping to this Docker container on a separate VM from Home Assistant using Wyoming. The problem though is that I get this error: "Could not load library libcublasLt.so.12. Error: libcublasLt.so.12: cannot open shared object file: No such file or directory"

Most of the libraries are located where your docker compose has them, but the libcublas files ended up in /usr/local/cuda/lib64. I tried updating the compose with - /usr/local/cuda/lib64/libcublasLt.so.12:/usr/local/cuda/lib64/libcublasLt.so.12:ro but it didn't work. Thoughts?
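
A quick way to confirm where the CUDA libraries actually live on the host before writing the volume mappings (a sketch; the paths checked here are just the usual candidates):

# list libraries known to the dynamic linker, filtered for cuBLAS
ldconfig -p | grep -i cublas
# search the common install locations directly
find /usr/local/cuda/lib64 /usr/lib/x86_64-linux-gnu -name "libcublasLt.so*" 2>/dev/null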

ab-tools commented 1 year ago

If possible, I would recommend starting with a prepared NVIDIA CUDA-enabled Docker image and building on top of that - that's what I've done as well.

Depending on the architecture you are using, there should be pre-built images available from NVIDIA: https://hub.docker.com/r/nvidia/cuda https://catalog.ngc.nvidia.com/orgs/nvidia/containers/cuda

On top of that NVIDIA image you can then add the remaining needed packages, like Whisper itself.
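
A minimal sketch of that approach for x86-64 (the base image tag, the package pin, and the model arguments are illustrative, not from this comment):

FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04

# install Python and the Wyoming faster-whisper server on top of the CUDA base
RUN apt-get update \
    && apt-get install -y --no-install-recommends python3 python3-pip libomp-dev \
    && pip3 install --no-cache-dir "wyoming-faster-whisper==1.0.1" \
    && rm -rf /var/lib/apt/lists/*

EXPOSE 10300
ENTRYPOINT ["python3", "-m", "wyoming_faster_whisper", "--uri", "tcp://0.0.0.0:10300", "--model", "small-int8", "--device", "cuda", "--data-dir", "/data"]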

sl33pydog commented 1 year ago

@ab-tools is it possible to turn your version into something that's compatible with an x86-64 processor? I guess, is there a way I can reverse what you did and fill in the missing pieces? Sorry, I haven't learned about containerization yet.

jnvd3b commented 1 year ago

I got the same error, sl33pydog. I removed the :ro from the volume mapping and that did it. It kept telling me a file wasn't there when I could see very clearly that it was, so I removed :ro and it worked immediately.
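
In other words, using sl33pydog's path as the example, the mapping changes from:

      - /usr/local/cuda/lib64/libcublasLt.so.12:/usr/local/cuda/lib64/libcublasLt.so.12:ro

to:

      - /usr/local/cuda/lib64/libcublasLt.so.12:/usr/local/cuda/lib64/libcublasLt.so.12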

sl33pydog commented 1 year ago

Unfortunately, that didn't work for me @jnvd3b. I think my problem is that however I install all the requirements, they never all end up in the same folder as described in the docker-compose file. Sometimes cudnn installs in the right folder, but then the cublas files install in the CUDA directory under /usr/local.

jnvd3b commented 1 year ago

Ah, I copied them over into the same directory, as when I unpacked cudnn it wasn't in the 'right' spot.
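
For example (a sketch only; the exact source and destination depend on how the packages were unpacked):

# copy the stray cuBLAS libraries into the directory the compose file mounts from
sudo cp /usr/local/cuda/lib64/libcublas*.so.12 /usr/lib/x86_64-linux-gnu/
# refresh the dynamic linker cache
sudo ldconfig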

sl33pydog commented 1 year ago

@jnvd3b that got me past that issue, but now it seems my cudnn is not initializing. Seems each time I talk to you all, I get a step closer. Gonna start from scratch again to see if I can at least get that to work on its own in Docker.
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-5' coro=<AsyncEventHandler.run() done, defined at /usr/local/lib/python3.9/dist-packages/wyoming/server.py:28> exception=RuntimeError('cuDNN failed with status CUDNN_STATUS_NOT_INITIALIZED')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/dist-packages/wyoming/server.py", line 35, in run
    if not (await self.handle_event(event)):
  File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/handler.py", line 75, in handle_event
    text = " ".join(segment.text for segment in segments)
  File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/handler.py", line 75, in <genexpr>
    text = " ".join(segment.text for segment in segments)
  File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/transcribe.py", line 162, in generate_segments
    for start, end, tokens in tokenized_segments:
  File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/transcribe.py", line 186, in generate_tokenized_segments
    result, temperature = self.generate_with_fallback(segment, prompt, options)
  File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/transcribe.py", line 279, in generate_with_fallback
    result = self.model.generate(
RuntimeError: cuDNN failed with status CUDNN_STATUS_NOT_INITIALIZED

sl33pydog commented 1 year ago

Success! I installed the latest cudnn for 11.x and it now works. I can see it working in nvtop now. Thank you to all who helped me. Now I just need to write down my instructions so I can bring in other AI GPU tools. It's really fast now compared to CPU mode. Running on an older i5 and a T1000 GPU.

sl33pydog commented 1 year ago

For those going down a similar path of trying to get faster-whisper and CodeProject.AI in Docker using the GPU with the method above: I found that CUDA 11.8 and the latest cudnn work perfectly, and on a 4 GB graphics card there is good headroom left over.

derhappy commented 1 year ago

@ab-tools: I've tried your Docker container, but unfortunately it fails for me. The container seems to be running, but the log says exec /usr/bin/bash: exec format error 13 times. Also, the integration is no longer available in Home Assistant, and esphome returns [E][voice_assistant:614]: Error: stt-stream-failed - speech-to-text failed. It does not seem to do anything: the folder assigned to the container stays empty, so it does not load the model. Choosing the folder with the model downloaded by the "normal" container doesn't help.

The NVIDIA/container installation seems to be working, at least superficially: I have another container (frigate) using the GPU without issues, and sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi works fine as well.

here's my compose:

  whisper:
    container_name: whisper
    image: abtools/wyoming-whisper-cuda:latest
    restart: unless-stopped
    command: --model small-int8 --language de --beam-size 5 --device cuda
    volumes:
      - /lightning/lightning/apps/portainer/data/whisper:/data
    environment:
      - TZ=Europe/Berlin
    ports:
      - 10300:10300
    networks:
      - default
    runtime: nvidia
    deploy:
        resources:
          reservations:
            devices:
              - driver: nvidia
                count: 1
                capabilities: [gpu]

I'd be really happy if you have a clue what's wrong.

justynbell commented 1 year ago

@derhappy: His container only applies to the ARM architecture. If you're trying to run it on an x86 (or other) architecture, it'll fail with the exec format error.

@JeffWDH I was slightly confused at first about that as well. @ab-tools's image does not replace your guide: you still need to follow it and pass those libraries through if you want to run the x86 container provided by HA.
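
To spot this mismatch up front, compare the host architecture with the image's (a sketch; the image must be pulled locally for inspect to work):

# host architecture: x86_64 vs. aarch64
uname -m
# architecture the image was built for
docker image inspect abtools/wyoming-whisper-cuda:latest --format '{{.Architecture}}'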

RandomLegend commented 11 months ago

@justynbell I just stumbled into the same pit as @derhappy by trying to run this on x86.

Does someone know a way to run this on x86 with GPU support?

jerblack commented 10 months ago

@ab-tools can you please share the Dockerfile you used to build this image so we can adapt it for x86?

ab-tools commented 10 months ago

Hello @jerblack,

sure, that's the content of the Dockerfile I used:

FROM nvcr.io/nvidia/l4t-jetpack:r35.3.1

# Install Whisper
WORKDIR /usr/src
ARG WHISPER_VERSION='1.0.1'

RUN \
    apt-get update \
    && apt-get install -y --no-install-recommends \
        build-essential \
        python3 \
        python3-dev \
        python3-pip \
        libomp-dev \
        wget \
    \
    && pip3 install --no-cache-dir -U \
        setuptools \
        wheel \
    && cd /tmp \
    && wget https://it-project-planning.com/api/libctranslate2.so.3.17.1 \
    && mv libctranslate2.so.3.17.1 /usr/lib/aarch64-linux-gnu/libctranslate2.so.3 \
    && wget https://it-project-planning.com/api/ctranslate2-3.17.1-cp38-cp38-linux_aarch64.whl \
    && pip install -U wheel ctranslate2-3.17.1-cp38-cp38-linux_aarch64.whl \
    && rm ctranslate2-3.17.1-cp38-cp38-linux_aarch64.whl \
    && pip3 install --no-cache-dir \
        "wyoming-faster-whisper==${WHISPER_VERSION}" \
    \
    && apt-get purge -y --auto-remove \
        build-essential \
        python3-dev \
        wget \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /
COPY run.sh ./

EXPOSE 10300

ENTRYPOINT ["bash", "/run.sh"]
dwyschka commented 9 months ago

Thank you for your image @ab-tools, but on x86 it's not working - your base image is NVIDIA's ARM (L4T) image.

If somebody is searching for an alternative, I've built an image and uploaded it to Docker Hub: https://hub.docker.com/r/dwyschka/wyoming-whisper-cuda

Dockerfile:

FROM pytorch/pytorch:2.1.0-cuda11.8-cudnn8-runtime

# Install Whisper
WORKDIR /usr/src
ARG WHISPER_VERSION='1.0.1'

RUN \
    apt-get update \
    && apt-get install -y --no-install-recommends \
        build-essential \
        python3 \
        python3-dev \
        python3-pip \
        libomp-dev \
        wget \
    \
    && pip3 install --no-cache-dir -U \
        setuptools \
        wheel \
    && pip3 install --no-cache-dir \
        "wyoming-faster-whisper==${WHISPER_VERSION}" \
    \
    && apt-get purge -y --auto-remove \
        build-essential \
        python3-dev \
        wget \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /
COPY run.sh ./
ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/conda/lib/python3.10/site-packages/torch/lib
EXPOSE 10300

ENTRYPOINT ["bash", "/run.sh"]
chriszero commented 9 months ago

An alternative way to build an image for a Jetson CUDA device is using https://github.com/dusty-nv/jetson-containers

It builds the correct image based on the JetPack version. I'm using this on an NVIDIA Jetson Orin Nano NX 8GB with Ubuntu 22.04 and JetPack 6 / L4T 36.2.0.

Clone the repo on the Jetson device
Add a new package (directory) "wyoming-faster-whisper"
Create a Dockerfile inside the new directory

Dockerfile:

#---
# name: wyoming-faster-whisper
# group: audio
# depends: [cuda, cudnn, ctranslate2]
# requires: '>=34.1.0'
# docs: docs.md
#---

ARG BASE_IMAGE
FROM ${BASE_IMAGE}

# Install Whisper
WORKDIR /usr/src
ARG WHISPER_VERSION='1.0.1'

RUN \
    apt-get update \
    && apt-get install -y --no-install-recommends \
        build-essential \
        python3 \
        python3-dev \
        python3-pip \
        libomp-dev \
        wget \
        libgomp1 \
        libcublas-12-2 \
    \
    && pip3 install --no-cache-dir -U \
        setuptools \
        wheel \
    && pip3 install --no-cache-dir \
        "wyoming-faster-whisper==${WHISPER_VERSION}" \
    \
    && apt-get purge -y --auto-remove \
        build-essential \
        python3-dev \
        wget \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /

EXPOSE 10300

Build the image with:

build.sh --skip-tests=all wyoming-faster-whisper

Start with docker compose:

compose.yaml

services:
  whisper:
    container_name: whisper-cuda
    image: wyoming-faster-whisper:r36.2.0 #docker image ls
    command: python3 -m wyoming_faster_whisper --uri 'tcp://0.0.0.0:10300' --model small-int8 --language de --beam-size 5 --device cuda --data-dir /data
    volumes:
      - /data/whisper:/data
    restart: always
    ports:
      - 10300:10300
    runtime: nvidia
    deploy:
        resources:
          reservations:
            devices:
              - driver: nvidia
                count: 1
                capabilities: [gpu]

pimp1310 commented 7 months ago

Hi,

I have an amd64 architecture, running on OMV6 with CUDA and the NVIDIA drivers for Docker activated, and I want to use Whisper with the GPU. What's the correct way now?

Can I use any Whisper image for my amd64 architecture and just set "Nvidia" under ENVIRONMENT, plus the rest of the parameters (devices)?

Or must I build Whisper with CUDA support myself?

felixmartens commented 7 months ago

@pimp1310 I used https://github.com/Cheerpipe/faster-whisper-cuda-docker but changed the NVIDIA image to nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04. So the base image is from NVIDIA with CUDA built in, and faster-whisper is installed via Python.
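
So the only change to that repo's Dockerfile would be the base image line (a sketch; the rest of the Dockerfile stays as in the Cheerpipe repo):

# swap the FROM line for the CUDA runtime image named in the comment
FROM nvidia/cuda:11.8.0-cudnn8-runtime-ubuntu22.04
# ...remainder unchanged: install Python and faster-whisper as in the original Dockerfile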