linuxserver / docker-faster-whisper

GNU General Public License v3.0

[BUG] [GPU] Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory #15

Closed: andreaalloway closed this issue 4 months ago

andreaalloway commented 5 months ago

Is there an existing issue for this?

Current Behavior

I'm using the lscr.io/linuxserver/faster-whisper:gpu image, and any Wyoming prompt results in the following error:

Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory

It appears to be related to this behavior in faster-whisper: https://github.com/SYSTRAN/faster-whisper/issues/516
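
A quick way to check whether the cuDNN 8 libraries exist anywhere in the image (a diagnostic sketch; the container name matches the compose file below):

```bash
# Search the container's filesystem for the library the loader can't find.
docker exec faster-whisper find / -name 'libcudnn_ops_infer*' 2>/dev/null
```

If this prints nothing, the library isn't shipped in the image at all.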

Expected Behavior

faster-whisper is able to use the GPU to transcribe speech to text

Steps To Reproduce

1. Set up the faster-whisper docker container per the compose file below.
2. Add faster-whisper to Home Assistant using the Wyoming protocol.
3. Set up a Raspberry Pi 3+ with wyoming-satellite per https://github.com/rhasspy/wyoming-satellite/blob/master/docs/tutorial_installer.md
4. Prompts are responded to (via the local wyoming-wakeword.service), but the logs on the docker container indicate an error.

Logs for docker container lscr.io/linuxserver/faster-whisper:gpu

```bash
INFO:faster_whisper:Processing audio with duration 00:15.000
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
```

Logs for wyoming-satellite.service

```bash
run[807]: WARNING:root:Event(type='error', data={'text': 'speech-to-text failed', 'code': 'stt-stream-failed'}, payload=None)
```

Environment

- OS: CentOS Stream 8 using the kernel-ml module
  Linux 6.5.6-1.el8.elrepo.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Oct  6 17:10:59 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
- How docker service was installed:

```bash
yum install -y yum-utils
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum erase podman buildah
yum install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin docker-compose
wget 'http://international.download.nvidia.com/XFree86/Linux-x86_64/550.67/NVIDIA-Linux-x86_64-550.67.run'
chmod +x NVIDIA-Linux-x86_64-550.67.run
./NVIDIA-Linux-x86_64-550.67.run
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
yum install -y nvidia-container-toolkit
nvidia-ctk runtime configure --runtime=docker
```
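
To confirm the runtime registration took effect, a couple of sanity checks can be run (the CUDA image tag here is only an example):

```bash
# Check that docker knows about the nvidia runtime...
docker info --format '{{.Runtimes}}'
# ...and that a GPU container can actually see the card.
docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```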

```bash
nvidia-smi
Sat Apr 13 21:29:22 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.67                 Driver Version: 550.67         CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1080        Off |   00000000:05:00.0 Off |                  N/A |
|  0%   31C    P8              8W /  180W |    6487MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A   1731864      C   /app/.venv/bin/python                        3244MiB |
|    0   N/A  N/A   1737066      C   python3                                      3240MiB |
+-----------------------------------------------------------------------------------------+
```

### CPU architecture

x86-64

### Docker creation

```yaml
version: '3.8'
services:
  faster-whisper:
    image: lscr.io/linuxserver/faster-whisper:gpu
    container_name: faster-whisper
    restart: always
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    environment:
      - PUID=<REDACTED>
      - PGID=<REDACTED>
      - TZ=<REDACTED>
      - WHISPER_MODEL=medium
      - WHISPER_BEAM=1 #optional
      - WHISPER_LANG=en #optional
    volumes:
      - /path/to/docker/whisper/config/:/config
    ports:
      - 10300:10300
    runtime: nvidia
    networks:
      swag_default:
```

### Container logs

```bash
[custom-init] No custom files found, skipping...
[2024-04-13 21:18:16.336] [ctranslate2] [thread 153] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
INFO:__main__:Ready
[ls.io-init] done.
INFO:faster_whisper:Processing audio with duration 00:15.000
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
[2024-04-13 21:21:04.052] [ctranslate2] [thread 223] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
INFO:__main__:Ready
```
github-actions[bot] commented 5 months ago

Thanks for opening your first issue here! Be sure to follow the relevant issue templates, or risk having this issue marked as invalid.

andreaalloway commented 5 months ago

As a workaround, it appears that I can do the following. Log into the container using:

```bash
docker exec -it faster-whisper /bin/bash
```

Install torch:

```bash
pip install torch --index-url https://download.pytorch.org/whl/cu121
```

Exit the container shell.

Create a .bashrc file under the /config directory (vim is not installed in the container, so I used the host for this):

```bash
vim config/.bashrc
```

with the contents:

```bash
export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; import torch; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__) + ":" + os.path.dirname(torch.__file__) +"/lib")'`:$LD_LIBRARY_PATH
```

Then restarted my container:

```bash
docker restart faster-whisper
```
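
To sanity-check the workaround without a restart, the same paths can be injected for a one-off test (a sketch, assuming the pip install above landed in the image's /lsiopy venv; `get_cuda_device_count` is part of the ctranslate2 Python API):

```bash
# One-off check: point the loader at the pip-installed CUDA libs and ask
# ctranslate2 whether it can see the GPU (a non-zero count means success).
docker exec \
  -e LD_LIBRARY_PATH=/lsiopy/lib/python3.10/site-packages/nvidia/cublas/lib:/lsiopy/lib/python3.10/site-packages/nvidia/cudnn/lib \
  faster-whisper python3 -c 'import ctranslate2; print(ctranslate2.get_cuda_device_count())'
```
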
PontyJohnty commented 5 months ago

> As a workaround, it appears that I can do the following. [...]

This worked for me too. Thank you for the suggestion.

LinuxServer-CI commented 4 months ago

This issue has been automatically marked as stale because it has not had recent activity. This might be due to missing feedback from OP. It will be closed if no further activity occurs. Thank you for your contributions.

LoghamLogan commented 4 months ago

I had this same issue. I'm no expert, but is this because the Dockerfile is installing libs for cu11 rather than cu12? (So it breaks for anyone using CUDA v12.) Would be nice if this could get fixed to avoid the need for the workaround suggested above.

aptalca commented 4 months ago

upstream project wants cu11 iirc

thespad commented 4 months ago

It looks like upstream has switched the default recommendation to CUDA 12 (https://github.com/SYSTRAN/faster-whisper/commit/3d1de60ef3ce7d34f7c0ae6547f8a616aa060ac2), with the caveat that this may break some CUDA 11 setups. I don't think we can win on that, because the same version of ctranslate2 won't support both 11 and 12, and I don't really want a) a 5GB+ image or b) two different branches for different versions.
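
For reference, which cuDNN major version the bundled ctranslate2 actually links against can be checked from inside the container (a sketch; it assumes the wheel's shared objects live under the /lsiopy venv):

```bash
# List the cuDNN/cuBLAS sonames ctranslate2 was built against.
docker exec faster-whisper /bin/bash -c \
  'find /lsiopy -name "libctranslate2*.so*" | xargs -r ldd | grep -Ei "cudnn|cublas"'
```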

thespad commented 4 months ago

Also, it looks like nvidia-cudnn-cu12 version 9+ has issues, so it's going to need pinning.
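
Something along these lines (the version spec is illustrative; the exact pin will be whatever lands in the fix):

```bash
# Sketch: keep cuDNN on the 8.x series that the current ctranslate2 expects.
pip install "nvidia-cudnn-cu12<9"
```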

thespad commented 4 months ago

Please try ghcr.io/linuxserver/lspipepr-faster-whisper:gpu-2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16 and see if it resolves your issues.

andreaalloway commented 4 months ago

> Please try ghcr.io/linuxserver/lspipepr-faster-whisper:gpu-2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16 and see if it resolves your issues.

This version appears to be working without the .bashrc work around

thespad commented 4 months ago

PR has been merged, new image should be built in the next ~30 mins.
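
Once it's published, picking it up is the usual pull and recreate, e.g.:

```bash
docker compose pull faster-whisper
docker compose up -d faster-whisper
```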

richardoswald commented 4 months ago

Output of docker inspect faster-whisper:

```json
[
    {
        "Id": "5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b",
        "Created": "2024-05-19T23:19:00.278825074Z",
        "Path": "/init",
        "Args": [],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 31380,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2024-05-19T23:19:00.622553426Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:d21f6ea99e039c4c747462439217435ae5dda8a05de3c9d36d9c3fdd9a77eadb",
        "ResolvConfPath": "/var/lib/docker/containers/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b/hostname",
        "HostsPath": "/var/lib/docker/containers/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b/hosts",
        "LogPath": "/var/lib/docker/containers/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b-json.log",
        "Name": "/faster-whisper",
        "RestartCount": 0,
        "Driver": "btrfs",
        "Platform": "linux",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [
                "/mnt/cache/appdata/faster-whisper:/config:rw"
            ],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {
                    "max-file": "1",
                    "max-size": "50m"
                }
            },
            "NetworkMode": "br0.20",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "no",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "ConsoleSize": [
                0,
                0
            ],
            "CapAdd": null,
            "CapDrop": null,
            "CgroupnsMode": "private",
            "Dns": [
                "10.0.20.1"
            ],
            "DnsOptions": [],
            "DnsSearch": [],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "private",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            **"Runtime": "nvidia",**
            "Isolation": "",
            "CpuShares": 0,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": [],
            "BlkioDeviceReadBps": [],
            "BlkioDeviceWriteBps": [],
            "BlkioDeviceReadIOps": [],
            "BlkioDeviceWriteIOps": [],
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [],
            "DeviceCgroupRules": null,
            **"DeviceRequests": [
                {
                    "Driver": "",
                    "Count": -1,
                    "DeviceIDs": null,
                    "Capabilities": [
                        [
                            "gpu"
                        ]
                    ],
                    "Options": {}
                }
            ],**
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": null,
            "OomKillDisable": null,
            "PidsLimit": null,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0,
            "MaskedPaths": [
                "/proc/asound",
                "/proc/acpi",
                "/proc/kcore",
                "/proc/keys",
                "/proc/latency_stats",
                "/proc/timer_list",
                "/proc/timer_stats",
                "/proc/sched_debug",
                "/proc/scsi",
                "/sys/firmware",
                "/sys/devices/virtual/powercap"
            ],
            "ReadonlyPaths": [
                "/proc/bus",
                "/proc/fs",
                "/proc/irq",
                "/proc/sys",
                "/proc/sysrq-trigger"
            ]
        },
        "GraphDriver": {
            "Data": null,
            "Name": "btrfs"
        },
        "Mounts": [
            {
                "Type": "bind",
                "Source": "/mnt/cache/appdata/faster-whisper",
                "Destination": "/config",
                "Mode": "rw",
                "RW": true,
                "Propagation": "rprivate"
            }
        ],
        "Config": {
            "Hostname": "5a7daaf35afd",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "ExposedPorts": {
                "10300/tcp": {}
            },
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "PUID=99",
                "UMASK=022",
                "HOST_OS=Unraid",
                "HOST_HOSTNAME=zuse",
                "HOST_CONTAINERNAME=faster-whisper",
                "TCP_PORT_10300=10300",
                "WHISPER_MODEL=tiny-int8",
                "PGID=100",
                "TZ=America/Chicago",
                "WHISPER_BEAM=1",
                "WHISPER_LANG=en",
                **"NVIDIA_DRIVER_CAPABILITIES'=gpu",**
                **"NVIDIA_VISIBLE_DEVICES=GPU-4fcc04e7-23a5-2aa8-96e5-76facc3844bc",**
                "PATH=/lsiopy/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "HOME=/config",
                "LANGUAGE=en_US.UTF-8",
                "LANG=en_US.UTF-8",
                "TERM=xterm",
                "S6_CMD_WAIT_FOR_SERVICES_MAXTIME=0",
                "S6_VERBOSITY=1",
                "S6_STAGE2_HOOK=/docker-mods",
                "VIRTUAL_ENV=/lsiopy",
                "LSIO_FIRST_PARTY=true"
            ],
            "Cmd": null,
            "Image": "ghcr.io/linuxserver/lspipepr-faster-whisper:gpu-2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16",
            "Volumes": {
                "/config": {}
            },
            "WorkingDir": "/",
            "Entrypoint": [
                "/init"
            ],
            "OnBuild": null,
            "Labels": {
                "build_version": "Linuxserver.io version:- 2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16 Build-date:- 2024-05-19T15:21:39+00:00",
                "maintainer": "thespad",
                "net.unraid.docker.icon": "https://raw.githubusercontent.com/linuxserver/docker-templates/master/linuxserver.io/img/linuxserver-ls-logo.png",
                "net.unraid.docker.managed": "dockerman",
                "org.opencontainers.image.authors": "linuxserver.io",
                "org.opencontainers.image.created": "2024-05-19T15:21:39+00:00",
                "org.opencontainers.image.description": "[Faster-whisper](https://github.com/SYSTRAN/faster-whisper) is a reimplementation of OpenAI's Whisper model using CTranslate2, which is a fast inference engine for Transformer models. This container provides a Wyoming protocol server for faster-whisper.",
                "org.opencontainers.image.documentation": "https://docs.linuxserver.io/images/docker-faster-whisper",
                "org.opencontainers.image.licenses": "GPL-3.0-only",
                "org.opencontainers.image.ref.name": "4db4a97b3e161472da9c546387db12b39d05a816",
                "org.opencontainers.image.revision": "4db4a97b3e161472da9c546387db12b39d05a816",
                "org.opencontainers.image.source": "https://github.com/linuxserver/docker-faster-whisper",
                "org.opencontainers.image.title": "Faster-whisper",
                "org.opencontainers.image.url": "https://github.com/linuxserver/docker-faster-whisper/packages",
                "org.opencontainers.image.vendor": "linuxserver.io",
                "org.opencontainers.image.version": "2.0.0-ls18",
                "swag": "enable",
                "swag_port": "10300",
                "swag_url": "fw.theoswalds.com"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "4a8532f3a95ce7242da0cb9c396165aeb2078acd1a812163577270cb16bb172c",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/4a8532f3a95c",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "br0.20": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": [
                        "5a7daaf35afd"
                    ],
                    "NetworkID": "083fbe85ffc8034005057ad6b3b67ba621dc6c0601e0afbad3dd37bebc35bc6e",
                    "EndpointID": "db5e2a20d843e010245619196d74af94db24ef91b28c529734df2cefcfbb8635",
                    "Gateway": "10.0.20.1",
                    "IPAddress": "10.0.20.27",
                    "IPPrefixLen": 24,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": "02:42:0a:00:14:1b",
                    "DriverOpts": null
                }
            }
        }
    }
]
```

Error:

```bash
INFO:faster_whisper:Processing audio with duration 00:01.260
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='wyoming event handler' coro=<AsyncEventHandler.run() done, defined at /lsiopy/lib/python3.10/site-packages/wyoming/server.py:28> exception=RuntimeError('cuBLAS failed with status CUBLAS_STATUS_ALLOC_FAILED')>
Traceback (most recent call last):
  File "/lsiopy/lib/python3.10/site-packages/wyoming/server.py", line 35, in run
    if not (await self.handle_event(event)):
  File "/lsiopy/lib/python3.10/site-packages/wyoming_faster_whisper/handler.py", line 70, in handle_event
    text = " ".join(segment.text for segment in segments)
  File "/lsiopy/lib/python3.10/site-packages/wyoming_faster_whisper/handler.py", line 70, in <genexpr>
    text = " ".join(segment.text for segment in segments)
  File "/lsiopy/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 511, in generate_segments
    encoder_output = self.encode(segment)
  File "/lsiopy/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 762, in encode
    return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: cuBLAS failed with status CUBLAS_STATUS_ALLOC_FAILED
```

I'm running into an issue with both the latest version and ghcr.io/linuxserver/lspipepr-faster-whisper:gpu-2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16. I've marked the nvidia-related settings I'm using with ** in the inspect output above. Any thoughts?

kanjieater commented 4 months ago

I'm also getting this error with the tool I'm developing using this image.

```bash
docker run -it --rm --name subplz --gpus all -v /mnt/d/sync:/sync -v /mnt/d/SyncCache:/SyncCache subplz:latest sync -d "/sync/変な家/" --rerun
🖥️  We're using cuda. Results will be faster using Cuda with GPU than just CPU. Lot's of RAM needed no matter what.
📝 Transcribing...
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
```

I had this error locally on the host as well, and had to add LD_LIBRARY_PATH to my environment variables to get it working. I see the workaround above, but this issue also says it's fixed. Is there any reason I still can't run faster-whisper commands?

Update: I ran this inside my docker container:

```python
>>> import os
>>> import nvidia.cublas.lib
>>> import nvidia.cudnn.lib
>>>
>>> print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))
/lsiopy/lib/python3.10/site-packages/nvidia/cublas/lib:/lsiopy/lib/python3.10/site-packages/nvidia/cudnn/lib
```

Then I copied the output into my Dockerfile to get things working. It's basically the same workaround as before: `ENV LD_LIBRARY_PATH="/lsiopy/lib/python3.10/site-packages/nvidia/cublas/lib:/lsiopy/lib/python3.10/site-packages/nvidia/cudnn/lib"`
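
For a one-off container, the same path can also be passed at run time instead of baking a derived image (a sketch using the linuxserver image's paths):

```bash
# Same workaround without a custom Dockerfile: inject the path via -e.
docker run --rm --gpus all \
  -e LD_LIBRARY_PATH="/lsiopy/lib/python3.10/site-packages/nvidia/cublas/lib:/lsiopy/lib/python3.10/site-packages/nvidia/cudnn/lib" \
  lscr.io/linuxserver/faster-whisper:gpu
```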

github-actions[bot] commented 3 months ago

This issue is locked due to inactivity