Closed · andreaalloway closed 4 months ago
Thanks for opening your first issue here! Be sure to follow the relevant issue templates, or risk having this issue marked as invalid.
As a workaround, it appears I can do the following (see the one-shot version below). Log into the container:
docker exec -it faster-whisper /bin/bash
Install torch:
pip install torch --index-url https://download.pytorch.org/whl/cu121
Exit the container shell, then create a .bashrc file under the /config directory (vim is not installed in the container, so I used the host for this):
vim config/.bashrc
with the contents:
export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; import torch; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__) + ":" + os.path.dirname(torch.__file__) + "/lib")'`:$LD_LIBRARY_PATH
Then restart the container:
docker restart faster-whisper
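For anyone who wants to script this, a consolidated one-shot version of the same steps (untested sketch; it assumes the container is named faster-whisper and that HOME inside the container is /config, so /config/.bashrc gets sourced):

```bash
# Install torch inside the running container (CUDA 12.1 wheels)
docker exec faster-whisper pip install torch --index-url https://download.pytorch.org/whl/cu121

# Append the LD_LIBRARY_PATH export to /config/.bashrc without needing an
# editor inside the container; the quoted heredoc prevents host-side expansion
docker exec -i faster-whisper sh -c 'cat >> /config/.bashrc' <<'EOF'
export LD_LIBRARY_PATH=`python3 -c 'import os; import nvidia.cublas.lib; import nvidia.cudnn.lib; import torch; print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__) + ":" + os.path.dirname(torch.__file__) + "/lib")'`:$LD_LIBRARY_PATH
EOF

# Restart so the new environment takes effect
docker restart faster-whisper
```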
This worked for me too. Thank you for the suggestion.
This issue has been automatically marked as stale because it has not had recent activity. This might be due to missing feedback from OP. It will be closed if no further activity occurs. Thank you for your contributions.
I had this same issue. I'm no expert, but is this because the Dockerfile is installing libs for cu11 rather than cu12? (So it breaks for anyone using CUDA v12.) It would be nice if this could get fixed to avoid the need for the workaround suggested above.
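If you're unsure which side of the cu11/cu12 split you're on, a quick check (a sketch; nvidia-smi reports the highest CUDA version the installed driver supports, which is what matters for the cu11 vs cu12 wheels):

```bash
# Host: driver version and the max CUDA version it supports (shown in the header)
nvidia-smi

# Container: which cu11/cu12 Python packages are actually installed
docker exec faster-whisper pip list | grep -i -E 'cu11|cu12|ctranslate2|cudnn|cublas'
```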
upstream project wants cu11 iirc
It looks like upstream has switched the default recommendation to CUDA 12 (https://github.com/SYSTRAN/faster-whisper/commit/3d1de60ef3ce7d34f7c0ae6547f8a616aa060ac2), with the caveat that this may break some CUDA 11 setups. I don't think we can win on that one, because the same version of ctranslate2 won't support both 11 and 12, and I don't really want a) a 5GB+ image or b) two different branches for different versions.
Also, it looks like nvidia-cudnn-cu12 version 9+ has issues, so it's going to need pinning.
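For illustration, a pin along these lines (the exact constraint used in the image's Dockerfile may differ):

```bash
# Stay on the cuDNN 8.x series for the cu12 wheels, since 9.x reportedly has issues
pip install "nvidia-cudnn-cu12<9"
```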
Please try ghcr.io/linuxserver/lspipepr-faster-whisper:gpu-2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16 and see if it resolves your issues.
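To test the PR image, something like:

```bash
docker pull ghcr.io/linuxserver/lspipepr-faster-whisper:gpu-2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16
```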
This version appears to be working without the .bashrc workaround.
PR has been merged, new image should be built in the next ~30 mins.
[
{
"Id": "5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b",
"Created": "2024-05-19T23:19:00.278825074Z",
"Path": "/init",
"Args": [],
"State": {
"Status": "running",
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 31380,
"ExitCode": 0,
"Error": "",
"StartedAt": "2024-05-19T23:19:00.622553426Z",
"FinishedAt": "0001-01-01T00:00:00Z"
},
"Image": "sha256:d21f6ea99e039c4c747462439217435ae5dda8a05de3c9d36d9c3fdd9a77eadb",
"ResolvConfPath": "/var/lib/docker/containers/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b/resolv.conf",
"HostnamePath": "/var/lib/docker/containers/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b/hostname",
"HostsPath": "/var/lib/docker/containers/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b/hosts",
"LogPath": "/var/lib/docker/containers/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b/5a7daaf35afd582695e2a7334f4df4300568706c021a94dc36b8570f43fa578b-json.log",
"Name": "/faster-whisper",
"RestartCount": 0,
"Driver": "btrfs",
"Platform": "linux",
"MountLabel": "",
"ProcessLabel": "",
"AppArmorProfile": "",
"ExecIDs": null,
"HostConfig": {
"Binds": [
"/mnt/cache/appdata/faster-whisper:/config:rw"
],
"ContainerIDFile": "",
"LogConfig": {
"Type": "json-file",
"Config": {
"max-file": "1",
"max-size": "50m"
}
},
"NetworkMode": "br0.20",
"PortBindings": {},
"RestartPolicy": {
"Name": "no",
"MaximumRetryCount": 0
},
"AutoRemove": false,
"VolumeDriver": "",
"VolumesFrom": null,
"ConsoleSize": [
0,
0
],
"CapAdd": null,
"CapDrop": null,
"CgroupnsMode": "private",
"Dns": [
"10.0.20.1"
],
"DnsOptions": [],
"DnsSearch": [],
"ExtraHosts": null,
"GroupAdd": null,
"IpcMode": "private",
"Cgroup": "",
"Links": null,
"OomScoreAdj": 0,
"PidMode": "",
"Privileged": false,
"PublishAllPorts": false,
"ReadonlyRootfs": false,
"SecurityOpt": null,
"UTSMode": "",
"UsernsMode": "",
"ShmSize": 67108864,
**"Runtime": "nvidia",**
"Isolation": "",
"CpuShares": 0,
"Memory": 0,
"NanoCpus": 0,
"CgroupParent": "",
"BlkioWeight": 0,
"BlkioWeightDevice": [],
"BlkioDeviceReadBps": [],
"BlkioDeviceWriteBps": [],
"BlkioDeviceReadIOps": [],
"BlkioDeviceWriteIOps": [],
"CpuPeriod": 0,
"CpuQuota": 0,
"CpuRealtimePeriod": 0,
"CpuRealtimeRuntime": 0,
"CpusetCpus": "",
"CpusetMems": "",
"Devices": [],
"DeviceCgroupRules": null,
**"DeviceRequests": [
{
"Driver": "",
"Count": -1,
"DeviceIDs": null,
"Capabilities": [
[
"gpu"
]
],
"Options": {}
}
],**
"MemoryReservation": 0,
"MemorySwap": 0,
"MemorySwappiness": null,
"OomKillDisable": null,
"PidsLimit": null,
"Ulimits": null,
"CpuCount": 0,
"CpuPercent": 0,
"IOMaximumIOps": 0,
"IOMaximumBandwidth": 0,
"MaskedPaths": [
"/proc/asound",
"/proc/acpi",
"/proc/kcore",
"/proc/keys",
"/proc/latency_stats",
"/proc/timer_list",
"/proc/timer_stats",
"/proc/sched_debug",
"/proc/scsi",
"/sys/firmware",
"/sys/devices/virtual/powercap"
],
"ReadonlyPaths": [
"/proc/bus",
"/proc/fs",
"/proc/irq",
"/proc/sys",
"/proc/sysrq-trigger"
]
},
"GraphDriver": {
"Data": null,
"Name": "btrfs"
},
"Mounts": [
{
"Type": "bind",
"Source": "/mnt/cache/appdata/faster-whisper",
"Destination": "/config",
"Mode": "rw",
"RW": true,
"Propagation": "rprivate"
}
],
"Config": {
"Hostname": "5a7daaf35afd",
"Domainname": "",
"User": "",
"AttachStdin": false,
"AttachStdout": false,
"AttachStderr": false,
"ExposedPorts": {
"10300/tcp": {}
},
"Tty": false,
"OpenStdin": false,
"StdinOnce": false,
"Env": [
"PUID=99",
"UMASK=022",
"HOST_OS=Unraid",
"HOST_HOSTNAME=zuse",
"HOST_CONTAINERNAME=faster-whisper",
"TCP_PORT_10300=10300",
"WHISPER_MODEL=tiny-int8",
"PGID=100",
"TZ=America/Chicago",
"WHISPER_BEAM=1",
"WHISPER_LANG=en",
**"NVIDIA_DRIVER_CAPABILITIES'=gpu",**
**"NVIDIA_VISIBLE_DEVICES=GPU-4fcc04e7-23a5-2aa8-96e5-76facc3844bc",**
"PATH=/lsiopy/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
"HOME=/config",
"LANGUAGE=en_US.UTF-8",
"LANG=en_US.UTF-8",
"TERM=xterm",
"S6_CMD_WAIT_FOR_SERVICES_MAXTIME=0",
"S6_VERBOSITY=1",
"S6_STAGE2_HOOK=/docker-mods",
"VIRTUAL_ENV=/lsiopy",
"LSIO_FIRST_PARTY=true"
],
"Cmd": null,
"Image": "ghcr.io/linuxserver/lspipepr-faster-whisper:gpu-2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16",
"Volumes": {
"/config": {}
},
"WorkingDir": "/",
"Entrypoint": [
"/init"
],
"OnBuild": null,
"Labels": {
"build_version": "Linuxserver.io version:- 2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16 Build-date:- 2024-05-19T15:21:39+00:00",
"maintainer": "thespad",
"net.unraid.docker.icon": "https://raw.githubusercontent.com/linuxserver/docker-templates/master/linuxserver.io/img/linuxserver-ls-logo.png",
"net.unraid.docker.managed": "dockerman",
"org.opencontainers.image.authors": "linuxserver.io",
"org.opencontainers.image.created": "2024-05-19T15:21:39+00:00",
"org.opencontainers.image.description": "[Faster-whisper](https://github.com/SYSTRAN/faster-whisper) is a reimplementation of OpenAI's Whisper model using CTranslate2, which is a fast inference engine for Transformer models. This container provides a Wyoming protocol server for faster-whisper.",
"org.opencontainers.image.documentation": "https://docs.linuxserver.io/images/docker-faster-whisper",
"org.opencontainers.image.licenses": "GPL-3.0-only",
"org.opencontainers.image.ref.name": "4db4a97b3e161472da9c546387db12b39d05a816",
"org.opencontainers.image.revision": "4db4a97b3e161472da9c546387db12b39d05a816",
"org.opencontainers.image.source": "https://github.com/linuxserver/docker-faster-whisper",
"org.opencontainers.image.title": "Faster-whisper",
"org.opencontainers.image.url": "https://github.com/linuxserver/docker-faster-whisper/packages",
"org.opencontainers.image.vendor": "linuxserver.io",
"org.opencontainers.image.version": "2.0.0-ls18",
"swag": "enable",
"swag_port": "10300",
"swag_url": "fw.theoswalds.com"
}
},
"NetworkSettings": {
"Bridge": "",
"SandboxID": "4a8532f3a95ce7242da0cb9c396165aeb2078acd1a812163577270cb16bb172c",
"HairpinMode": false,
"LinkLocalIPv6Address": "",
"LinkLocalIPv6PrefixLen": 0,
"Ports": {},
"SandboxKey": "/var/run/docker/netns/4a8532f3a95c",
"SecondaryIPAddresses": null,
"SecondaryIPv6Addresses": null,
"EndpointID": "",
"Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"IPAddress": "",
"IPPrefixLen": 0,
"IPv6Gateway": "",
"MacAddress": "",
"Networks": {
"br0.20": {
"IPAMConfig": null,
"Links": null,
"Aliases": [
"5a7daaf35afd"
],
"NetworkID": "083fbe85ffc8034005057ad6b3b67ba621dc6c0601e0afbad3dd37bebc35bc6e",
"EndpointID": "db5e2a20d843e010245619196d74af94db24ef91b28c529734df2cefcfbb8635",
"Gateway": "10.0.20.1",
"IPAddress": "10.0.20.27",
"IPPrefixLen": 24,
"IPv6Gateway": "",
"GlobalIPv6Address": "",
"GlobalIPv6PrefixLen": 0,
"MacAddress": "02:42:0a:00:14:1b",
"DriverOpts": null
}
}
}
}
]
Error
INFO:faster_whisper:Processing audio with duration 00:01.260
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='wyoming event handler' coro=<AsyncEventHandler.run() done, defined at /lsiopy/lib/python3.10/site-packages/wyoming/server.py:28> exception=RuntimeError('cuBLAS failed with status CUBLAS_STATUS_ALLOC_FAILED')>
Traceback (most recent call last):
File "/lsiopy/lib/python3.10/site-packages/wyoming/server.py", line 35, in run
if not (await self.handle_event(event)):
File "/lsiopy/lib/python3.10/site-packages/wyoming_faster_whisper/handler.py", line 70, in handle_event
text = " ".join(segment.text for segment in segments)
File "/lsiopy/lib/python3.10/site-packages/wyoming_faster_whisper/handler.py", line 70, in <genexpr>
text = " ".join(segment.text for segment in segments)
File "/lsiopy/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 511, in generate_segments
encoder_output = self.encode(segment)
File "/lsiopy/lib/python3.10/site-packages/faster_whisper/transcribe.py", line 762, in encode
return self.model.encode(features, to_cpu=to_cpu)
RuntimeError: cuBLAS failed with status CUBLAS_STATUS_ALLOC_FAILED
I'm running into an issue with the latest version and ghcr.io/linuxserver/lspipepr-faster-whisper:gpu-2.0.0-pkg-c801351f-dev-4db4a97b3e161472da9c546387db12b39d05a816-pr-16 as well. I've marked all of the nvidia flags I'm using in bold in the inspect output above. Any thoughts?
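For what it's worth, CUBLAS_STATUS_ALLOC_FAILED generally means cuBLAS couldn't allocate GPU resources (most often device memory), so it may be worth checking what else is holding VRAM while the model loads, e.g.:

```bash
# Per-process GPU memory usage (look for other processes/containers holding VRAM)
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv

# Watch overall GPU memory while reproducing the error
watch -n 1 nvidia-smi
```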
I'm also getting this error with the tool I'm developing that uses this image.
docker run -it --rm --name subplz --gpus all -v /mnt/d/sync:/sync -v /mnt/d/SyncCache:/SyncCache subplz:latest sync -d "/sync/変な家/" --rerun
🖥️ We're using cuda. Results will be faster using Cuda with GPU than just CPU. Lot's of RAM needed no matter what.
📝 Transcribing...
Could not load library libcudnn_ops_infer.so.8. Error: libcudnn_ops_infer.so.8: cannot open shared object file: No such file or directory
I had this error locally on the host as well, and had to set LD_LIBRARY_PATH in my environment to get it working. I see the workaround above, but the thread also says it's fixed. Is there any reason I still can't run faster-whisper commands?
Update: Ran this inside my docker container
>>> import os
>>> import nvidia.cublas.lib
>>> import nvidia.cudnn.lib
>>>
>>> print(os.path.dirname(nvidia.cublas.lib.__file__) + ":" + os.path.dirname(nvidia.cudnn.lib.__file__))
/lsiopy/lib/python3.10/site-packages/nvidia/cublas/lib:/lsiopy/lib/python3.10/site-packages/nvidia/cudnn/lib
Then copied the output to my dockerfile to get things working. It's basically the same workaround as before.
ENV LD_LIBRARY_PATH="/lsiopy/lib/python3.10/site-packages/nvidia/cublas/lib:/lsiopy/lib/python3.10/site-packages/nvidia/cudnn/lib"
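If you'd rather not bake it into an image, the same path can presumably be injected at runtime instead (a sketch; the paths match the output above but are tied to the image's Python 3.10 layout and may change between releases):

```bash
docker run -d --name faster-whisper --gpus all \
  -e LD_LIBRARY_PATH="/lsiopy/lib/python3.10/site-packages/nvidia/cublas/lib:/lsiopy/lib/python3.10/site-packages/nvidia/cudnn/lib" \
  -v /path/to/config:/config \
  lscr.io/linuxserver/faster-whisper:gpu
```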
This issue is locked due to inactivity
Is there an existing issue for this?
Current Behavior
I'm using lscr.io/linuxserver/faster-whisper:gpu and encountering an issue where any Wyoming prompt results in the error shown in the logs below:
It appears to be related to this behavior in faster-whisper https://github.com/SYSTRAN/faster-whisper/issues/516
Expected Behavior
faster-whisper is able to use the GPU to parse speech to text
Steps To Reproduce
1. Set up the faster-whisper docker container per below (a reconstructed docker run sketch follows this list)
2. Added faster-whisper to Home Assistant using the Wyoming protocol
3. Set up a Raspberry Pi 3+ with wyoming-satellite per https://github.com/rhasspy/wyoming-satellite/blob/master/docs/tutorial_installer.md
4. Prompts are responded to (by the local wyoming-wakeword.service), but the logs on the docker container indicate an error
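For reference, a minimal docker run roughly matching the inspect output posted earlier in this thread (illustrative only; adjust the model, paths, and timezone to your setup):

```bash
docker run -d --name faster-whisper \
  --gpus all \
  -e PUID=99 -e PGID=100 -e TZ=America/Chicago \
  -e WHISPER_MODEL=tiny-int8 -e WHISPER_BEAM=1 -e WHISPER_LANG=en \
  -p 10300:10300 \
  -v /mnt/cache/appdata/faster-whisper:/config \
  lscr.io/linuxserver/faster-whisper:gpu
```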
Logs for docker container lscr.io/linuxserver/faster-whisper:gpu
Logs for wyoming-satellite.service
Environment