leoguillaume opened this issue 2 months ago
You're mistaken, you're not supposed to put "Systran" in the model name, as Systran is just the name of the repo of the guy who makes faster-whisper.
I have this in my docker compose:
```yaml
- PRELOAD_MODELS=["large-v3"]
```
It's probably worth adding an example in the yaml file
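Something like this, for instance (just a sketch of where the variable would go; everything other than the environment entry is omitted, and the service name is a placeholder):

```yaml
services:
  faster-whisper-server-cuda:
    environment:
      - PRELOAD_MODELS=["large-v3"]
```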
I've just rebuilt the image directly from the repository and it works perfectly; there must be a difference between the main branch and the cuda-latest tag.
For example, with ["large-v3"] and the freshly built local image: the error is the classic Hugging Face one, since large-v3 is not a known model ID on HF.
With the same image but with ["Systran/faster-whisper-large-v3", "Systran/faster-distil-whisper-large-v3"]:
That works :) Can you push an image with the latest code version, maybe?
I'm not the owner of this repo so I'll leave that up to them :)
I'm experiencing the same issue. Have you been able to find a solution for it?
```yaml
environment:
  - PRELOAD_MODELS=["Systran/faster-whisper-medium"]
```
works for me
Which image tag do you use?
```yaml
services:
  faster-whisper-server-cuda:
    image: fedirz/faster-whisper-server:latest-cuda
    build:
      dockerfile: Dockerfile.cuda
      context: .
      platforms:
        - linux/amd64
        - linux/arm64
    restart: unless-stopped
    ports:
      - 8000:8000
    environment:
      - PRELOAD_MODELS=["Systran/faster-whisper-medium"]
    volumes:
      - hugging_face_cache:/root/.cache/huggingface
    develop:
      watch:
        - path: faster_whisper_server
          action: rebuild
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ['1']
              capabilities: ["gpu"]
volumes:
  hugging_face_cache:
```
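If preloading works with this setup, the model files should end up in the hugging_face_cache volume (mounted at /root/.cache/huggingface), and the download should show up in the container logs at startup; that's at least one way to check.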
Well I'm definitely encountering this issue now. It happened when I switched to large v3 but might have nothing to do with that since reusing my previous config does not seem to preload either.
So it seems to have broken recently.
Here's my compose content where I added comments.
```yaml
faster-whisper-server-cuda:
  image: fedirz/faster-whisper-server:latest-cuda
  build:
    dockerfile: Dockerfile.cuda
    context: .
    platforms:
      - linux/amd64
  volumes:
    - /home/root/.cache/huggingface:/root/.cache/huggingface
  restart: unless-stopped
  ports:
    - 8001:8001
  environment:
    - UVICORN_PORT=8001
    - ENABLE_UI=false
    - MIN_DURATION=1
    # default TTL is 300 (5min), -1 to disable, 0 to unload directly, 43200=12h
    - WHISPER__TTL=43200
    - WHISPER__INFERENCE_DEVICE=cuda
    - WHISPER__COMPUTE_TYPE=int8
    - WHISPER__MODEL=deepdml/faster-whisper-large-v3-turbo-ct2 # works (finds the right model)
    - PRELOAD_MODELS=["deepdml/faster-whisper-large-v3-turbo-ct2"] # doesn't work (no preloading)
    # - PRELOAD_MODELS=["faster-whisper-large-v3-turbo-ct2"] # doesn't work either
    # Used to work but not anymore
    # - WHISPER__MODEL=large-v3
    # - PRELOAD_MODELS=["large-v3"]
  develop:
    watch:
      - path: faster_whisper_server
        action: rebuild
  deploy:
    resources:
      reservations:
        devices:
          - capabilities: ["gpu"]
  network_mode: host
  pull_policy: always
```
(Very sorry for bothering you @fedirz, but because this issue was closed in the past, I'm afraid you might miss it when catching up, so I'm humbly notifying you and asking to reopen it just in case. Of course, do what you want and keep it closed if that's how you work :))
- WHISPER__INFERENCE_DEVICE=cuda
Same for me, preloading models doesn't work. It's not that big of a deal, but it would still make transcribing faster...
With a local deployment, the PRELOAD_MODELS config variable works perfectly:
But in a Docker Compose setup it does not:
The Docker Compose file:
I tried different types of quotes:
The models are not downloaded to my volume or anywhere else. Any ideas? Thanks in advance.
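For what it's worth, the brackets in the value can interact with YAML parsing depending on how the environment section is written. The snippets below are only illustrative (the model ID here is just an example taken from earlier in this thread, not necessarily what I tried):

```yaml
# list syntax: the whole entry stays one plain string, brackets included
environment:
  - PRELOAD_MODELS=["Systran/faster-whisper-medium"]
```

```yaml
# map syntax: quote the value, otherwise YAML parses the brackets as a list
environment:
  PRELOAD_MODELS: '["Systran/faster-whisper-medium"]'
```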