faster-whisper-server is an OpenAI API-compatible transcription server which uses faster-whisper as its backend.
Features:
- OpenAI API compatible (works with the official OpenAI CLI and Python SDK).
- GPU (CUDA) and CPU support.
- Easy deployment with Docker or Docker Compose.
- Streaming support: transcriptions are returned in chunks as the audio is processed.
- Live transcription of microphone audio over a WebSocket.
Please create an issue if you find a bug, have a question, or have a feature suggestion.
See OpenAI API reference for more information.
- Audio file transcription via the POST /v1/audio/transcriptions endpoint.
  - faster-whisper-server also supports streaming transcriptions (and translations). This is useful when you want to process a large audio file and receive the transcription in chunks as it is processed, rather than waiting for the whole file to be transcribed. It works similarly to streamed chat messages from an LLM.
- Audio file translation via the POST /v1/audio/translations endpoint.
- Live audio transcription via the WS /v1/audio/transcriptions endpoint.
Using Docker
docker run --gpus=all --publish 8000:8000 --volume ~/.cache/huggingface:/root/.cache/huggingface fedirz/faster-whisper-server:latest-cuda
# or
docker run --publish 8000:8000 --volume ~/.cache/huggingface:/root/.cache/huggingface fedirz/faster-whisper-server:latest-cpu
Using Docker Compose
curl -sO https://raw.githubusercontent.com/fedirz/faster-whisper-server/master/compose.yaml
docker compose up --detach faster-whisper-server-cuda
# or
docker compose up --detach faster-whisper-server-cpu
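For orientation, the fetched compose.yaml defines services along these lines. This is only a sketch reconstructed from the docker run flags above (image, port, and Hugging Face cache volume, plus the standard Compose GPU reservation syntax); the project's own compose.yaml is the authoritative version.

```yaml
services:
  faster-whisper-server-cuda:
    image: fedirz/faster-whisper-server:latest-cuda
    ports:
      - 8000:8000
    volumes:
      # Cache downloaded models between container restarts
      - ~/.cache/huggingface:/root/.cache/huggingface
    deploy:
      resources:
        reservations:
          devices:
            # Equivalent of `docker run --gpus=all`
            - driver: nvidia
              count: all
              capabilities: [gpu]
```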
Using Kubernetes: tutorial
If you are looking for a step-by-step walkthrough, check out this YouTube video.
export OPENAI_API_KEY="cant-be-empty"
export OPENAI_BASE_URL=http://localhost:8000/v1/
openai api audio.transcriptions.create -m Systran/faster-distil-whisper-large-v3 -f audio.wav --response-format text
openai api audio.translations.create -m Systran/faster-distil-whisper-large-v3 -f audio.wav --response-format verbose_json
from openai import OpenAI

client = OpenAI(api_key="cant-be-empty", base_url="http://localhost:8000/v1/")

with open("audio.wav", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="Systran/faster-distil-whisper-large-v3", file=audio_file
    )
print(transcript.text)
# If `model` isn't specified, the default model is used
curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav"
curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.mp3"
curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav" -F "stream=true"
curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav" -F "model=Systran/faster-distil-whisper-large-v3"
# It's recommended to always specify the language, as that reduces transcription time
curl http://localhost:8000/v1/audio/transcriptions -F "file=@audio.wav" -F "language=en"
curl http://localhost:8000/v1/audio/translations -F "file=@audio.wav"
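The streaming request above can also be made from plain Python without extra dependencies. The sketch below builds the multipart form body by hand and yields response lines as they arrive; it assumes the server is running on localhost:8000 and that streamed chunks arrive as newline-delimited text (adjust the parsing to whatever framing your server version emits).

```python
import urllib.request
import uuid


def build_multipart(fields: dict[str, str], file_field: str,
                    filename: str, file_bytes: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body using only the stdlib."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; '
            f'name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; '
        f'name="{file_field}"; filename="{filename}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'.encode()
        + file_bytes + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def stream_transcription(
    path: str,
    url: str = "http://localhost:8000/v1/audio/transcriptions",
):
    """POST an audio file with stream=true and yield chunks as they arrive."""
    with open(path, "rb") as f:
        body, content_type = build_multipart(
            {"stream": "true", "language": "en"}, "file", path, f.read()
        )
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}
    )
    with urllib.request.urlopen(request) as response:
        for raw_line in response:
            line = raw_line.strip()
            if line:
                yield line.decode()
```

Usage: `for chunk in stream_transcription("audio.wav"): print(chunk)` — each chunk is printed as soon as the server emits it.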
Demo from the live-audio example:
https://github.com/fedirz/faster-whisper-server/assets/76551385/e334c124-af61-41d4-839c-874be150598f
Live transcription of audio from a microphone requires websocat to be installed.
ffmpeg -loglevel quiet -f alsa -i default -ac 1 -ar 16000 -f s16le - | websocat --binary ws://localhost:8000/v1/audio/transcriptions
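The ffmpeg flags above produce raw mono 16 kHz signed 16-bit little-endian PCM, which is what gets piped to the WebSocket. A quick way to sanity-check the data rate you are sending (a hypothetical helper for illustration, not part of the server):

```python
def pcm_bytes_per_second(sample_rate: int = 16000, channels: int = 1,
                         bytes_per_sample: int = 2) -> int:
    """Data rate of raw PCM matching the `-ac 1 -ar 16000 -f s16le` flags."""
    return sample_rate * channels * bytes_per_sample


# One second of audio in this format is 32 000 bytes,
# so a 100 ms chunk sent over the WebSocket is 3 200 bytes.
```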