Open kowshik24 opened 3 weeks ago
I use the pre-built Docker images from this repo: https://github.com/jim60105/docker-whisperX
@randyburden, thanks for sharing. I found the same repo. Do you have a Docker Hub repo for that?
@kowshik24, no, I don't use Docker Hub. I use the code below to pull in the Docker WhisperX image from the GitHub Container Registry, then create a new customized Docker image that preloads and caches the Pyannote models for offline use, and then upload that Docker image to Azure Container Services.
# Define optional arguments that indicate the OpenAI Whisper model size and language to use
ARG WHISPER_MODEL=medium
ARG LANG=en
# Get the base WhisperX Docker image (https://github.com/jim60105/docker-whisperX)
FROM ghcr.io/jim60105/whisperx:${WHISPER_MODEL}-${LANG}
# Define the required argument for the huggingface.co token used by Pyannote (diarization/speaker-recognition library)
ARG HUGGING_FACE_TOKEN
# Output argument value for debugging/inspecting (note: this writes the token to the build log)
RUN echo "Huggingface.co token: ${HUGGING_FACE_TOKEN}"
# Ensure the required argument was supplied
# (test -n "$VAR" returns false when the string is empty, which fails the build)
RUN test -n "$HUGGING_FACE_TOKEN" || (echo "HUGGING_FACE_TOKEN argument is required" && false)
# Preload and cache the Pyannote models so that the image can run offline
RUN python3 -c 'from pyannote.audio import Pipeline; pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-3.1", use_auth_token="'${HUGGING_FACE_TOKEN}'")'
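One caveat with the Dockerfile above: a value passed via `ARG` and used in a `RUN` step can show up in the image's build history. A minimal sketch of an alternative, assuming BuildKit is enabled, is to pass the token as a build secret instead. The secret id `hf_token` and the file name `hf_token.txt` are hypothetical names chosen for illustration and are not part of the original setup.

```dockerfile
# syntax=docker/dockerfile:1
ARG WHISPER_MODEL=medium
ARG LANG=en
FROM ghcr.io/jim60105/whisperx:${WHISPER_MODEL}-${LANG}
# Mount the token for this step only; "hf_token" is a hypothetical secret id.
# mode=0444 is an assumption that keeps the file readable if the base image runs as a non-root user.
RUN --mount=type=secret,id=hf_token,mode=0444 \
    HUGGING_FACE_TOKEN="$(cat /run/secrets/hf_token)" \
    python3 -c 'import os; from pyannote.audio import Pipeline; Pipeline.from_pretrained("pyannote/speaker-diarization-3.1", use_auth_token=os.environ["HUGGING_FACE_TOKEN"])'
```

This variant would be built with something like `docker build --secret id=hf_token,src=hf_token.txt -t <image-name> .`, where `hf_token.txt` contains only the token. The models are still cached in the image layer, but the token itself is not stored in it.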
I ran into many issues while building a Dockerfile for transcription and speaker diarization. Is there a Git repo available for that? Or are you planning to create a Dockerfile specifically for RunPod serverless?