MahmoudAshraf97 / whisper-diarization

Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
BSD 2-Clause "Simplified" License
3.7k stars, 326 forks

Docker build #122

Open meonkeys opened 1 year ago

meonkeys commented 1 year ago

Just thought it would be handy to have a Docker image for this tool. I've been unable to get it working so far but I'll keep trying. If anyone else has it running in Docker, please share.

meonkeys commented 1 year ago

I got an image built. It's not clean enough for a pull request but I'll share what I've got anyway. Maybe someone else can pick this up and contribute it (assuming the maintainers want it).

I'm just creating a Dockerfile in a working copy (local clone) of this repository (HEAD at 2bdffc6b6e6e0d9ee8632dabf5009e995b31028d) and building with Docker. Here's the Dockerfile:

# FIXME: Makes a huge image.
# TODO: Optimize with a multi-stage build, perhaps also using venv.

# Pin to 3.10-bookworm to get Python 3.10
# because https://github.com/MahmoudAshraf97/whisper-diarization/issues/90
FROM python:3.10-bookworm

ARG WD_USER=joe
ARG WD_UID=1000
ARG WD_GROUP=joe
ARG WD_GID=1000

# We rarely see a full upgrade in a Dockerfile. Why?
# && apt-get --assume-yes dist-upgrade \
RUN apt-get update \
  && apt-get --assume-yes --no-install-recommends install \
  cython3 \
  ffmpeg \
  unzip \
  wget \
  && rm -rf /var/lib/apt/lists/*

WORKDIR /usr/src/app

COPY . .

RUN addgroup --gid $WD_GID $WD_GROUP \
  && adduser --uid $WD_UID --gid $WD_GID --shell /bin/bash --no-create-home $WD_USER \
  && chown -R $WD_USER:$WD_GROUP /usr/src/app

USER $WD_USER:$WD_GROUP

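# python -m venv creates the directory itself, so no mkdir is needed.
# --no-cache-dir on both pip calls keeps pip's download cache out of the layer.
RUN python -m venv venv \
  && . venv/bin/activate \
  && pip install --no-cache-dir Cython \
  && pip install --no-cache-dir --requirement requirements.txt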

Build with:

docker build --tag whisper-diarization .

The rest assumes a Bash shell on Linux, or something compatible with that.

As a host user with UID 1000 and GID 1000 (matching the joe user created in the image), run, for example:

BASE=$HOME/whisper-diarization
mkdir -p $BASE/data
mkdir -p $BASE/HOME_CACHE
mkdir -p $BASE/HOME_CONFIG
APP=/usr/src/app
mv /tmp/recording.mp3 "$BASE/data/"
docker run --rm -it \
  -v $BASE/data:/data \
  -v $BASE/HOME_CONFIG:$APP/.config \
  -v $BASE/HOME_CACHE:$APP/.cache \
  --user joe:joe \
  whisper-diarization \
  bash

Now you're in the container at a non-root shell prompt, presumably. Run:

export HOME=/usr/src/app
source venv/bin/activate
python diarize_parallel.py -a /data/recording.mp3
exit

Now, inspect and manually clean up $BASE/data/recording.txt on the host.
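
The interactive steps above can also be collapsed into one non-interactive command. The following is only a sketch under a few assumptions: the image was built from the Dockerfile above (so the venv lives at /usr/src/app/venv), calling the venv's python directly behaves the same as activating the venv first, and the audio file is already in $BASE/data. The function name diarize_once is made up for illustration and is not part of the project.

```shell
# Same host directories as in the interactive example.
BASE="${BASE:-$HOME/whisper-diarization}"
APP=/usr/src/app

# Hypothetical wrapper (name invented here): transcribes one file from
# $BASE/data without opening a shell inside the container.
# -e HOME=... replaces the "export HOME=/usr/src/app" step, and calling
# the venv's python by absolute path replaces "source venv/bin/activate".
diarize_once() {
  docker run --rm \
    -v "$BASE/data:/data" \
    -v "$BASE/HOME_CONFIG:$APP/.config" \
    -v "$BASE/HOME_CACHE:$APP/.cache" \
    -e "HOME=$APP" \
    --user joe:joe \
    whisper-diarization \
    "$APP/venv/bin/python" diarize_parallel.py -a "/data/$1"
}
```

For example, diarize_once recording.mp3 should leave the transcript next to the audio file in $BASE/data, the same as the interactive session.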

cvette commented 1 year ago

Don't forget to add --gpus all to docker run if you want to use your GPU.
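
A sketch of where the flag goes, assuming the host has an NVIDIA GPU with the NVIDIA Container Toolkit installed and the image is tagged whisper-diarization as in the earlier comment. The function names gpu_shell and gpu_check are invented for illustration, and the check assumes PyTorch ends up in the image's venv through requirements.txt, which is not confirmed in this thread.

```shell
BASE="${BASE:-$HOME/whisper-diarization}"
APP=/usr/src/app

# Hypothetical helper: the same interactive session as in the earlier
# comment, with GPU access added via --gpus all.
gpu_shell() {
  docker run --rm -it --gpus all \
    -v "$BASE/data:/data" \
    -v "$BASE/HOME_CONFIG:$APP/.config" \
    -v "$BASE/HOME_CACHE:$APP/.cache" \
    --user joe:joe \
    whisper-diarization \
    bash
}

# Hypothetical helper: asks PyTorch inside the container whether it can
# see a CUDA device (assumes torch is installed in the image's venv).
gpu_check() {
  docker run --rm --gpus all whisper-diarization \
    "$APP/venv/bin/python" -c 'import torch; print(torch.cuda.is_available())'
}
```

Running gpu_check before a long transcription is a cheap way to find out whether the container can reach the GPU at all.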

transcriptionstream commented 1 year ago

Just released "transcription stream" on GitHub today, which includes a docker image that runs diarize.py. Takes me about 15 minutes to build, but works great and is fast/automated. Would love to get your thoughts: https://github.com/transcriptionstream/transcriptionstream

occult commented 6 months ago

It took me 30 minutes to build and the image is 7.5 GB, but it works. Thanks for sharing :)