ministryofjustice / analytical-platform

Analytical Platform • This repository is defined and managed in Terraform
https://docs.analytical-platform.service.justice.gov.uk
MIT License

✨ Install ffmpeg to enable speech-to-text functionality #6131

Open jtattersall09403 opened 4 days ago

jtattersall09403 commented 4 days ago

Describe the feature request.

I would like to use the OpenAI Whisper model (https://huggingface.co/openai/whisper-large-v3-turbo) in Python to transcribe audio to text, in VS Code on the AP. This requires ffmpeg (https://www.ffmpeg.org/), and installing ffmpeg on Linux requires root access.
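
For reference, a minimal sketch of how the missing dependency could be confirmed from a Python session on the AP before trying to load Whisper. The helper name `find_ffmpeg` is hypothetical, and the check assumes ffmpeg would be exposed as an `ffmpeg` executable on the `PATH`; it only looks for the binary and does not test transcription itself.

```python
import shutil
from typing import Optional


def find_ffmpeg(binary: str = "ffmpeg") -> Optional[str]:
    """Return the full path of the ffmpeg executable, or None if it is not on PATH.

    `binary` defaults to "ffmpeg"; the parameter exists only so a different
    executable name can be checked (an assumption, not an AP convention).
    """
    return shutil.which(binary)


if __name__ == "__main__":
    path = find_ffmpeg()
    if path is None:
        print("ffmpeg not found on PATH; audio decoding for Whisper is likely to fail")
    else:
        print(f"ffmpeg found at {path}")
```

If this reports that ffmpeg is not found, that would be consistent with the problem described here.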

Describe the context.

My team are about to start a project to transcribe and translate audio. The current state of the art in this space is the OpenAI Whisper model. Unlike OpenAI's other models (the GPT family), Whisper can be downloaded and run locally. It is critical for our project that we can test Whisper on our use case. To do that, we need to run it on the AP, and for that we need ffmpeg :)

Value / Purpose

Enable use of OpenAI Whisper, removing a critical bottleneck for a prisons data science project.

User Types

Data Scientists in Prisons Data Science.