cmeraki / indri

omnimodal models

Indri

Multimodal audio LMs for TTS, ASR, and voice cloning

Running locally

Prerequisites

Install dependencies:

pip install -r requirements.txt

Install ffmpeg:

On Linux (Debian/Ubuntu):

sudo apt update -y
sudo apt upgrade -y
sudo apt install ffmpeg -y
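To confirm that ffmpeg is actually reachable before starting the service, a small check like the one below may help. This is only a sketch: the helper name `tool_available` is hypothetical and not part of this repository, and it only verifies that an executable named `ffmpeg` is on `PATH`, not that its version is suitable.

```python
import shutil


def tool_available(name: str) -> bool:
    """Return True if an executable called `name` is found on PATH.

    Hypothetical helper for illustration; not part of the indri codebase.
    """
    return shutil.which(name) is not None


if __name__ == "__main__":
    if tool_available("ffmpeg"):
        print("ffmpeg found")
    else:
        print("ffmpeg not found on PATH; install it before running the service")
```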

Running the service

python -m inference --model_path 11mlabs/indri-0.1-124m-tts --device cuda:0 --port 8000

Defaults:

Choices:

Open http://localhost:8000/docs in a browser to view the API documentation and test the service.
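Once the service has started, one way to check from a script that it is reachable is to request the documentation page. The sketch below assumes the service is listening on `localhost:8000` as in the command above; the helper names `docs_url` and `service_is_up` are hypothetical and not part of this repository. It does not call any inference endpoint, since those are described only on the `/docs` page itself.

```python
import urllib.request


def docs_url(host: str = "localhost", port: int = 8000) -> str:
    """Build the URL of the API documentation page (hypothetical helper)."""
    return f"http://{host}:{port}/docs"


def service_is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET request to `url` succeeds with HTTP 200.

    Connection failures, timeouts, and HTTP error statuses all
    surface as OSError subclasses and are reported as False.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False


if __name__ == "__main__":
    url = docs_url()
    print(f"{url} reachable: {service_is_up(url)}")
```

If the check prints `False`, the server may still be loading the model; retrying after a short wait is a reasonable next step.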