Multimodal audio LMs for TTS, ASR, and voice cloning
Install dependencies:
pip install -r requirements.txt
Install ffmpeg:
For Linux (Debian/Ubuntu):
sudo apt update -y
sudo apt upgrade -y
sudo apt install ffmpeg -y
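To confirm that ffmpeg is on the PATH after installation, a quick check along these lines may help. This is a minimal sketch using only the Python standard library; the helper name `ffmpeg_available` is hypothetical and is not part of this repository.

```python
import shutil


def ffmpeg_available(binary: str = "ffmpeg") -> bool:
    """Return True if the given executable can be found on the PATH.

    Hypothetical helper for illustration; not part of the project.
    """
    return shutil.which(binary) is not None


if __name__ == "__main__":
    if ffmpeg_available():
        print("ffmpeg found at", shutil.which("ffmpeg"))
    else:
        print("ffmpeg not found; install it before running inference")
```

If the check fails right after installation, opening a new shell so the PATH is refreshed is usually worth trying first.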
Start the inference server:
python -m inference --model_path 11mlabs/indri-0.1-124m-tts --device cuda:0 --port 8000
Defaults:
device: cuda:0
port: 8000
Choices:
model_path: HuggingFace collection
Once the server is running, open http://localhost:8000/docs to see the API documentation and test the service.
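Before opening the docs page in a browser, it can be handy to check from a script that the server is answering. The sketch below assumes the default host and port shown above and that the `/docs` page returns HTTP 200 once the service is ready; the function names `docs_url` and `server_is_up` are hypothetical and not part of this repository.

```python
import urllib.request


def docs_url(host: str = "localhost", port: int = 8000) -> str:
    """Build the URL of the API documentation page (defaults match the README)."""
    return f"http://{host}:{port}/docs"


def server_is_up(url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET request to `url` succeeds with status 200.

    Connection errors, timeouts, and HTTP error responses are all
    subclasses of OSError in the standard library and are treated
    as "not up".
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False


if __name__ == "__main__":
    url = docs_url()
    state = "reachable" if server_is_up(url) else "not reachable"
    print(f"{url} is {state}")
```

If the server was started with a different `--port`, pass the same value to `docs_url`, for example `docs_url(port=9000)`.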