This is an audiobook creator using Tortoise TTS
With this repo you can generate super high-quality audiobooks using AI models running locally on your computer, absolutely FREE. (No internet connection is needed.)
This repo is a fork of the tortoise-fast repo: https://github.com/152334H/tortoise-tts-fast.git
which was created from the repo: https://github.com/neonbjb/tortoise-tts.git
BIG THANKS TO THE ORIGINAL CREATOR OF TORTOISE AND THE CREATOR OF TORTOISE FAST!!!
I changed quite a few things:
- How text is split into chunks
- Added the ability to add pauses to the generated audio. A single or double line break creates a short pause, and more than two line breaks create a long pause that can be configured from the Streamlit GUI
- Changed the UI quite a bit to optimize it for audio book creation
- Ability to load in a file
- Added a self-correction feature that tries to fix generation issues automatically (word and character differences, and pitch control)
- Finetuned tortoise settings
- Changed terminal logs to show generation details more clearly
- Saves each generated file as MP3 and keeps the earlier files when a new generation starts
- Changed how audio files are saved to avoid any quality loss
- Save/reset settings
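The pause rule from the list above can be sketched in a few lines of Python. This is only an illustration, not the repo's actual splitting code: the function name `split_with_pauses` and the two default pause lengths are hypothetical (in the app the pause length is set from the Streamlit GUI), and the sketch assumes plain `\n` line endings.

```python
import re

# Hypothetical pause lengths in seconds, chosen only for this illustration.
SHORT_PAUSE_S = 0.5
LONG_PAUSE_S = 1.5


def split_with_pauses(text, short_pause=SHORT_PAUSE_S, long_pause=LONG_PAUSE_S):
    """Split text at line breaks and return (chunk, pause_after_seconds) pairs.

    One or two consecutive line breaks give a short pause,
    more than two give a long pause. The last chunk gets no pause.
    """
    # The capturing group keeps each run of newlines in the result,
    # so chunks sit at even indices and newline runs at odd indices.
    parts = re.split(r"(\n+)", text.strip())
    result = []
    for i in range(0, len(parts), 2):
        chunk = parts[i].strip()
        if not chunk:
            continue  # skip whitespace-only lines
        if i + 1 < len(parts):
            breaks = len(parts[i + 1])
            pause = long_pause if breaks > 2 else short_pause
        else:
            pause = 0.0
        result.append((chunk, pause))
    return result
```

For example, `split_with_pauses("A.\nB.\n\nC.\n\n\nD.")` returns `[("A.", 0.5), ("B.", 0.5), ("C.", 1.5), ("D.", 0.0)]`: only the triple line break after `C.` produces the long pause.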
Hardware used for testing
NVIDIA RTX 3090 (with CUDA 11.7)
NVIDIA RTX 4090 (with CUDA 11.8)
Installation
I only tested it on Ubuntu 22.04 Linux.
Here are the steps:
- Install the latest proprietary NVIDIA driver
- Install Ubuntu packages
sudo apt install git git-lfs perl make ffmpeg nvidia-cuda-toolkit nvidia-cudnn libportaudio2
- Download Miniconda from: https://docs.conda.io/projects/miniconda/en/latest/miniconda-install.html
- Install it for the current user, without sudo
- Restart computer
- Clone this repo
git clone https://github.com/georgecsaszargit/tortoise_audio_book_creator.git
- cd into the repo folder (the one containing environment-new.yml and requirements-rtx3090.txt, which the next steps use)
- Create conda env:
conda env create -f environment-new.yml
- Activate conda:
conda activate tortoiseaudiobook
- Install python packages using pip:
python -m pip install -r requirements-rtx3090.txt
- Install tortoise module:
python -m pip install -e .
- ONLY ON AN RTX 4090, also run the following line:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
- Download the finetuned models from https://huggingface.co/csdzs/tortoise-audiobook-creator-finetuned-models and place them in the ~/.cache/tortoise/models/ folder (these models are better than the original Tortoise models):
git clone https://huggingface.co/csdzs/tortoise-audiobook-creator-finetuned-models
cd tortoise-audiobook-creator-finetuned-models
git lfs fetch --all
git lfs checkout
mkdir -p ~/.cache/tortoise/models
cp * ~/.cache/tortoise/models
- cd one level up and run Tortoise:
cd ..
streamlit run scripts/app.py
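If the app fails to start or generation is unexpectedly slow, a quick check of the environment can help narrow things down. The script below is a minimal sketch and not part of this repo: the names `cuda_report` and `list_models` are hypothetical. It reports whether PyTorch in the active conda env can see a CUDA GPU, and lists the files in the models folder from the steps above. The size check is only a heuristic: Git LFS pointer files are tiny text stubs, so a model file of under 1024 bytes probably means the LFS download did not complete.

```python
import importlib.util
from pathlib import Path

# Folder used in the installation steps above.
MODELS_DIR = Path.home() / ".cache" / "tortoise" / "models"


def cuda_report():
    """Describe whether PyTorch is installed and can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment."
    import torch

    if not torch.cuda.is_available():
        return f"PyTorch {torch.__version__} found, but CUDA is not available."
    return (
        f"PyTorch {torch.__version__}, CUDA {torch.version.cuda}, "
        f"GPU: {torch.cuda.get_device_name(0)}"
    )


def list_models(models_dir=MODELS_DIR):
    """Return (file name, size in bytes) pairs, or [] if the folder is missing."""
    models_dir = Path(models_dir)
    if not models_dir.is_dir():
        return []
    return sorted(
        (p.name, p.stat().st_size) for p in models_dir.iterdir() if p.is_file()
    )


if __name__ == "__main__":
    print(cuda_report())
    for name, size in list_models():
        # Heuristic only: very small files may be Git LFS pointers, not weights.
        note = "  <-- possibly an LFS pointer" if size < 1024 else ""
        print(f"{size / 1e6:10.1f} MB  {name}{note}")
```

Saved under any name (for example check_install.py, a hypothetical file name), it can be run with `python check_install.py` while the tortoiseaudiobook env is active.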
Instructional video: https://youtu.be/BCCMB0p4fC8?si=5pHqHb8nZCSa_ExO