CharlesAverill / ttsdg

A package to automate TTS data generation
GNU General Public License v3.0

TTSDG

TTSDG, or Text-To-Speech Data Generator, automates the simple but frustrating task of generating large amounts of TTS data for uses such as machine learning. TTSDG provides an easy-to-use class that generates speech offline, in large batches, with control over which of your installed system voices are used. It randomizes the volume, speed, and voice of each sample and avoids producing duplicates.
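To illustrate the idea of randomized, duplicate-free samples, here is a minimal standard-library sketch. It is not TTSDG's actual implementation: the function name sample_settings, the voice names, and the rounding of volume to two decimals are all assumptions made for the example.

```python
import random


def sample_settings(voices, volume_range, wpm_range, count, seed=None):
    """Draw up to `count` distinct (voice, volume, wpm) combinations.

    Hypothetical helper for illustration only; TTSDG may work differently.
    """
    rng = random.Random(seed)
    seen = set()
    samples = []
    # Bound the number of attempts so a small parameter space
    # cannot cause an infinite loop.
    max_attempts = count * 100
    attempts = 0
    while len(samples) < count and attempts < max_attempts:
        attempts += 1
        combo = (
            rng.choice(voices),
            round(rng.uniform(*volume_range), 2),  # volume within the range
            rng.randint(*wpm_range),               # words per minute
        )
        # Skip any combination that has already been produced.
        if combo not in seen:
            seen.add(combo)
            samples.append(combo)
    return samples


# Placeholder voice names; real names depend on the voices installed.
settings = sample_settings(
    ["voice_a", "voice_b"], (0.5, 1.0), (200, 300), count=5, seed=0
)
for voice, volume, wpm in settings:
    print(voice, volume, wpm)
```

Tracking already-used combinations in a set keeps the duplicate check cheap, and the attempt limit means the function returns fewer samples rather than hanging when the ranges are too narrow to supply enough distinct combinations.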

TTSDG uses pyttsx3 to generate the audio and pydub to convert it into other formats. Every format that pydub supports, such as WAV, MP3, and AIFF, is available in TTSDG.

Installation

TTSDG is available through pip:

python3 -m pip install ttsdg

Usage

from ttsdg import TTSDG

for word in ["Apple", "Orange", "Banana"]:
    print(word)

    gen = TTSDG(verbose=True)
    gen.volume_range = [.5, 1.0]
    gen.wpm_range = [200, 300]

    gen.generate(word, 100, out_format="wav")

    # A bug in pyttsx3 can sometimes cause generation to hang
    # when it runs inside a loop. Deleting the generator at the
    # end of each iteration works around this.
    del gen

Methods