Aculeasis / rhvoice-proxy

High-level interface for RHVoice library
GNU General Public License v3.0

Generates a speech stream from text via the RHVoice library without re-initializing the engine. This is very fast and more convenient than calling RHVoice-test.

Supported audio formats: wav, mp3, opus, flac and pcm (raw RHVoice output).

Install

pip3 install rhvoice-wrapper

This package does NOT provide RHVoice itself. You must build (or install) RHVoice, its languages and voices manually. On Windows you must specify the paths for it to work.

rhvoice-wrapper-bin

Warning! rhvoice-wrapper-bin does not work on macOS; install RHVoice manually.

Instead of RHVoice you may install rhvoice-wrapper-bin. This is the best option on Windows. If rhvoice-wrapper-bin is installed, its libraries and data are used automatically.

pip3 install rhvoice-wrapper[rhvoice]

Documentation

First create a TTS object:

from rhvoice_wrapper import TTS

tts = TTS(threads=1)

You may set options when creating the object or through environment variables (in upper case). Options passed to the constructor override environment variables. To use the default value, pass None.

Usage

Start a synthesis generator and get the audio data chunk by chunk:

def generator_audio(text, voice='anna', format_='wav', buff=4096, sets=None):
    with tts.say(text, voice, format_, buff, sets) as gen:
        for chunk in gen:
            yield chunk
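
Because the generator yields plain bytes, the chunks can be written out as they arrive instead of being collected in memory. The following is a minimal sketch: write_chunks is a hypothetical helper, not part of rhvoice-wrapper, and the chunk list is a stand-in for generator_audio(...) so that the sketch runs without RHVoice.

```python
import io


def write_chunks(chunks, stream):
    """Write every chunk to a binary stream and return the byte count."""
    total = 0
    for chunk in chunks:
        stream.write(chunk)
        total += len(chunk)
    return total


# Stand-in for generator_audio('Hello world!'); any iterable of bytes works.
fake_chunks = [b'RIFF', b'\x00' * 8, b'WAVE']
buffer = io.BytesIO()
written = write_chunks(fake_chunks, buffer)
```

With a real engine, the same helper could be given generator_audio('Hello world!') and a file opened in 'wb' mode.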

Or get all audio data in one big chunk:

import subprocess

data = tts.get('Hello world!', format_='wav')
print('data size: ', len(data), ' bytes')
# Play the result with aplay (requires ALSA utilities).
subprocess.check_output(['aplay', '-q'], input=data)

Or just save to a file:

tts.to_file(filename='esperanto.ogg', text='Saluton mondo', voice='spomenka', format_='opus', sets=None)

format_ is the output audio format. It must be present in tts.formats.

voice is the speaker's voice. It must be present in tts.voice_profiles. Passing voice='Voice', sets=None is equivalent to voice=None, sets={'voice_profile': 'Voice'}; if both are given, voice takes priority.

sets may be a dict of synthesis parameters, the same as those accepted by set_params. These parameters apply only to the current phrase. Default: None.

If buff is None or 0, pcm and wav chunks are returned as is (probably a little faster). For other formats the default chunk size (4 KiB) is used.
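
The precedence between voice and sets described above can be sketched as a small function. This only illustrates the stated rule; effective_voice is a hypothetical name and is not how the library resolves the voice internally.

```python
def effective_voice(voice=None, sets=None):
    """Return the voice profile that would apply under the rule above."""
    if voice is not None:
        # An explicit voice has priority over sets['voice_profile'].
        return voice
    if sets:
        return sets.get('voice_profile')
    # Neither given: the engine's current setting stays in effect.
    return None


both = effective_voice('anna', {'voice_profile': 'elena'})
only_sets = effective_voice(None, {'voice_profile': 'elena'})
```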

Text as iterable object

If text is an iterable object, all its fragments are processed in sequence. This is a good way to process very large texts. Remember that a generator cannot be passed to another process. Example:

def _text():
    with open('very_large_book.txt') as fp:
        text = fp.read(5000)
        while text:
            yield text
            text = fp.read(5000)

def generator_audio():
    with tts.say(_text()) as gen:
        for chunk in gen:
            yield chunk
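
Reading a fixed number of characters can cut a word in half at a fragment boundary. A possible variation, shown here only as a sketch, moves each cut back to the last whitespace; read_in_chunks is a hypothetical helper and is tested against an in-memory string instead of a real book.

```python
import io


def read_in_chunks(fp, size=5000):
    """Yield pieces of about `size` characters, cut at whitespace when possible."""
    tail = ''
    while True:
        block = fp.read(size)
        if not block:
            break
        block = tail + block
        cut = max(block.rfind(' '), block.rfind('\n'))
        if cut <= 0:
            # No usable break point: emit the block as is.
            yield block
            tail = ''
        else:
            # Emit up to and including the whitespace, carry the rest over.
            yield block[:cut + 1]
            tail = block[cut + 1:]
    if tail:
        yield tail


sample = 'one two three four'
pieces = list(read_in_chunks(io.StringIO(sample), size=6))
```

The resulting generator can be passed to tts.say() in the same way as _text() above.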

Other methods

set_params

Changes voice synthesizer settings:

tts.set_params(**kwargs)

Allowed parameters: voice_profile, absolute_rate, absolute_pitch, absolute_volume, relative_rate, relative_pitch, relative_volume, punctuation_mode, punctuation_list, capitals_mode, flags. See the RHVoice documentation for details.

Returns True if anything changed, else False.

get_params

Get voice synthesizer settings:

tts.get_params(param=None)

If param is None, returns all settings as a dict; otherwise returns the value of the named parameter, or None if the parameter is not found.
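
set_params and get_params can be combined to change settings for several phrases and then restore them. The sketch below is one way to do that; temporary_params and FakeTTS are hypothetical names, FakeTTS only imitates the two documented methods so the code runs without RHVoice, and it is an assumption that set_params accepts the values returned by get_params unchanged.

```python
from contextlib import contextmanager


@contextmanager
def temporary_params(tts, **kwargs):
    """Apply settings inside a with-block, then restore the old values."""
    # Remember only the values that are about to change.
    saved = {name: tts.get_params(name) for name in kwargs}
    tts.set_params(**kwargs)
    try:
        yield tts
    finally:
        tts.set_params(**saved)


class FakeTTS:
    """Stand-in with get_params/set_params as documented above."""

    def __init__(self):
        self._params = {'absolute_rate': 0.0}

    def get_params(self, param=None):
        if param is None:
            return dict(self._params)
        return self._params.get(param)

    def set_params(self, **kwargs):
        changed = any(self._params.get(k) != v for k, v in kwargs.items())
        self._params.update(kwargs)
        return changed


fake = FakeTTS()
with temporary_params(fake, absolute_rate=0.5):
    inside = fake.get_params('absolute_rate')
after = fake.get_params('absolute_rate')
```

For a single phrase the sets argument of say, get and to_file is simpler, since it applies only to that phrase.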

join

Joins the thread or processes. Don't use the object after join:

tts.join()
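
To make sure join is called even when synthesis raises an exception, the call can be placed in a finally block. A minimal sketch follows; joining and DummyEngine are hypothetical names, and DummyEngine stands in for a TTS object so the code runs without RHVoice.

```python
from contextlib import contextmanager


@contextmanager
def joining(tts):
    """Yield tts and call tts.join() on exit, even if the block raises."""
    try:
        yield tts
    finally:
        tts.join()


class DummyEngine:
    """Stand-in that only records whether join() was called."""

    def __init__(self):
        self.joined = False

    def join(self):
        self.joined = True


engine = DummyEngine()
with joining(engine) as tts_obj:
    pass  # synthesis calls such as tts_obj.get(...) would go here
```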

Properties

Examples

Requirements