Macoron / piper.unity

Run a text-to-speech model (Piper) in Unity3d on your local machine.
GNU General Public License v3.0

piper.unity

This is a Unity3d port of the Piper text-to-speech (TTS) library. It provides fast, local, high-quality speech generation for multiple languages. Inference runs on Unity Sentis.

Supports only Windows x86-64.

Samples

https://github.com/Macoron/piper.unity/assets/6161335/7ab818f0-acd6-46cf-ab5b-09c847aca2dc

"en_US-lessac-medium" model tested on English text

https://github.com/Macoron/piper.unity/assets/6161335/fd6a5826-23e7-4dca-acdd-00469d71c882

"ru_RU-ruslan-medium" model tested on Russian text

Getting started

  1. Clone this repository and open it as a regular Unity project.
  2. Download a .onnx model file, such as en_US-lessac-medium.onnx.
  3. Place it somewhere inside your Assets folder. Sentis should automatically convert it into a model asset.
  4. Open the Assets/Piper/Samples/PiperSample scene in the Unity Editor.
  5. Find the Piper GameObject and set the PiperManager.Model field to the model you downloaded.
  6. Press Play and test speech generation.

You can find more Piper models here. Each model comes with a model card describing its training dataset and license (for example, the en_US-lessac-medium model card).

You also need to set the correct "voice" (language code) and sample rate for the model. Both values can be found in the JSON file located next to the model, such as en_US-lessac-medium.onnx.json.
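If you prefer not to hunt through the JSON by hand, the two values can be pulled out with a short script run outside Unity. The sketch below is only an illustration: it assumes the config stores the language code under `espeak.voice` and the rate under `audio.sample_rate`, which should be checked against the actual file for your model. The helper name `read_voice_settings` and the inline sample config are hypothetical and not part of piper.unity.

```python
import json

# Inline stand-in for a model's .onnx.json file.
# ASSUMPTION: the key layout ("espeak.voice", "audio.sample_rate") and the
# values shown here are illustrative; verify them against the real file.
SAMPLE_CONFIG = """
{
  "audio": {"sample_rate": 22050},
  "espeak": {"voice": "en-us"}
}
"""


def read_voice_settings(config_text):
    """Return (voice, sample_rate) parsed from the text of a model config."""
    config = json.loads(config_text)
    voice = config["espeak"]["voice"]
    # Coerce to int in case the rate is stored as a string.
    sample_rate = int(config["audio"]["sample_rate"])
    return voice, sample_rate


voice, sample_rate = read_voice_settings(SAMPLE_CONFIG)
print(f"voice={voice}, sample rate={sample_rate} Hz")
```

To use it with a real model, read the file's contents (for example with `open(path, encoding="utf-8").read()`) and pass the text to `read_voice_settings`, then copy the two values into the corresponding fields in the Unity Editor.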

License

piper.unity is released under the GPLv3 license.

It uses eSpeak NG compiled libraries and data, which are under the GPLv3 license.

It also uses a compiled library from the Piper Phonemization fork, which is under the MIT license.

Models aren't included in this repository. Please contact the original model's creators to learn more about their licenses.