grammatek / simaromur

Icelandic TTS (text-to-speech) service for Android
Apache License 2.0

WIP: Feat/experimental vits #146

Closed lumpidu closed 7 months ago

lumpidu commented 8 months ago

This is a placeholder pull request. I will clean it up and amend it appropriately once we have the new voice is-steinn-xs.onnx, which is based on our own phonemization.

Implement a preliminary runtime for VITS voice is-steinn-medium.onnx

The voice 'is-steinn-medium.onnx' uses phonemization based on the ancient eSpeak IPA dialect and was trained purely on eSpeak phonemization. As we are not using eSpeak inside Símarómur, this PR takes a naive approach: it emulates the eSpeak IPA dialect and adapts the model inputs with the appropriate phoneme conversions, such as padding every symbol with 0 and adding BOS and EOS markers.
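
To illustrate the input adaptation described above, here is a minimal sketch in Python (not the code from this PR). The symbol table, the BOS/EOS ids, the phoneme conversion map, and the exact position of the padding are all assumptions made up for the example; the real values come from the voice's own configuration. Only the idea of interleaving every symbol with 0 and wrapping the sequence in BOS and EOS is taken from the description.

```python
# Sketch only: every id and mapping below is hypothetical.
PAD_ID = 0  # the description says every symbol is padded with 0
BOS_ID = 1  # assumed id for the beginning-of-sequence marker
EOS_ID = 2  # assumed id for the end-of-sequence marker

# Made-up excerpt of a symbol-to-id table for an eSpeak-style IPA inventory.
SYMBOL_TO_ID = {"h": 3, "a": 4, "l": 5, "ou": 6}

# Made-up conversion from the app's own phoneme symbols to eSpeak-style IPA.
TO_ESPEAK_IPA = {"O": "ou"}


def to_model_input(phonemes):
    """Convert phonemes to ids: BOS, then each symbol followed by PAD, then EOS."""
    ids = [BOS_ID]
    for phoneme in phonemes:
        # Fall back to the phoneme itself when no conversion is listed.
        symbol = TO_ESPEAK_IPA.get(phoneme, phoneme)
        # Raises KeyError for symbols missing from the table.
        ids.append(SYMBOL_TO_ID[symbol])
        ids.append(PAD_ID)
    ids.append(EOS_ID)
    return ids
```

Under these assumptions, `to_model_input(["h", "a", "l", "O"])` yields a sequence that starts with the BOS id, alternates symbol ids with 0, and ends with the EOS id. Whether a pad also precedes the first symbol depends on how the model was trained, so that detail would need to be checked against the training setup.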

The resulting voice quality is quite acceptable for demo purposes, and the runtime performance looks promising.

lumpidu commented 7 months ago

Superseded by #151