ambuda-org / ambuda

Main application code for Ambuda, a breakthrough Sanskrit library (ambuda.org)
https://ambuda.org
MIT License

Add UI for text to speech (TTS) #140

Open vvasuki opened 2 years ago

vvasuki commented 2 years ago

cc @avinashvarna We have access to a pretty good TTS facility (https://www.ragavera.com/tts/sg-kan-samples) for offline, non-commercial Sanskrit use, with which we can generate a fairly high-quality MP3 for each shloka or sentence. These MP3s can be numbered consistently and stored (e.g. on archive.org). It would be good to have some facility for playback.

In particular, something like https://avinashvarna.github.io/audio_alignment/corpus/ramayana/1.001/ would be wonderful. I am fond of the "repeat each shloka twice/thrice before proceeding to the next" mode.
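To make the "repeat each shloka before proceeding" mode concrete, here is a minimal sketch of how a player could build its playback order. The function name `repeat_queue` and the block IDs are hypothetical, used only for illustration; nothing here reflects existing Ambuda code.

```python
from typing import Iterable, Iterator


def repeat_queue(block_ids: Iterable[str], repeats: int = 2) -> Iterator[str]:
    """Yield each block ID `repeats` times before moving to the next one."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    for block_id in block_ids:
        for _ in range(repeats):
            yield block_id


# Hypothetical shloka IDs; a real player would map each ID to its audio file.
queue = list(repeat_queue(["1.1.1", "1.1.2"], repeats=2))
print(queue)  # ['1.1.1', '1.1.1', '1.1.2', '1.1.2']
```

A generator keeps this independent of how the audio is actually played, so the same ordering could drive a web player or an offline playlist export.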

epicfaace commented 2 years ago

Do we also want to support existing (non-TTS) audio? For example, https://avinashvarna.github.io/audio_alignment/corpus/ramayana/1.001/ seems to have real human voice audio (not TTS).

vvasuki commented 2 years ago

> Do we also want to support existing (non-TTS) audio? For example, https://avinashvarna.github.io/audio_alignment/corpus/ramayana/1.001/ seems to have real human voice audio (not TTS).

Those are not files of the critical edition. In some cases (kAlidAsa's verses), closely matching human audio is available, though it would need to be split shloka by shloka.

akprasad commented 2 years ago

Yes, this would be a wonderful addition! Once we have support for translations and commentaries, we can get a better sense of the approach here.

The simplest approach is to generate all audio files and their per-block alignments offline, then store the audio segments on a cloud file system.
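As a rough sketch of what the offline step might produce, the snippet below builds a JSON manifest that maps each text block to its audio segment. Everything in it is an assumption for illustration: the `Segment` fields, the `build_manifest` helper, the `<base_url>/<text>/<block_id>.mp3` naming scheme, and the `example.org` base URL are all hypothetical, not an agreed design.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable, Tuple


@dataclass
class Segment:
    block_id: str  # hypothetical block identifier, e.g. "1.1.1"
    start: float   # offset into the source recording, in seconds
    end: float     # end offset, in seconds
    url: str       # where the pre-cut segment would be stored


def build_manifest(
    text_slug: str,
    alignments: Iterable[Tuple[str, float, float]],
    base_url: str,
) -> dict:
    """Turn (block_id, start, end) alignments into a manifest dict."""
    segments = []
    for block_id, start, end in alignments:
        if end <= start:
            raise ValueError(f"empty or negative segment for block {block_id}")
        # Assumed naming scheme: <base_url>/<text>/<block_id>.mp3
        url = f"{base_url.rstrip('/')}/{text_slug}/{block_id}.mp3"
        segments.append(Segment(block_id, start, end, url))
    return {"text": text_slug, "segments": [asdict(s) for s in segments]}


manifest = build_manifest(
    "ramayana",
    [("1.1.1", 0.0, 7.5), ("1.1.2", 7.5, 15.2)],
    "https://example.org/audio",  # placeholder, not a real bucket
)
print(json.dumps(manifest, indent=2))
```

With a static manifest like this, the site would only need to look up a block's URL at render time, and no audio processing would happen on the server.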