-
### Description
The goal is to develop a Tibetan text-to-speech (TTS) model that can convert Tibetan text into Tibetan speech. This project involves training a TTS model using filtered good audio qual…
-
(tf-gpu) [pranaw@login expressive_tacotron-master]$ python train.py
Traceback (most recent call last):
  File "train.py", line 101, in <module>
    g = Graph(); print("Training Graph loaded")
  File "tra…
-
Hi,
In DeepMind's paper "Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron", the authors say they use GMM attention because it generalizes better to longer utterances. I …
-
Hello,
First, I apologize if this is not a proper channel to ask about your paper "MELLOTRON: MULTISPEAKER EXPRESSIVE VOICE SYNTHESIS BY CONDITIONING ON RHYTHM, PITCH AND GLOBAL STYLE TOKENS Rafael …
-
Hi!
Thanks for this great implementation!
I'm a speech scientist, and I'm not an expert in neural networks. I'm working on expressive speech synthesis, and I'm currently wondering about the po…
-
**Problem statement**
Previous TTS models often produced robotic-sounding speech, mispronounced words, lacked emotional nuance, struggled with contextual understanding, offered limited language suppo…
-
### advice:
1. **302 url**: some content
2. **use plugin**: you can use the [Ingest Attachment Plugin](https://www.elastic.co/guide/en/elasticsearch/plugins/current/ingest-attachment.html)
for exam…
-
I printed the mel spectrogram values before [normalization](https://github.com/Rayhane-mamah/Tacotron-2/blob/master/datasets/audio.py#L195) and found that the floor and ceiling of tho…
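
For context, here is a minimal sketch of what a symmetric min-max normalization with clipping typically looks like for dB-scaled mel values. It is not copied from the linked file: the function name `normalize_db` and the constants `min_level_db=-100.0` and `max_abs=4.0` are assumptions chosen for illustration. The point is that any input below the floor or above 0 dB gets clipped to the edge of the output range.

```python
def normalize_db(values_db, min_level_db=-100.0, max_abs=4.0):
    """Map dB values from [min_level_db, 0] to [-max_abs, max_abs], clipping outliers.

    Hypothetical helper: name and default constants are illustrative assumptions.
    """
    normalized = []
    for v in values_db:
        # Linear rescale: min_level_db -> -max_abs, 0 dB -> +max_abs.
        scaled = (2 * max_abs) * ((v - min_level_db) / (-min_level_db)) - max_abs
        # Anything outside the expected dB range is clipped to the boundary.
        normalized.append(max(-max_abs, min(max_abs, scaled)))
    return normalized


if __name__ == "__main__":
    # -120 dB is below the assumed floor and 20 dB is above the assumed ceiling.
    print(normalize_db([-100.0, -50.0, 0.0, -120.0, 20.0]))
```

If the printed pre-normalization values regularly fall outside the assumed range, a large share of the frames would end up pinned at the boundaries after this step.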
-
What kind of algorithm did you use to segment such long audio files? A forced aligner may have trouble segmenting a long recording in a single pass.
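
To make the question concrete, one common workaround is to pre-split the recording at long pauses and align each piece separately. The sketch below is only an illustration of that idea, not a claim about what the authors did: `split_on_silence`, `threshold`, and `min_silence_frames` are hypothetical names, and the input is assumed to be a list of per-frame energies computed beforehand.

```python
def split_on_silence(energies, threshold, min_silence_frames):
    """Return (start, end) frame ranges of speech, with end exclusive.

    Hypothetical helper: a segment is closed once at least
    `min_silence_frames` consecutive frames fall below `threshold`.
    """
    segments = []
    start = None      # first frame of the segment currently being built
    silent_run = 0    # consecutive low-energy frames seen inside a segment
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silence_frames:
                # Close the segment at the first frame of the pause.
                segments.append((start, i - silent_run + 1))
                start = None
                silent_run = 0
    if start is not None:
        # Trailing segment: drop any short pause at the very end.
        segments.append((start, len(energies) - silent_run))
    return segments


if __name__ == "__main__":
    frame_energies = [0, 0, 5, 6, 0, 7, 0, 0, 0, 8, 9, 0]
    print(split_on_silence(frame_energies, threshold=1, min_silence_frames=2))
```

Short pauses (here, a single low-energy frame) stay inside a segment, so words are not cut apart; only pauses at least `min_silence_frames` long produce a split.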