-
Can I use this repo to train a new TTS model in another language?
How many hours of audio and transcripts do I need?
Should the text include diacritical marks?
-
**Describe the bug**
Trained for 100 epochs, took a snapshot, and tried inference with the pretrained WaveGlow. The validation loss is 3.09584379196167 at the end, but all I get is noise with nothing rec…
-
I tried setting this up on a fresh Linux install and found that the steps in the README are incomplete for setting it up from scratch. The steps mentioned are:
```
npm install
npm start
# source $VIR…
-
I've been trying to set up a speech model on an Xavier NX, and I've been able to get Tacotron2/WaveGlow running; however, the models use quite a lot of memory. I've been looking to use…
-
1. The documentation for [torch.nn.CTCLoss](https://pytorch.org/docs/stable/generated/torch.nn.CTCLoss.html#torch.nn.CTCLoss) says the input is **logarithmized probabilities**, obtained with `torch.nn.funct…
-
Dear @alancucki,
How should I calculate the pitch mean, std, fmin, and fmax, given an estimated pitch of shape [1 x mel_frames]?
Yerzhan.
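
One possible way to approach the question above, offered as a sketch rather than the repository's own method: pool the voiced frames from every utterance's pitch track and compute the statistics over that pool. The function name `pitch_stats` is hypothetical, and the code assumes that unvoiced frames are stored as `0` or `NaN`, which may not match how a given pipeline encodes them.

```python
import numpy as np


def pitch_stats(pitch_tracks):
    """Aggregate pitch statistics over the voiced frames of several tracks.

    pitch_tracks: iterable of arrays shaped [1, mel_frames], values in Hz.
    Assumption (not from the original post): unvoiced frames are 0 or NaN.
    """
    voiced = []
    for track in pitch_tracks:
        flat = np.asarray(track, dtype=np.float64).reshape(-1)
        # Keep only finite, strictly positive values (the voiced frames).
        mask = np.isfinite(flat) & (flat > 0)
        voiced.append(flat[mask])

    voiced = np.concatenate(voiced) if voiced else np.empty(0)
    if voiced.size == 0:
        raise ValueError("no voiced frames found")

    return {
        "mean": float(voiced.mean()),
        "std": float(voiced.std()),   # population standard deviation
        "fmin": float(voiced.min()),  # lowest observed voiced pitch
        "fmax": float(voiced.max()),  # highest observed voiced pitch
    }


# Example with two short, made-up tracks.
tracks = [np.array([[0.0, 100.0, 200.0]]), np.array([[np.nan, 300.0]])]
stats = pitch_stats(tracks)
```

Note that `fmin` and `fmax` here are the observed extremes of the data, which is a different thing from the search range passed to a pitch estimator. Whether the intended values are per-speaker or dataset-wide is left open by the question.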
-
Hello!
I have a question about where the attention prior is added.
You add the attention prior before calculating the forward-sum loss, like this: https://github.com/imdanboy/jets/blob/44e3dbcb9e…
-
Related to **FastPitch1.1/pytorch**
**Describe the bug**
In `data_function.py`, `def estimate_pitch( ... )` is problematic: it uses `librosa.pyin` to estimate the audio pitch, but without any custom parameters,…
-
Hi, is there any simple way to get the timestamps for each word uttered in the resulting audio file? It would be really helpful if anyone could suggest changes to make within the `inference.ipynb` fil…
-
As rafaelvalle mentioned here: https://github.com/NVIDIA/tacotron2/issues/336#issuecomment-649724985, dropout causes the Tacotron model to "say the same phrase in multiple ways". In theory, this is a …