-
Hi, I read about your multi-speaker implementation of Tacotron2. Do I understand correctly that different speakers correspond to different text inputs, and that you did not use a speaker embedding? If so, the speak…
-
I have done a lot of training on different self-made datasets (typically having around 3 hours of audio across a few thousand .wav files, all 22050 Hz) using Tacotron, starting from a pretrained LJSpe…
-
RuntimeError: Error(s) in loading state_dict for Tacotron:
size mismatch for encoder.embedding.weight: copying a param with shape torch.Size([70, 512]) from checkpoint, the shape in current m…
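A size mismatch on `encoder.embedding.weight` usually means the current model was built with a different symbol set than the pretrained checkpoint (the checkpoint's embedding has 70 rows, one per symbol). A common workaround is to drop the mismatched parameters and load the rest with `load_state_dict(..., strict=False)`. A minimal sketch of the filtering step, using NumPy arrays as stand-ins for tensors and an assumed 75-symbol current model:

```python
import numpy as np  # stand-in for torch tensors in this sketch


def filter_compatible(checkpoint_state, model_state):
    # Keep only parameters whose name exists in the current model AND
    # whose shape matches; everything else is dropped so the remaining
    # weights can be loaded with load_state_dict(..., strict=False)
    # and the mismatched layers are re-initialised.
    return {k: v for k, v in checkpoint_state.items()
            if k in model_state and v.shape == model_state[k].shape}


# Pretrained checkpoint built for 70 symbols; current model assumes 75.
ckpt = {"encoder.embedding.weight": np.zeros((70, 512)),
        "decoder.attn.weight": np.zeros((128, 512))}
model = {"encoder.embedding.weight": np.zeros((75, 512)),
         "decoder.attn.weight": np.zeros((128, 512))}

kept = filter_compatible(ckpt, model)
# only the attention weight survives; the embedding is trained from scratch
```

In real code the two dicts would come from `torch.load(path)` and `model.state_dict()`, and you would finish with `model.load_state_dict(kept, strict=False)`.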
-
Thanks for your great work, but I found that if I set the hyperparameter `use_gst=False` and run, the behaviour seems different from my understanding of Tacotron 1. The relevant part of tacotron.py is shown here.
```pyth…
-
Hi,
Can you share the alignment graphs that you are obtaining for your audio samples? For most of my alignments, the y-axis is about half of the x-axis. Is there a reason why this is happening? In …
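One possible cause (an assumption, not a confirmed diagnosis for this question): the Tacotron decoder emits `outputs_per_step` (r) mel frames per step, so the decoder axis of the alignment matrix is the mel-frame count divided by r; with r = 2 that axis comes out at roughly half the length of the raw frame axis. A sketch of the expected shape (the function name is illustrative):

```python
def alignment_shape(n_mel_frames, n_text_symbols, outputs_per_step=2):
    # One decoder step emits `outputs_per_step` mel frames, so the
    # decoder axis is the mel-frame count divided by r (rounded up);
    # the other axis is simply the number of encoder input symbols.
    decoder_steps = -(-n_mel_frames // outputs_per_step)  # ceil division
    return decoder_steps, n_text_symbols


# e.g. a 400-frame utterance with 80 input symbols and r = 2
# yields a 200 x 80 alignment matrix
```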
-
Running the command
python synthesizer_train.py mandarin {your own path}\SV2TTS\synthesizer
with
_characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890!\'(),-.:;? '
raises an error:
File "synthesizer_train.py", l…
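A common cause of errors after editing `_characters` is that the symbol count no longer matches the embedding table a pretrained checkpoint was built with; each symbol gets one embedding row (plus any specials such as a pad or EOS token, which vary by repo). A quick sanity check:

```python
# The edited character set from the issue above.
_characters = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz1234567890!\'(),-.:;? '

# One embedding row per symbol; repos typically add a few specials
# (e.g. _pad, ~eos) on top of this count, so compare this number
# against the first dimension of encoder.embedding.weight in the
# checkpoint you are loading.
n_symbols = len(_characters)
```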
-
Thanks for your nice work. I have trained the model on Blizzard 2013 dataset. The synthesized files from 185k and 385k checkpoints are available in the following link. I used the samples from LJ-Speec…
-
When doing tests, I found that each time I ran synthesize.py (with the same text and reference audio), I got different results (namely different synthesized wavs). After looking through the code, I didn't find…
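One likely explanation: many Tacotron implementations keep prenet dropout enabled even at inference time (as described in the Tacotron 2 paper), so repeated runs naturally differ. To check whether stochastic layers are the cause, you can fix the random seeds and see if the outputs become identical. A generic sketch; in the actual code you would also seed the framework's RNG (`torch.manual_seed` or `tf.set_random_seed`):

```python
import random

import numpy as np


def seed_everything(seed=1234):
    # Fix the Python and NumPy RNGs so repeated runs draw the same
    # random numbers; a real run would also seed torch / tensorflow.
    random.seed(seed)
    np.random.seed(seed)


seed_everything(0)
a = np.random.rand(3)
seed_everything(0)
b = np.random.rand(3)
# the two draws are identical after re-seeding
```

If seeding makes the wavs repeatable, the variation is coming from dropout (or other sampling) at inference, not from a bug.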
-
I have been working on TTS for several months now, and my (20+ hour) dataset is driving me crazy. At training time, keithito/tacotron and Rayhane-mamah/Tacotron2 are able to align fine, but when I sw…
-
Hi @syang1993, I tried running a 170K+ training test, and the training output is the best among the many Tacotron implementations I have tried.
But I noticed that with out-of-collection data the output alignment is always mas…