-
Is it possible to achieve a lossless decomposition of a .wav signal using WORLD? If so, what parameters/options should I be using?
Basically, I'm hoping that my .wav signal (x), after decomposing …
-
When we use the pre-trained HiFi vocoder on mel features synthesized by FastSpeech2 or FastPitch, the vocoded audio contains artifacts (noise). After fine-tuning, the audio quality improves significantly. …
-
Is there currently any way to run inference with the model and produce an output?
-
Hello, I have a question about pretrained TTS models for the fr-FR, es-ES and it-IT languages. There are demos only for English, Japanese and Mandarin, and the usage of G2P isn't clear enough. W…
-
**Describe the bug**
I installed TTS 0.4.1 and ran
`tts --model_name tts_models/en/vctk/fast_pitch --text "hello world." --speaker_idx VCTK_p225`
It uses vocoder_models/en/vctk/hifigan_v2 by default…
-
I see that in `baker_preprocess.yaml` the hop size is 300, but in `tacotron2.baker.v1.yaml` the hop size is 256. My understanding is the hop size in `preprocess` should be adjusted according to sample…
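
The relationship between hop size and sampling rate raised above can be checked with a little arithmetic. The following is a minimal sketch, not taken from either config file: the helper names are hypothetical, and the sampling rates 24000 Hz and 22050 Hz are assumptions chosen only for illustration.

```python
def frame_shift_ms(hop_size: int, sample_rate: int) -> float:
    """Duration of one hop in milliseconds."""
    return 1000.0 * hop_size / sample_rate


def equivalent_hop_size(hop_size: int, src_rate: int, dst_rate: int) -> int:
    """Hop size at dst_rate that keeps roughly the same frame shift in time."""
    return round(hop_size * dst_rate / src_rate)


# Assumed values for illustration only (not read from the YAML files).
shift_300_at_24k = frame_shift_ms(300, 24000)
shift_256_at_22k = frame_shift_ms(256, 22050)
hop_for_22k = equivalent_hop_size(300, 24000, 22050)

print(shift_300_at_24k, shift_256_at_22k, hop_for_22k)
```

If the two stages report different frame shifts in milliseconds for the same audio, the features extracted in preprocessing will not line up frame-for-frame with what the model expects.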
-
Hi, thank you very much for your valuable SVS corpus and code.
I strictly followed your instructions up to step "2. Training Example" for SVS (https://github.com/MoonInTheRiver/DiffSinger). Then I …
-
Out-of-the-box training produces results that do not form coherent words.
I initially ran prepare_libri.ipnb with 20 speakers, then ran MFA as instructed, and I encountered size mismatche…
-
How can I modify the fundamental frequency (F0), spectral envelope, and aperiodicity parameters to convert a male voice to a female voice and vice versa?
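
One common heuristic for this kind of conversion is to scale F0 and stretch the spectral envelope along the frequency axis so that formants move up or down. The sketch below is a simplified illustration under stated assumptions, not a definitive method: `shift_voice`, `f0_ratio`, and `formant_ratio` are hypothetical names, the arrays are assumed to have shapes `(T,)` for F0 and `(T, K)` for the spectral envelope and aperiodicity, and aperiodicity is simply left unchanged.

```python
import numpy as np


def shift_voice(f0, sp, ap, f0_ratio=2.0, formant_ratio=1.2):
    """Scale F0 and warp the spectral envelope along frequency.

    f0: (T,) array, 0 for unvoiced frames.
    sp: (T, K) spectral envelope, ap: (T, K) aperiodicity.
    Ratios above 1 raise pitch/formants; ratios below 1 lower them.
    """
    # Unvoiced frames are 0 and remain 0 after scaling.
    f0_new = f0 * f0_ratio

    n_bins = sp.shape[1]
    bins = np.arange(n_bins)
    # Output bin k reads from source bin k / formant_ratio, so a peak at
    # source bin b moves to roughly bin b * formant_ratio.
    src_bins = bins / formant_ratio

    sp_new = np.empty_like(sp)
    for t in range(sp.shape[0]):
        # np.interp clamps to the edge values outside the source range.
        sp_new[t] = np.interp(src_bins, bins, sp[t])

    # Assumption: aperiodicity is passed through unchanged.
    return f0_new, sp_new, ap.copy()
```

The ratios themselves are a matter of tuning by ear; the defaults here are placeholders rather than recommended values.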
-
What loss_total value (and other metrics) should I be looking for to verify that training is proceeding correctly?
I'm currently at step 3.792k after 3 hours 12 minutes, with a total loss of 0.2972.