Closed sciai-ai closed 3 years ago
Hi @sciai-ai ,
thanks for reporting this! I've just checked both notebooks and the paths to checkpoints need to be updated:
fastp = '../pretrained_models/nvidia_fastpitch_200518.pt'
waveg = '../pretrained_models/nvidia_waveglow256pyt_fp16.pt'
Other than that, the code works well.
Could you please check again with those checkpoints? If you're getting noise, chances are that an unconverged model is being loaded instead.
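A quick way to rule out a wrong path before loading anything is to check that both files exist. This is only a minimal sketch: `missing_checkpoints` is a hypothetical helper, not part of the notebooks, and the two paths are the ones quoted above.

```python
from pathlib import Path


def missing_checkpoints(paths):
    """Return the checkpoint paths that do not point to an existing file.

    Hypothetical helper for illustration; not part of the FastPitch repo.
    """
    return [p for p in paths if not Path(p).is_file()]


# Paths as given in this thread, relative to the notebook directory.
fastp = '../pretrained_models/nvidia_fastpitch_200518.pt'
waveg = '../pretrained_models/nvidia_waveglow256pyt_fp16.pt'

for path in missing_checkpoints([fastp, waveg]):
    print(f"checkpoint not found: {path}")
```

If nothing is printed, both files are where the notebook expects them. This does not verify that the checkpoint itself is a converged model.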
Thanks @alancucki I managed to get it working with the pretrained models.
I also trained a tacotron model followed by a fastpitch model on my own dataset (>50 hrs). The speech from both tacotron and fastpitch (1500 epochs each) is audible; however, several words in each sentence have poor quality. Any ideas on how I can improve this?
On a side note, I used espnet on the same custom dataset before, and their tacotron model gives much better results. However, I want to add the customizations offered by fastpitch :)
Great to hear that!
What kind of quality issues do you experience? Slurred speech points to poor alignments, mispronunciation to poor grapheme generalization, and small artifacts might come from WaveGlow.
We're about to update FastPitch with an aligning mechanism that relieves it from using Tacotron 2, and delivers better quality. Stay tuned!
I think it's the case of poor alignments.
Any idea when we can expect the new feature? Also, would phoneme-based training be possible too?
Yes, we plan to support both phonemes and graphemes. This should be on-line by the end of June.
That's great, @alancucki. Have you checked the inference performance for longer texts, such as paragraphs with more than 3K characters? And is CPU inference supported?
CPU inference is supported by inference.py (just skip the --cuda flag).
For longer paragraphs, it's better to split them by sentence -- training data is limited by duration due to memory constraints, and the model is unlikely to generalize to longer inputs.
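Splitting by sentence could look roughly like the sketch below. It is a naive illustration, not something from the repo: `split_sentences` is a hypothetical helper that breaks after `.`, `!` or `?` followed by whitespace, so abbreviations such as "Dr." will be split incorrectly.

```python
import re


def split_sentences(text):
    """Naively split text into sentences.

    Hypothetical helper for illustration: splits on whitespace that
    follows '.', '!' or '?'. It does not handle abbreviations or
    decimal numbers written with trailing punctuation.
    """
    parts = re.split(r'(?<=[.!?])\s+', text.strip())
    return [p for p in parts if p]


paragraph = "The first sentence is short. Is the second one a question? Yes!"
for sentence in split_sentences(paragraph):
    print(sentence)
```

Each resulting sentence would then be synthesized on its own and the audio concatenated afterwards; that part is not shown here.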
Hello, do you have any eta for the update? I'm eager and very excited to take this for a spin when it's ready! The built-in alignment mechanism sounds amazing.
Hey @DanRuta , the model and the recipe for LJ are ready - now we're just updating performance measurements. Stay tuned :)
@alancucki did you ever get around to finishing the phoneme and grapheme work? If so, when can we expect it to come out? And will it be considered FastPitch 2?
Hi, the notebook demo for fastpitch is not working, the audio generated is purely noise. Can you please check?
Thank you