dmylzenova closed this issue 7 months ago
I found in another discussion that you use YAAPT. I suppose you simply divide the values by 100, is that right?
Thanks for your interest!
Please refer to this issue for YAAPT details!
https://github.com/sh-lee-prml/HierSpeechpp/issues/4#issuecomment-1833299461
Also, before F0 is fed to the model, we normalize it on a log scale:
f0 = torch.log(f0+1)
The target value is also the log-normalized F0! (We do not divide it by 100!)
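For anyone fine-tuning and trying to match this value range, here is a minimal sketch of the log(f0 + 1) normalization and its inverse. It is not taken from the repository: the function names `normalize_f0` and `denormalize_f0` and the sample values are hypothetical, and it uses the standard `math` module on plain lists instead of `torch` so that it runs without extra dependencies.

```python
import math

def normalize_f0(f0_hz):
    """Apply log(f0 + 1) to each F0 value in Hz (hypothetical helper)."""
    # log1p(x) computes log(1 + x); a 0 Hz (unvoiced) frame stays at 0.
    return [math.log1p(v) for v in f0_hz]

def denormalize_f0(f0_log):
    """Invert the normalization: exp(x) - 1 gives back Hz (hypothetical helper)."""
    return [math.expm1(v) for v in f0_log]

# Made-up example contour: one unvoiced frame and two voiced frames.
f0_hz = [0.0, 110.0, 220.0]
f0_log = normalize_f0(f0_hz)
f0_back = denormalize_f0(f0_log)

print(f0_log)
print(f0_back)
```

With tensors, the same forward step is the `torch.log(f0+1)` line quoted above, and the predicted values can be mapped back to Hz by exponentiating and subtracting 1.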
Oh my bad, thank you!
Hello! Thank you for a great model! I am trying to fine-tune your model. The predicted pitch values have quite an uncommon range. Could you please share which library you used for pitch detection? Thank you!