sh-lee-prml / HierSpeechpp

The official implementation of HierSpeech++
MIT License

Pitch detection #8

Closed: dmylzenova closed this issue 7 months ago

dmylzenova commented 7 months ago

Hello! Thank you for a great model! I am trying to fine-tune it. The predicted pitch values fall in a rather uncommon range. Could you please share which library you used for pitch detection? Thank you!

dmylzenova commented 7 months ago

I found in another discussion that you use YAAPT. I suppose you simply divide the values by 100, is that right?

sh-lee-prml commented 7 months ago

Thanks for your interest!

Please refer to this issue for YAAPT details!

https://github.com/sh-lee-prml/HierSpeechpp/issues/4#issuecomment-1833299461

Also, before the F0 is fed to the model, we normalize it on a log scale:

f0 = torch.log(f0+1)

The target value is also the log-normalized F0! (We do not divide it by 100!)
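
For readers who want to check the value range, here is a minimal sketch of this normalization and its inverse. It is an illustration only, not code from the repository: the helper names `normalize_f0` and `denormalize_f0` are hypothetical, plain Python `math` stands in for `torch`, and the sample F0 values (with 0.0 assumed to mark an unvoiced frame) are made up.

```python
import math

def normalize_f0(f0_hz):
    """Log-normalize F0 values given in Hz, mirroring torch.log(f0 + 1)."""
    return [math.log(f + 1.0) for f in f0_hz]

def denormalize_f0(log_f0):
    """Invert the normalization: exp(x) - 1 maps back to Hz."""
    return [math.exp(x) - 1.0 for x in log_f0]

# Made-up example values in Hz; 0.0 is assumed to mark an unvoiced frame.
f0_hz = [0.0, 100.0, 220.0, 440.0]

log_f0 = normalize_f0(f0_hz)
restored = denormalize_f0(log_f0)

for hz, lf in zip(f0_hz, log_f0):
    print(f"{hz:6.1f} Hz -> {lf:.3f}")
```

With this mapping, 100 Hz becomes roughly 4.6 rather than 1.0, and a 0 Hz frame stays at exactly 0, which may explain why the predicted range looks unusual at first glance.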

dmylzenova commented 7 months ago

Oh my bad, thank you!