Rudrabha / Lip2Wav

This repository contains the code for our CVPR 2020 paper, "Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis".
MIT License
692 stars, 152 forks

Difference between mel.npz and ref.npz #23

Closed (joannahong closed this issue 3 years ago)

joannahong commented 3 years ago

Thanks for such great work! I am wondering why you use the encoded mel spectrogram (ref.npz), produced by a pretrained model, rather than using mel.npz directly. Is that because ref.npz contains more speaker info? Thank you!

prajwalkr commented 3 years ago

Is that because ref.npz contains more speaker info?

Yes.