Open wangrui5781 opened 6 years ago
I am not the author of this code, but this is my understanding.
Why are the skip connections used in WaveNet discarded here? Because the Parallel WaveNet paper specifies it for the student: "The student network consisted of the same WaveNet architecture layout, except with different inputs and outputs and no skip connections." (Parallel WaveNet: Fast High-Fidelity Speech Synthesis)
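To illustrate the difference, here is a toy numpy sketch (not the repo's actual code; `residual_block` is a hypothetical stand-in for a gated dilated conv layer). In the teacher WaveNet each block emits a skip tensor that is summed across layers and post-processed; in the student IAF the skips are dropped and only the residual path is kept.

```python
import numpy as np

rng = np.random.default_rng(0)

def residual_block(x, w):
    # Stand-in for one gated dilated conv layer: returns
    # (residual output, skip output).
    h = np.tanh(x * w)
    return x + h, h

x = rng.standard_normal(8)
layer_weights = rng.standard_normal(3)

# Teacher-style: accumulate skip outputs across all layers.
skips = np.zeros_like(x)
h = x
for w in layer_weights:
    h, s = residual_block(h, w)
    skips += s
teacher_features = skips  # later passed through ReLU / 1x1 convs

# Student-style (no skip connections): keep only the final residual output.
h = x
for w in layer_weights:
    h, _ = residual_block(h, w)
student_features = h
```

The two layouts produce different features from the same stack, which is the architectural change the quote describes.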
There is a local-conditioning convolution applied to the mel spectrogram between the IAF and the last ReLU layer. What does it mean? It is just a different implementation choice; I don't know whether it is important or not (I think it is not).
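A minimal sketch of what such a conditioning step could look like, under my assumptions (all shapes and names here are hypothetical, not the repo's code): a 1x1 convolution over channels is just a matrix multiply, projecting the upsampled mel frames into the hidden channel space before they are added and passed through the ReLU.

```python
import numpy as np

rng = np.random.default_rng(1)
T, n_mels, channels = 16, 4, 8

hidden = rng.standard_normal((channels, T))       # output of the IAF stack
mel = rng.standard_normal((n_mels, T))            # mel upsampled to one frame per sample
w_cond = rng.standard_normal((channels, n_mels))  # 1x1 conv == matmul over channels

# Project the condition, add it to the hidden features, then apply the ReLU.
conditioned = np.maximum(hidden + w_cond @ mel, 0.0)
```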
Parallel WaveNet generates the output x from mu_tot and s_tot, in contrast to ClariNet, which regards the n-th sample z as the output. What do you think about it? My understanding is that DeepMind's Parallel WaveNet needs a significant amount of sampling to estimate the KL loss, so it makes sense that a student sample is drawn from mu_tot and s_tot. ClariNet computes the KL loss in closed form, so the outputs of the last IAF flow can be used directly as student samples.
Thanks for your reply. I have corrected my mistake according to your answer. I read ClariNet before Parallel WaveNet, so I did not notice the differences. That was not rigorous scholarship on my part. However, both models generate noisy audio, at least worse than the teacher. Looking at the STFT, I find that the student cannot learn the high-frequency distribution. Any ideas to improve the model?
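One way to quantify that STFT observation (a toy diagnostic sketch, not from the repo; the signals and band edge here are made up) is to compare average spectral power above some cutoff between two waveforms:

```python
import numpy as np

rng = np.random.default_rng(3)
sr, n_fft, hop = 16000, 512, 128

def high_band_power(x, lo_bin):
    # Framed Hann-windowed FFT; mean power in bins at or above lo_bin.
    frames = np.lib.stride_tricks.sliding_window_view(x, n_fft)[::hop]
    spec = np.abs(np.fft.rfft(frames * np.hanning(n_fft), axis=-1)) ** 2
    return spec[:, lo_bin:].mean()

t = np.arange(sr) / sr
# Stand-ins: "teacher-like" audio with real high-frequency content vs a
# "student-like" signal that replaced it with low-level noise.
teacher_like = np.sin(2 * np.pi * 200 * t) + 0.3 * np.sin(2 * np.pi * 6000 * t)
student_like = np.sin(2 * np.pi * 200 * t) + 0.05 * rng.standard_normal(sr)

lo_bin = int(4000 / (sr / n_fft))  # bins above ~4 kHz
print(high_band_power(teacher_like, lo_bin), high_band_power(student_like, lo_bin))
```

Tracking this kind of band-limited power gap over training makes "cannot learn the high-frequency distribution" a measurable claim rather than a visual impression.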
Hi, thank you for sharing this code. I found some differences compared to the WaveNet paper. 1. Why do you discard the skip connections in parallel WaveNet, which are used in WaveNet?