ivanvovk / WaveGrad

Implementation of WaveGrad high-fidelity vocoder from Google Brain in PyTorch.
BSD 3-Clause "New" or "Revised" License

The order of upsampling_dilations #22

Open junjun3518 opened 3 years ago

junjun3518 commented 3 years ago

Hi! My name is Junhyeok Lee and I appreciate your work! I think I may have found a slight mistake in your config file. https://github.com/ivanvovk/WaveGrad/blob/721c37c216132a2ef0a16adc38439f993998e0b7/configs/default.json#L6-L12 In WaveGrad Appendix A, they mention: "The dilation factors of four convolutional layers are 1, 2, 4, 8 for the first three UBlocks and 1, 2, 1, 2 for the rest". Since they listed kernel sizes starting from the block closest to x (5, 5, 3, 2, 2), it seems that upsampling_dilations should be [[1,2,4,8]*3, [1,2,1,2]*2], i.e. three [1,2,4,8] entries followed by two [1,2,1,2] entries. Could you confirm this?
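For illustration, here is a minimal Python sketch of the ordering proposed above. The helper name `build_upsampling_dilations` and its parameters are hypothetical and not part of the repository; the factor list (5, 5, 3, 2, 2) is taken from the comment and assumed to be ordered from the block closest to x. Note that the shorthand in the comment is not literal Python: `[1,2,4,8]*3` would concatenate the list with itself rather than produce three separate entries.

```python
# Hypothetical helper (not from the repository): builds one dilation list
# per UBlock, following the ordering quoted from WaveGrad Appendix A.
def build_upsampling_dilations(n_wide=3, n_narrow=2):
    wide = [1, 2, 4, 8]    # dilations for the first three UBlocks
    narrow = [1, 2, 1, 2]  # dilations for the remaining UBlocks
    # Copy each list so the entries are independent objects.
    return [list(wide) for _ in range(n_wide)] + [list(narrow) for _ in range(n_narrow)]


# Factors as listed in the comment, assumed to start from the block closest to x.
factors = [5, 5, 3, 2, 2]
upsampling_dilations = build_upsampling_dilations()

# Each UBlock needs exactly one dilation list.
assert len(upsampling_dilations) == len(factors)

for factor, dilations in zip(factors, upsampling_dilations):
    print(f"factor {factor}: dilations {dilations}")
```

If this reading is right, the corresponding config value would list the three [1, 2, 4, 8] entries first and the two [1, 2, 1, 2] entries last.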

ivanvovk commented 3 years ago

@junjun3518 Thank you very much. I checked it quickly, and it seems you are right. I am very sorry for that bug. I am planning a major update of this repo to improve generation quality, and this will be fixed as part of it.