auspicious3000 / SpeechSplit

Unsupervised Speech Decomposition Via Triple Information Bottleneck
http://arxiv.org/abs/2004.11284
MIT License
636 stars 92 forks

about Downsample Factor #59

yangdongchao closed 2 years ago

yangdongchao commented 2 years ago

Another problem is that your downsample factor is set to 8. If the input mel-spectrogram has shape (501, 80), the original length of 501 frames cannot be recovered after downsampling and upsampling, because 501 is not a multiple of 8. Thanks for your time; I look forward to your reply.

auspicious3000 commented 2 years ago

Just pad it to the nearest multiple of 8.
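
For readers landing on this thread, here is a minimal sketch of what that padding could look like for the (501, 80) example above. It is not taken from the SpeechSplit code: the helper name `pad_to_multiple`, the use of zero padding at the end of the time axis, and the step of trimming back to the original length afterwards are all assumptions made for illustration.

```python
import numpy as np


def pad_to_multiple(mel, factor=8):
    """Zero-pad the time axis (axis 0) of a (frames, n_mels) array
    up to the next multiple of `factor`.

    Hypothetical helper for illustration; not part of SpeechSplit.
    Returns the padded array and the original number of frames.
    """
    n_frames = mel.shape[0]
    # Frames needed to reach the next multiple; 0 if already aligned.
    pad_len = (-n_frames) % factor
    padded = np.pad(mel, ((0, pad_len), (0, 0)), mode="constant")
    return padded, n_frames


# Dummy mel-spectrogram with the shape from the question.
mel = np.random.rand(501, 80).astype(np.float32)

padded, orig_len = pad_to_multiple(mel, factor=8)
print(padded.shape)  # (504, 80)

# Assumption: after the model runs on the padded input, the output
# can be cut back to the original length to drop the padded frames.
restored = padded[:orig_len]
print(restored.shape)  # (501, 80)
```

With a factor of 8, 501 frames are padded by 3 to 504, which divides evenly. Whether zeros are the right padding value depends on how the spectrogram is normalized, so treat that choice as a placeholder.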