facebookresearch / AudioDec

An Open-source Streaming High-fidelity Neural Audio Codec

Some questions about CausalConvTranspose1d in conv_layer.py #11

Closed: Jacksonroad closed this issue 9 months ago

Jacksonroad commented 11 months ago

Hello, thanks for the useful code. I don't fully understand the CausalConvTranspose1d class. Why does it use nn.ReplicationPad1d for the streaming pad, unlike CausalConv1d, which pads with constant zeros? Also, in CausalConvTranspose1d, self.pad_length is always 1 regardless of kernel_size, whereas in CausalConv1d self.pad_length depends on kernel_size. Is self.pad_length unrelated to kernel_size in CausalConvTranspose1d? Should self.pad_length stay unchanged when we change any of the layer's parameters?

Jacksonroad commented 11 months ago

@likethesky @Celebio @colesbury @pdollar

bigpon commented 11 months ago

Hi, thanks for the question. You are right: pad_length should depend on both kernel_size and stride. Because we fix the ratio kernel_size/stride = 2, we can fix pad_length to 1.
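A minimal sketch of why a fixed ratio of 2 leads to one frame of context, assuming an unpadded transposed convolution in which input frame i writes to output positions i*stride through i*stride + kernel_size - 1. The helper name max_inputs_per_output is hypothetical and is not part of the AudioDec code; it only counts, by brute force, how many input frames overlap a single output sample.

```python
def max_inputs_per_output(kernel_size: int, stride: int, num_inputs: int = 16) -> int:
    """Largest number of input frames that overlap any one output sample.

    Hypothetical helper for illustration only; assumes an unpadded
    transposed convolution with dilation 1.
    """
    out_len = (num_inputs - 1) * stride + kernel_size
    counts = [0] * out_len
    for i in range(num_inputs):
        # Input frame i contributes to kernel_size consecutive outputs.
        for k in range(kernel_size):
            counts[i * stride + k] += 1
    return max(counts)


for kernel_size, stride in [(4, 2), (10, 5), (16, 8)]:
    overlap = max_inputs_per_output(kernel_size, stride)
    # One of the overlapping frames is the current one; the rest are past context.
    print(kernel_size, stride, overlap, overlap - 1)
```

Under these assumptions, every setting with kernel_size/stride = 2 has at most two overlapping input frames per output sample, so a streaming layer needs one previous frame as context.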

We hard-coded it because we followed the implementation in the ParallelWaveGAN repo. However, it would be better to make it flexible for arbitrary kernel_size and stride settings. I may rewrite it later.

More details can be found in the following discussion:
https://github.com/kan-bayashi/ParallelWaveGAN/pull/326
https://github.com/kan-bayashi/ParallelWaveGAN/commit/25c4b9a02b21ef1d464e61101ec6ce41014dbaa2

bigpon commented 10 months ago

The self.pad_length of CausalConvTranspose1d has been updated to "(math.ceil(kernel_size/stride) - 1)" for arbitrary kernel_size and stride settings.
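For reference, the expression quoted above can be evaluated on its own. This is a small sketch, not the repository's code; the function name causal_deconv_pad_length is hypothetical, and only the formula itself comes from the comment above.

```python
import math


def causal_deconv_pad_length(kernel_size: int, stride: int) -> int:
    """Evaluate math.ceil(kernel_size / stride) - 1 (hypothetical wrapper)."""
    return math.ceil(kernel_size / stride) - 1


# A few illustrative settings; the first two keep kernel_size/stride = 2.
for kernel_size, stride in [(4, 2), (16, 8), (7, 2), (3, 3)]:
    print(kernel_size, stride, causal_deconv_pad_length(kernel_size, stride))
```

With kernel_size/stride = 2 the expression gives 1, matching the previously hard-coded value; other ratios give other values, for example 3 for kernel_size = 7 and stride = 2.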