A reshape problem in InvConvNear

shahuzi commented 3 years ago

Hi, @jaywalnut310. I'm trying to understand glow-tts by reading the code, and I'm a little confused about this piece of code in InvConvNear:

https://github.com/jaywalnut310/glow-tts/blob/13e997689d643410f5d9f1f9a73877ae85e19bc2/modules.py#L214-L215

So if the purpose is to reshape the input x from [b, c, t] to [b, self.n_split, c // self.n_split, t], what is the purpose of L214?

shahuzi commented 3 years ago

I now understand that the purpose of https://github.com/jaywalnut310/glow-tts/blob/13e997689d643410f5d9f1f9a73877ae85e19bc2/modules.py#L214 is some kind of channel shuffle, but I still don't understand why this step is required.

bear-boy commented 12 months ago

I know the purpose of

https://github.com/jaywalnut310/glow-tts/blob/13e997689d643410f5d9f1f9a73877ae85e19bc2/modules.py#L214

is some kind of channel shuffle now, but I still don't understand why this step is required.

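Yes, this is a channel shuffle operation. As for why the channel shuffle is needed, the author's paper mentions it: "To allow channel mixing in each group, the same number of channels are extracted from one half of the feature map separated by coupling layers and the other half, respectively." Specifically, the invertible 1x1 conv in glow-tts is implemented as a grouped convolution, so it loses the channel-mixing role that the invertible 1x1 conv plays in the original Glow model. The channel shuffle therefore has to be done "manually" here: the part that the affine coupling layer leaves unchanged and the part that it transforms are regrouped along the channel dimension, which partially restores what the invertible 1x1 conv does in the original Glow.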
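To make the regrouping concrete, here is a minimal sketch in plain Python (no PyTorch needed) that lists which original channel indices end up in the same group. It assumes the linked lines first view x as [b, 2, c // n_split, n_split // 2, t] and then swap the last two channel axes before reshaping to [b, n_split, c // n_split, t]. That layout is my reading of the linked code, not something stated in this thread. The helper names `shuffled_groups` and `contiguous_groups` are hypothetical and only for illustration.

```python
def shuffled_groups(c, n_split):
    """Channel indices mixed together by the grouped 1x1 conv.

    Assumes the layout [2, c // n_split, n_split // 2] along the channel
    axis (an assumption about the linked code, see the note above).
    """
    assert c % n_split == 0 and n_split % 2 == 0
    half = c // 2              # channels per coupling-layer half
    per_half = n_split // 2    # channels each group takes from one half
    groups = []
    for g in range(c // n_split):
        # Original index = h * half + g * per_half + s
        group = [h * half + g * per_half + s
                 for h in range(2)
                 for s in range(per_half)]
        groups.append(group)
    return groups


def contiguous_groups(c, n_split):
    """Baseline for comparison: consecutive blocks of n_split channels."""
    return [list(range(g * n_split, (g + 1) * n_split))
            for g in range(c // n_split)]


if __name__ == "__main__":
    print(shuffled_groups(8, 4))    # [[0, 1, 4, 5], [2, 3, 6, 7]]
    print(contiguous_groups(8, 4))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```

With c = 8 and n_split = 4, each shuffled group holds two channels from the first half (0 to 3) and two from the second half (4 to 7), so every small 1x1 conv mixes channels from both sides of the coupling split. Grouping consecutive channels instead would keep the two halves apart in this example.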

jaywalnut310 / glow-tts

A reshape problem in InvConvNear #46