Open shahuzi opened 3 years ago
I now understand that the purpose of https://github.com/jaywalnut310/glow-tts/blob/13e997689d643410f5d9f1f9a73877ae85e19bc2/modules.py#L214 is some kind of channel shuffle, but I still do not understand why this step is required.
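Right, this is a channel shuffle operation. As for why channel shuffle is needed, the author's paper mentions it: "To allow channel mixing in each group, the same number of channels are extracted from one half of the feature map separated by coupling layers and the other half, respectively." Specifically, the invertible 1x1 conv in glow-tts is implemented as a grouped convolution, so it loses the channel-shuffle role that the invertible 1x1 conv plays in the original Glow model. The channel shuffle therefore has to be done "manually" here: the part that the affine coupling layer leaves unchanged and the part that it transforms are regrouped along the channel dimension, which partially restores the function of the invertible 1x1 conv in the original Glow.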
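To make the regrouping concrete, here is a minimal sketch in NumPy rather than PyTorch. It assumes the two lines in question reshape the channel axis to (2, c // n_split, n_split // 2), swap the last two of those axes, and flatten to (n_split, c // n_split), with the n_split x n_split weight then mixing along the first axis. The function names `mixing_groups` and `contiguous_groups` are hypothetical and exist only for this illustration; the batch and time axes are omitted.

```python
import numpy as np

def mixing_groups(c, n_split):
    """List, for each group, the original channel indices that get mixed together."""
    assert c % n_split == 0 and n_split % 2 == 0
    x = np.arange(c)  # stand-in for the channel axis only
    # Assumed equivalent of the view on L214: split into two halves first.
    x = x.reshape(2, c // n_split, n_split // 2)
    # Assumed equivalent of the permute + view on L215.
    x = x.transpose(0, 2, 1).reshape(n_split, c // n_split)
    # The weight mixes along axis 0, so each column is one mixing group.
    return [sorted(x[:, i].tolist()) for i in range(c // n_split)]

def contiguous_groups(c, n_split):
    """Hypothetical baseline: groups made of contiguous channel blocks."""
    return [list(range(g * n_split, (g + 1) * n_split))
            for g in range(c // n_split)]

print(mixing_groups(8, 4))      # [[0, 1, 4, 5], [2, 3, 6, 7]]
print(contiguous_groups(8, 4))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```

Under these assumptions, with c = 8 and n_split = 4 each mixing group takes two channels from the first half (0 to 3) and two from the second half (4 to 7), whereas the contiguous baseline keeps every group inside a single half.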
Hi, @jaywalnut310. I'm trying to understand glow-tts by reading the code, and I am a little confused about this piece of code in InvConvNear.
https://github.com/jaywalnut310/glow-tts/blob/13e997689d643410f5d9f1f9a73877ae85e19bc2/modules.py#L214-L215
So if the purpose is to reshape the input x from [b, c, t] to [b, self.n_split, c // self.n_split, t], what's the purpose of L214?