Order of powers of x makes depthwise separable convolution behave poorly

Since x = cat([(x ** i) for i in range(1, self.q + 1)], dim=1) stacks the newly created powers of x along the channel dimension, depthwise separable convolution behaves poorly: each filter passes over a mixture of channels and powers. For example, with C_in=2 and q=3, the channel axis is ordered x₁, x₂, x₁², x₂², x₁³, x₂³.

With groups=2 in the depthwise convolution, the first filter passes over x₁, x₂, x₁², and the second filter passes over x₂², x₁³, x₂³.
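To make the grouping concrete, here is a small pure-Python sketch (no PyTorch needed) that labels the channels in the order the cat call produces and splits them into contiguous blocks, which is how a grouped convolution partitions its input channels. The helper names power_labels and split_into_groups are made up for this illustration and are not part of fastonn.

```python
def power_labels(in_channels, q):
    """Channel labels in the order produced by cat([x ** i for i in 1..q], dim=1)."""
    return [f"x{c}^{p}" for p in range(1, q + 1) for c in range(1, in_channels + 1)]


def split_into_groups(channels, groups):
    """Contiguous, equal-sized channel blocks, as a grouped convolution splits its input."""
    size = len(channels) // groups
    return [channels[g * size:(g + 1) * size] for g in range(groups)]


labels = power_labels(in_channels=2, q=3)
blocks = split_into_groups(labels, groups=2)
for g, block in enumerate(blocks):
    # Each block mixes powers of both input channels.
    print(f"group {g}: {block}")
```

Both groups contain terms from x₁ and x₂, which is the mixing described above.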

I fixed this issue by rearranging the output of this line using fancy indexing:

    permutation = [(i * self.in_channels) % (self.in_channels * self.q) + i // self.q for i in range(self.in_channels * self.q)]
    x = cat([(x ** i) for i in range(1, self.q + 1)], dim=1)[:, permutation]

The channel axis now looks like x₁, x₁², x₁³, x₂, x₂², x₂³, so the filters pass over the powers of x₁ and x₂ individually.
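As a sanity check on the permutation, the sketch below applies the same index formula to string labels instead of tensors and compares the result with the intended channel-major order for a few shapes. It is only an illustration: the function names are hypothetical, and plain list indexing stands in for the tensor indexing [:, permutation].

```python
def power_major(in_channels, q):
    """Label order produced by cat([x ** i ...], dim=1): all channels for power 1, then power 2, ..."""
    return [f"x{c}^{p}" for p in range(1, q + 1) for c in range(1, in_channels + 1)]


def channel_major(in_channels, q):
    """Desired order: all powers of channel 1, then all powers of channel 2, ..."""
    return [f"x{c}^{p}" for c in range(1, in_channels + 1) for p in range(1, q + 1)]


def permutation(in_channels, q):
    """Same formula as in the fix above, with self.in_channels and self.q passed as arguments."""
    n = in_channels * q
    return [(i * in_channels) % n + i // q for i in range(n)]


perm = permutation(2, 3)
reordered = [power_major(2, 3)[j] for j in perm]
print(perm)
print(reordered)

# The formula is not specific to C_in=2, q=3; check a few other shapes.
for c_in, q in [(1, 4), (3, 2), (4, 5)]:
    src = power_major(c_in, q)
    assert [src[j] for j in permutation(c_in, q)] == channel_major(c_in, q)
```

Since (i * C) % (C * q) equals C * (i % q), the formula is equivalent to C * (i % q) + i // q: position i in the new order takes power index i % q of channel i // q. Each filter sees the powers of exactly one channel when groups equals in_channels, as in the C_in=2, groups=2 example.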

https://github.com/junaidmalik09/fastonn/blob/9591a31f3960927e1e715d10df46b54b385cd94d/fastonn/SelfONN.py#L276

junaidmalik09/fastonn, issue #8