Closed: purefunctor closed this issue 7 months ago
Hello!
Thanks for the issue, and for the exploration of the "groups" functionality. I'll just add the description that I found in the Conv1D documentation:
groups controls the connections between inputs and outputs. in_channels and out_channels must both be divisible by groups. For example,
- At groups=1, all inputs are convolved to all outputs.
- At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels and producing half the output channels, and both subsequently concatenated.
- At groups=in_channels, each input channel is convolved with its own set of filters (of size out_channels/in_channels).
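To make the groups=2 case above concrete, here is a minimal pure-Python sketch (not code from any library). The helper names `conv1d` and `grouped_conv1d` and the example numbers are made up for illustration. It assumes no bias, no padding, stride 1, and the PyTorch-style weight layout `[out_channels][in_channels // groups][kernel_size]`.

```python
def conv1d(x, w):
    """Plain 1D convolution (cross-correlation), no bias, no padding.

    x: list of in_channels rows, each a list of samples.
    w: w[o][i][k] is kernel tap k connecting input channel i to output channel o.
    """
    kernel_size = len(w[0][0])
    out_len = len(x[0]) - kernel_size + 1
    return [
        [
            sum(w_o[i][k] * x[i][t + k]
                for i in range(len(x))
                for k in range(kernel_size))
            for t in range(out_len)
        ]
        for w_o in w
    ]


def grouped_conv1d(x, w, groups):
    """Grouped convolution written as a loop over groups.

    w has shape [out_channels][in_channels // groups][kernel_size].
    """
    in_per_group = len(x) // groups
    out_per_group = len(w) // groups
    y = []
    for g in range(groups):
        x_g = x[g * in_per_group:(g + 1) * in_per_group]
        w_g = w[g * out_per_group:(g + 1) * out_per_group]
        y.extend(conv1d(x_g, w_g))  # concatenate along the channel axis
    return y


# 4 input channels, 4 output channels, kernel size 2, groups=2 (made-up values)
x = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
w = [[[1, 0], [0, 1]],
     [[1, 1], [1, 1]],
     [[2, 0], [0, 0]],
     [[0, 0], [0, -1]]]

y = grouped_conv1d(x, w, groups=2)
# Two independent half-size convolutions, concatenated along the channel axis
side_by_side = conv1d(x[:2], w[:2]) + conv1d(x[2:], w[2:])
```

Here `y` and `side_by_side` come out identical, which is the "two conv layers side by side" reading of groups=2.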
I think I have a rough sense of how to implement this, but it may be a few weeks until I have time to really sit down and work on it. With that in mind, I guess I'll list out my workflow for how I would probably approach the problem, and then if you (or someone else) would like to take a shot at it, that would be cool!
forward() method to give the correct output. This is where having the test case with your reference PyTorch model is very helpful. My gut feeling is that you'll need to put this loop inside a loop over the number of groups.
groups = 1
Anyway, hopefully this is helpful in case you or anyone wants to tackle this. If not, just give it a little time while I finish up some other things, and I can come back to this. If you do start working on it, feel free to message me with any questions or intermediate progress updates!
That's indeed really helpful, yup!
I'm trying to understand what the "state" and "state_cols" are in the convolutions, do you have any pointers on that?
Another good way to think about grouped convolutions is that each group has its own convolution. As in, if I had a 9in->9out convolution with 3 groups, I'd have 3 separate 3in->3out convolutions whose outputs get concatenated.
Very cool! For state_cols the idea is that it's a "helper" variable to store only the columns of the state that will be multiplied by the weights (they're not guaranteed to be contiguous depending on the dilation rate). See how the state_cols are set here.
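As a rough illustration of that idea (not the library's actual code), here is a pure-Python sketch of gathering non-contiguous columns from a circular state buffer. The function name `gather_state_cols`, the `write_pos` parameter, the buffer layout, and the values are all assumptions made for the example.

```python
def gather_state_cols(state, write_pos, kernel_size, dilation):
    """Pick the columns of a circular state buffer that the kernel taps see.

    state: list of columns, each column holding one time step of all channels.
    write_pos: index of the most recently written column.
    Tap k reads the column written k * dilation steps ago.
    """
    n = len(state)
    return [state[(write_pos - k * dilation) % n] for k in range(kernel_size)]


# Hypothetical buffer of 5 time steps with 2 channels each.
# Written in time order at indices 2, 3, 4, 0, 1, so index 1 is the newest.
state = [[40, 41], [50, 51], [10, 11], [20, 21], [30, 31]]
cols = gather_state_cols(state, write_pos=1, kernel_size=3, dilation=2)
```

With a dilation of 2, the taps land on every other time step (the columns starting with 50, 30, and 10), which sit at buffer indices 1, 4, and 2. That is why copying them into a contiguous helper can be convenient before multiplying by the weights.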
Hi, this is a really awesome project!
I'm trying to port a model that makes use of 1D convolutions, but the first thing I ran into was that the Conv1D layer didn't have a parameter for groups, as present in PyTorch/TensorFlow. Learning resources on low-level NN programming are a little terse, but I'd like to tackle this!
My (high-level) understanding of it goes something along the lines of:
Which gives:
In the case of x, with groups=1, each input channel has its own 1x1 kernel. This is reflected in the weights:
For each output channel, there are 9 1x1 kernels, one per input channel.
Meanwhile, for the case of y, with groups=3, it has the following weights:
For each output channel, there are 3 1x1 kernels, one per input channel in that output channel's group.
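The kernel counts above follow from the weight layout PyTorch documents for Conv1d, `(out_channels, in_channels / groups, kernel_size)`. A small sketch of that arithmetic, where the helper `conv1d_weight_shape` is hypothetical and the 9-in/9-out, kernel-size-1 configuration is an assumption based on the numbers in this thread:

```python
def conv1d_weight_shape(in_channels, out_channels, kernel_size, groups=1):
    """Weight tensor shape for a grouped 1D convolution (PyTorch layout)."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    return (out_channels, in_channels // groups, kernel_size)


# Assumed configuration: 9 input channels, 9 output channels, 1x1 kernels
shape_x = conv1d_weight_shape(9, 9, 1, groups=1)  # 9 kernels per output channel
shape_y = conv1d_weight_shape(9, 9, 1, groups=3)  # 3 kernels per output channel
```

The middle dimension is the number of kernels each output channel owns, so going from groups=1 to groups=3 cuts the parameter count by a factor of 3.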
An intuitive way I've found to see how this works is:
and this yields:
The result of y(i) yields significantly less "energy" than x(i), as each output channel now has fewer kernels to work with: 3 instead of 9.
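One way to see the "energy" difference in isolation is to fix every weight and every input sample to 1, so each output is simply the number of kernels feeding it. This is a sketch under those assumptions and not necessarily the setup used in the snippet above. The function `pointwise_grouped_conv` is a made-up helper for 1x1 kernels with equal input and output channel counts.

```python
def pointwise_grouped_conv(x, groups, weight=1.0):
    """1x1 grouped convolution with every weight set to the same value, no bias.

    x: per-channel samples at a single time step; out_channels == in_channels.
    """
    channels = len(x)
    per_group = channels // groups
    out = []
    for o in range(channels):
        g = o // per_group          # group this output channel belongs to
        start = g * per_group
        out.append(sum(weight * x[i] for i in range(start, start + per_group)))
    return out


i = [1.0] * 9                                  # one time step, 9 channels
x_out = pointwise_grouped_conv(i, groups=1)    # each output sums 9 inputs
y_out = pointwise_grouped_conv(i, groups=3)    # each output sums 3 inputs
```

Under these assumptions every entry of `x_out` is 9.0 and every entry of `y_out` is 3.0, matching the "3 instead of 9" observation.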