Closed foreverlms closed 1 year ago
F(2, 1) means the kernel-size is 1, a point-wise convolution, which is not really convolution at all. Point-wise convolution is mathematically equivalent to matrix multiplication. There is no fast-convolution algorithm for it.
Hope that helps!
F(2, 1) means the kernel-size is 1, a point-wise convolution, which is not really convolution at all. Point-wise convolution is mathematically equivalent to matrix multiplication. There is no fast-convolution algorithm for it.
Hope that helps!
Thanks for your Reply, dear author. So I will consider using gemm directly for point-wise. I want to ask you for another question about the Fast covolution algorithm paper:
In the paper, the scattered weight matrix will be right prdoucted by the scattered Input matrix: $$ M = UV$$
But for some implementations and a picture of visualizing the Winograd:
It seems that:
$$ M = VU $$.
Am I wrong about this comprehension?
Thank you sir.
Nice graphic! Did you draw it yourself?
If we write M = UV, then U must have shape K x C and V must have shape C x P (where P is the number of tiles), and M has shape K x P. I like to write that as KC,CP->KP
If you want to write M = VU instead, then the shapes of M, V, and U are all transposed from the above shapes. I would write it as PC,CK->PK.
I don't think the diagram implies a particular tensor order.
Nice graphic! Did you draw it yourself?
If we write M = UV, then U must have shape K x C and V must have shape C x P (where P is the number of tiles), and M has shape K x P. I like to write that as KC,CP->KP
If you want to write M = VU instead, then the shapes of M, V, and U are all transposed from the above shapes. I would write it as PC,CK->PK.
I don't think the diagram implies a particular tensor order.
No, I didn't draw this diagram. It is cited from this paper: https://arxiv.org/pdf/1810.01973.pdf. So because the element-wise multiplication, You can write M =UV or M = VU. All you need to do is to organize the data.
(AB)' = B'A' where ' means matrix-transpose. This is a basic fact of matrix multiplication. It is a good idea to make sure you have mastered the basics of linear algebra. I learned it from "Calculus vol2" by Tom Apostol. There might be better books available now.
Hi, with an example:
For this paper: https://arxiv.org/pdf/2002.00552.pdf, it describes decomposable winorgad. So when kernel is 7x7, stride = 1, we can divide the convolution procedure to kernels: 4 3x3 kernel, 2 3x1 kernel, 2 1x3 kernel, 1 1x1 kernel and then apply winograd separately. But how to generate the A,G,B of F(2x2, 1x3), F(2x2,1x3), F(2x2, 1x1)? I know it could be nested. For F(2x2, 1x3), F(2,3) could be nested in outer F(2,1),so it is the problem: What's the A,G,B of F(2,1)?
Thank you.