mit-han-lab / torchsparse

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.
https://torchsparse.mit.edu
MIT License

[Feature Request] Support for group convolution #149

Open digital-idiot opened 2 years ago

digital-idiot commented 2 years ago

Grouped convolution is well supported in PyTorch's convolution layers / ops. If possible, it would be great to add that capability to torchsparse.

zhijian-liu commented 2 years ago

Thanks for bringing this up! The reason we do not support grouped convolution is that it does not offer much speedup for sparse workloads: sparse convolution is memory-bound rather than compute-bound. That said, I think supporting it is still meaningful, and we would probably need additional optimization for it.

ruanych commented 2 years ago

I'm also interested in support for grouped operations.

In lightweight network designs such as MobileNet, depthwise separable convolutions (i.e., setting the number of groups equal to the number of input channels) substantially reduce the number of parameters.

For a kernel of size K×K (2D) or K×K×K (3D) and C channels (assuming equal input and output channels), the parameter counts are:

| | 2D | 3D |
| --- | --- | --- |
| convolution | K×K×C×C | K×K×K×C×C |
| depthwise separable convolution | K×K×C + 1×1×C×C | K×K×K×C + 1×1×1×C×C |
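For reference, here is a quick dense-PyTorch sanity check of the 3D column of the table above (the layer names and sizes are just for illustration):

```python
import torch.nn as nn

C, K = 64, 3  # channels and kernel size used for the comparison

# Standard dense 3D convolution: K*K*K*C*C weights.
full = nn.Conv3d(C, C, kernel_size=K, padding=K // 2, bias=False)

# Depthwise separable variant: a depthwise conv (groups=C, K*K*K*C weights)
# followed by a pointwise 1x1x1 conv (C*C weights).
depthwise = nn.Conv3d(C, C, kernel_size=K, padding=K // 2, groups=C, bias=False)
pointwise = nn.Conv3d(C, C, kernel_size=1, bias=False)

def n_params(*modules):
    return sum(p.numel() for m in modules for p in m.parameters())

print(n_params(full))                  # 3*3*3*64*64 = 110,592
print(n_params(depthwise, pointwise))  # 3*3*3*64 + 64*64 = 5,824
```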

Increasing the kernel size may also shift the bottleneck away from IO and back toward computation. RepLKNet made an attempt in this direction: https://arxiv.org/abs/2203.06717.

zhijian-liu commented 2 years ago

Thanks for providing the model-size perspective! We will take that into consideration.

hontrn9122 commented 2 months ago

Is there any update on grouped sparse convolution? I am trying to build some Capsule layers using 3D depthwise convolution.
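In the meantime, a rough workaround is to emulate grouped (or depthwise, with groups equal to the channel count) sparse convolution by splitting the feature channels and running one spnn.Conv3d per group. This is only a sketch, not an official torchsparse API: the SparseTensor constructor and attribute names (.feats, .coords, .stride) follow torchsparse 2.x and may differ in other versions, and because each group rebuilds its own SparseTensor, cached kernel maps are not shared, so it reproduces the grouped-convolution arithmetic without any speed or memory benefit.

```python
import torch
import torch.nn as nn
import torchsparse.nn as spnn
from torchsparse import SparseTensor


class NaiveGroupedSparseConv3d(nn.Module):
    """Emulates grouped sparse convolution with one spnn.Conv3d per group."""

    def __init__(self, in_channels, out_channels, kernel_size, groups):
        super().__init__()
        assert in_channels % groups == 0 and out_channels % groups == 0
        self.groups = groups
        self.convs = nn.ModuleList([
            spnn.Conv3d(in_channels // groups, out_channels // groups, kernel_size)
            for _ in range(groups)
        ])

    def forward(self, x: SparseTensor) -> SparseTensor:
        # Split the feature channels; every group shares the same coordinates.
        feats = torch.chunk(x.feats, self.groups, dim=1)
        outs = []
        for conv, f in zip(self.convs, feats):
            # Rebuilding a SparseTensor per group discards cached kernel maps,
            # so the maps are recomputed for every group (slow, but functional).
            outs.append(conv(SparseTensor(f, x.coords, stride=x.stride)))
        # With stride 1 the output coordinates match the input coordinates,
        # so the per-group outputs can simply be concatenated channel-wise.
        out_feats = torch.cat([o.feats for o in outs], dim=1)
        return SparseTensor(out_feats, outs[0].coords, stride=outs[0].stride)
```

For depthwise behavior (as in MobileNet-style blocks), set groups equal to in_channels and follow it with a pointwise spnn.Conv3d(in_channels, out_channels, 1).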

yxchng commented 1 month ago

Any updates on this?