mit-han-lab / torchsparse

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.
https://torchsparse.mit.edu
MIT License

Can sparse convolution benefit from the pre-existing weights of dense convolution? #293

Closed · ZzTodd22 closed this issue 3 months ago

ZzTodd22 commented 8 months ago

Thank you for sharing your work! I have a question: I converted an identical model from dense convolution to sparse convolution and carefully loaded the weights of every layer after the conversion, but I have not observed a significant improvement. Do the weights of the original dense convolution still carry over and affect the performance of the sparse convolution? In other words, can sparse convolution benefit from pre-existing dense-convolution weights? I look forward to your insights.
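
For context, this is roughly how I copied the weights for each layer. The `kernel` attribute name, its `(kernel_volume, C_in, C_out)` layout, and the offset ordering are my assumptions and may differ across TorchSparse versions:

```python
import torch
import torch.nn as nn
import torchsparse.nn as spnn

# Dense layer holding the pre-trained weights and its sparse counterpart.
dense_conv = nn.Conv3d(16, 32, kernel_size=3, bias=False)
sparse_conv = spnn.Conv3d(16, 32, kernel_size=3, bias=False)

with torch.no_grad():
    # Dense Conv3d weight layout: (C_out, C_in, kD, kH, kW).
    w = dense_conv.weight
    # Assumed sparse kernel layout: (kernel_volume, C_in, C_out).
    # The ordering of the kernel_volume offsets is also an assumption and
    # may need to be permuted to match TorchSparse's enumeration.
    sparse_conv.kernel.copy_(
        w.permute(2, 3, 4, 1, 0).reshape(sparse_conv.kernel.shape))
```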

ys-2020 commented 8 months ago

Hi @ZzTodd22! Thanks for your interest in TorchSparse! This is a very good question!

We do have unit tests demonstrating that sparse convolution can be mapped to dense convolution in a layer-wise comparison. However, transferring dense pre-trained weights to a sparse model can be more challenging, since the properties of sparse workloads may be very different. While the question is really interesting, we think the answer remains open for further exploration.
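
As a rough illustration (not a test we ship), a layer-wise sanity check could look like the sketch below. It assumes a recent TorchSparse API with `(batch, x, y, z)` integer coordinates, a kernel of shape `(kernel_volume, C_in, C_out)`, and a CUDA device; the kernel-offset ordering in particular may need adjustment for the numbers to match exactly.

```python
import torch
import torch.nn as nn
import torchsparse.nn as spnn
from torchsparse import SparseTensor

# A small, fully-occupied 8x8x8 grid: with every site active, a stride-1
# sparse convolution should reduce to a zero-padded dense convolution.
D = 8
xyz = torch.stack(torch.meshgrid(
    torch.arange(D), torch.arange(D), torch.arange(D), indexing="ij"
), dim=-1).reshape(-1, 3).int()
batch = torch.zeros(xyz.shape[0], 1, dtype=torch.int32)
# Assumed coordinate layout: (batch, x, y, z); older versions use (x, y, z, batch).
coords = torch.cat([batch, xyz], dim=1).cuda()
feats = torch.randn(coords.shape[0], 4).cuda()

sparse_conv = spnn.Conv3d(4, 8, kernel_size=3, bias=False).cuda()
sp_out = sparse_conv(SparseTensor(feats=feats, coords=coords))

# Dense reference with the same weights, zero padding, on the same grid.
dense_conv = nn.Conv3d(4, 8, kernel_size=3, padding=1, bias=False).cuda()
with torch.no_grad():
    # Assumed mapping from the sparse kernel (kernel_volume, C_in, C_out)
    # back to the dense layout (C_out, C_in, kD, kH, kW).
    dense_conv.weight.copy_(
        sparse_conv.kernel.reshape(3, 3, 3, 4, 8).permute(4, 3, 0, 1, 2))

dense_in = torch.zeros(1, 4, D, D, D).cuda()
x, y, z = coords[:, 1].long(), coords[:, 2].long(), coords[:, 3].long()
dense_in[0, :, x, y, z] = feats.t()
dense_out = dense_conv(dense_in)

# Max difference at occupied voxels; should be near zero if the layout
# assumptions above hold for your TorchSparse version.
print((dense_out[0, :, x, y, z].t() - sp_out.feats).abs().max())
```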

Best regards, Shang