spcl / sten

Sparsity support for PyTorch

Possibility to sparsify convolutional layers (torch.nn.Conv2d)? #2

Open SunHaozhe opened 1 year ago

SunHaozhe commented 1 year ago

I would like to turn an existing convolutional neural network (ResNet) into a sparse model. This model contains torch.nn.Conv2d but not torch.nn.Linear. Does sten support this operation?

I checked the tutorial notebook examples/modify_existing.ipynb. It seems that sten can only sparsify torch.nn.Linear and cannot sparsify torch.nn.Conv2d. Is that the case?

and-ivanov commented 1 year ago

sten can help to sparsify any operator, but you need to provide a custom implementation for it, as shown in examples/custom_implementations.ipynb. The choice of the actual implementation depends on the sparse format you want to use and on the architecture (GPU or CPU).

SunHaozhe commented 1 year ago

Since torch.nn.Conv2d is a standard layer/operator in modern neural networks, does sten have a plan to officially include a sparsified version of torch.nn.Conv2d?

I see that the actual implementation could depend on the choice of sparse format and architecture (GPU or CPU), but I believe providing a default implementation of a sparsified torch.nn.Conv2d could make sten more impactful. Users may just want to test out the effect of sparsifying torch.nn.Conv2d on ordinary CPUs and/or GPUs.

and-ivanov commented 1 year ago

We currently propose to use sparse convolution, implemented through multiplication by a dense matrix. The last time I checked available libraries, there were no implementations of sparse convolution that had a significant performance improvement over dense. If you can suggest any libraries, we can include support for them.

SunHaozhe commented 1 year ago

Sorry, could you please clarify what you mean by "sparse convolution, implemented through multiplication by a dense matrix"? Is there any existing implementation of what you are describing here?

and-ivanov commented 1 year ago

I mean first do an element-by-element multiplication of the input tensor and/or filter tensor by a mask tensor of zeros and ones. Then use a dense convolution as usual.
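
Roughly, something like the following minimal PyTorch sketch (the 50% magnitude-based mask here is just a placeholder for whatever sparsifier you use; the shapes are made up):

```python
import torch
import torch.nn.functional as F

x = torch.randn(8, 3, 32, 32)      # input tensor (batch of images)
weight = torch.randn(16, 3, 3, 3)  # dense Conv2d filter tensor

# Mask tensor of zeros and ones; here a made-up ~50% magnitude-based mask.
mask = (weight.abs() >= weight.abs().median()).float()

# "Sparse" convolution: zero out masked filter entries element-wise,
# then run the usual dense convolution.
y = F.conv2d(x, weight * mask, padding=1)
```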

SunHaozhe commented 1 year ago

> I mean first do an element-by-element multiplication of the input tensor and/or filter tensor by a mask tensor of zeros and ones. Then use a dense convolution as usual.

On what kinds of hardware would this implementation provide real speedup (in your opinion)?

and-ivanov commented 1 year ago

This implementation is not supposed to give any speedup; quite the opposite, it will be slower than the non-sparse version. However, it may still be useful, for example to evaluate how much accuracy is preserved after sparsifying the model.
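
As a rough sketch of such an evaluation (assuming a torchvision ResNet and a made-up ~50% magnitude mask; this is plain PyTorch, not part of sten):

```python
import torch
import torchvision

model = torchvision.models.resnet18(weights="IMAGENET1K_V1")

# Zero out roughly the smallest 50% of weights (by magnitude) in every Conv2d layer.
with torch.no_grad():
    for module in model.modules():
        if isinstance(module, torch.nn.Conv2d):
            w = module.weight
            module.weight.mul_((w.abs() >= w.abs().median()).float())

# Then run your usual validation loop on the masked model to see how much
# accuracy survives the sparsification.
model.eval()
```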