Implement Approximate Depthwise Convolution Kernels

etrommer / torch-approx

GPU-accelerated Neural Network layers using Approximate Multiplications for PyTorch

https://etrommer.de/torch-approx

MIT License


Closed by etrommer 2 years ago

etrommer commented 2 years ago

Benchmarking has shown that Im2Col + ApproxGeMM is extremely slow for depthwise-separable convolutions.

This should be addressed by adding dedicated approximate depthwise convolution (DWConv) operators.
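
For reference, here is a minimal pure-Python sketch of what such a dedicated operator would compute. It is not the torch-approx implementation or API: the names `build_lut`, `approx_dwconv2d` and `truncating_mul` are hypothetical, and it assumes the approximate multiplier is available as a 256x256 lookup table over signed 8-bit operands, with stride 1 and no padding. The point is that each channel is filtered only by its own kernel, so the products can be accumulated directly without building an Im2Col matrix first.

```python
def build_lut(mul):
    """Tabulate a signed 8-bit multiplier so that lut[a + 128][b + 128] == mul(a, b)."""
    return [[mul(a, b) for b in range(-128, 128)] for a in range(-128, 128)]


def approx_dwconv2d(x, w, lut):
    """Depthwise 2D convolution (stride 1, no padding) using LUT-based products.

    x:   input indexed as x[c][i][j], int8 values, shape C x H x W
    w:   filters indexed as w[c][ki][kj], int8 values, shape C x KH x KW
    lut: 256 x 256 table as produced by build_lut (an assumption of this sketch)
    """
    out = []
    # Each input channel is paired with exactly one filter; nothing is summed
    # across channels, which is what distinguishes depthwise convolution.
    for xc, wc in zip(x, w):
        height, width = len(xc), len(xc[0])
        kh, kw = len(wc), len(wc[0])
        plane = []
        for i in range(height - kh + 1):
            row = []
            for j in range(width - kw + 1):
                acc = 0
                for ki in range(kh):
                    for kj in range(kw):
                        a = xc[i + ki][j + kj]
                        b = wc[ki][kj]
                        # Table lookup replaces the exact product a * b.
                        acc += lut[a + 128][b + 128]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out


def truncating_mul(a, b):
    """Hypothetical approximate multiplier: clears the lowest bit of each operand."""
    return (a & ~1) * (b & ~1)
```

Passing `build_lut(lambda a, b: a * b)` reproduces an exact depthwise convolution, which gives a simple baseline for checking a GPU kernel; passing `build_lut(truncating_mul)` swaps in the illustrative approximate multiplier without touching the convolution loop.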