fast-convolutions Search Results

1000+ results for fast-convolutions

microsoft/DirectML #243

DirectML 1.8.2 depthwise convolution slow?

Hello, I love DirectML but I'm experiencing slow depthwise convolutions (with group count equal to channel count) with DirectML 1.8.2 compared to regular convolutions, when in fact the depthwise co…

momower1 updated 2 years ago
2
pytorch/functorch #365

Optimize convolution batching rule performance

On CUDA, when the convolution batching rule uses group convolutions, this sometimes ends up being slower than we expect on older hardware. This is probably because PyTorch's group convolution calls th…

zou3519 updated 2 years ago
7
kornia/kornia #1731

FFT backend for Filter2d

## 🚀 Feature An FFT backend for `kornia.filters.filter2d`. ## Motivation `kornia.filters.filter2d` is very slow for large kernels as it currently only performs convolutions in the spatial domai…

oliland updated 2 years ago
5
naibaf7/libdnn #25

Status of libdnn as of April 2018

Your library is pretty cool, but looks like it was not updated for a long period of time. At the same time, the version of libdnn in your Caffe fork seems to be more maintained and even got some ne…

romix updated 6 years ago
3
guoyww/AnimateDiff #308

⭐⭐⭐⭐⭐ Claude 3 - architectural redesign POC suggestion 100X …

https://paperswithcode.com/method/depthwise-separable-convolution#:~:text=While%20standard%20convolution%20performs%20the,a%20linear%20combination%20of%20the current setup ![Screenshot from 2024-0…

johndpope updated 7 months ago
1
clij/clij-custom-convolution-plugin #2

wiki feedback

I propose to adjust the [wiki](https://github.com/clij/clij-custom-convolution-plugin/wiki) a bit: > That trick works because real space convolution is very memory and compute expensive: You have t…

psteinb updated 5 years ago
2
Plonky3/Plonky3 #272

Optimize MDS Vectors for faster MDS convolutional layers

Currently, the MDS layers of size `8, 12, 16, 24, 32, 64` are implemented by doing a convolution with an MDS vector (meaning a vector whose associated circulant matrix is MDS) `v` of appropriate size.…

SyxtonPrime updated 5 months ago
2
JuliaGPU/GPUArrays.jl #102

Convolutions and Pooling

See [here](https://github.com/JuliaImages/ImageFiltering.jl/issues/52) for a description of the operations involved. I expect that pooling will be relatively easy, convolutions are also straightfor…

MikeInnes updated 4 years ago
3
speechbrain/speechbrain #2733

Investigate custom Triton kernels for depthwise-separable co…

### Describe the bug A decent chunk of time in the Conformer model at training time is spent in the convolution module. Of that, a decent chunk is in the depthwise convolution, which sets `groups` to…

asumagic updated 1 week ago
2
ROCm/MIOpen #2250

Update IsApplicable() for fused winograd convolution

The current implementation of fused Winograd convolution uses a very limited subset of the underlying kernel implementation. Current limitations are: - only 2x3 version - no dilation support - no grouped …

CAHEK7 updated 1 year ago
1
