BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

CUDNN doesn't support the dilated convolution #4961

Open thesby opened 7 years ago

thesby commented 7 years ago

cuDNN doesn't support dilated convolution. I'm trying to use the cuDNN engine to save GPU memory. Is it possible to fill the dilated kernel with zeros at the positions that shouldn't take part in the convolution?

naibaf7 commented 7 years ago

Yes, but depending on the amount of fill, performance will drop sharply and memory consumption will go up, and you need to change some code to do so. LibDNN from the OpenCL branch (which can also be used with CUDA) can do dilated convolution in any dimension without additional memory. It is not as fast as cuDNN for convolutions without dilation, though. But you can mix the two and change the engine on a per-layer basis.
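To make the zero-fill idea concrete, here is a minimal NumPy sketch (not Caffe or cuDNN code). It checks that a dilated kernel gives the same result as a dense kernel with zeros inserted between the taps, and shows how much larger the filled kernel gets. The helper names `zero_fill_kernel` and `correlate2d_valid` are made up for this illustration, and the loop is a naive single-channel reference, not how any engine implements it.

```python
import numpy as np


def zero_fill_kernel(kernel, dilation):
    """Insert (dilation - 1) zeros between the taps of a 2-D kernel."""
    k_h, k_w = kernel.shape
    eff_h = dilation * (k_h - 1) + 1  # effective (dilated) kernel height
    eff_w = dilation * (k_w - 1) + 1  # effective (dilated) kernel width
    filled = np.zeros((eff_h, eff_w), dtype=kernel.dtype)
    filled[::dilation, ::dilation] = kernel
    return filled


def correlate2d_valid(image, kernel, dilation=1):
    """Naive 'valid' cross-correlation with optional dilation (reference only)."""
    k_h, k_w = kernel.shape
    eff_h = dilation * (k_h - 1) + 1
    eff_w = dilation * (k_w - 1) + 1
    out_h = image.shape[0] - eff_h + 1
    out_w = image.shape[1] - eff_w + 1
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(out_h):
        for j in range(out_w):
            # Sample the input every `dilation` pixels under the kernel.
            patch = image[i:i + eff_h:dilation, j:j + eff_w:dilation]
            out[i, j] = np.sum(patch * kernel)
    return out


rng = np.random.default_rng(0)
image = rng.standard_normal((12, 12))
kernel = rng.standard_normal((3, 3))
dilation = 2

filled = zero_fill_kernel(kernel, dilation)
dilated_out = correlate2d_valid(image, kernel, dilation=dilation)
dense_out = correlate2d_valid(image, filled, dilation=1)

# Ratio of stored weights: filled dense kernel vs. original kernel.
fill_ratio = filled.size / kernel.size
```

With a 3x3 kernel and dilation 2, the filled kernel is 5x5, so a dense engine would store and multiply 25 weights instead of 9. The ratio grows quadratically with the dilation, which is the cost described above.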

thesby commented 7 years ago

But I need cuDNN to reduce GPU memory use. So for now it's impossible, short of waiting for a cuDNN update.

naibaf7 commented 7 years ago

@thesby LibDNN uses the same amount of memory as cuDNN or less, depending on the algorithm. Only the speed is lower. So it's not impossible :)

thesby commented 7 years ago

@naibaf7 OK, thank you. But I found that LibDNN hasn't been updated for a long time. Is there any problem if I mix two conv engines (CAFFE and CUDNN) across the conv layers in the latest Caffe?

naibaf7 commented 7 years ago

@thesby The standalone LibDNN hasn't been updated for a while; you have to use the Caffe OpenCL branch for the latest LibDNN. You can mix two engines in one network, one engine per layer. Caffe's own dilated convolutions will be both too slow and too memory-expensive. Use LibDNN layers for your dilated convolutions, Caffe layers for inner products and cuDNN layers for convolutions with regular filters.
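As a rough illustration of the per-layer choice described above, the following Python sketch generates a convolution layer definition with an `engine` field picked from the dilation. It is only a sketch under assumptions: `choose_engine` and `conv_layer_prototxt` are hypothetical helpers, and `LIBDNN` is assumed to be the engine name exposed by the Caffe OpenCL branch, so check the `caffe.proto` of the build you actually use.

```python
def choose_engine(layer_type, dilation=1):
    """Pick an engine name following the advice in the comment above.

    Returns None for layer types this sketch leaves to plain Caffe.
    'LIBDNN' is an assumed enum name; verify it against your caffe.proto.
    """
    if layer_type == "Convolution":
        return "LIBDNN" if dilation > 1 else "CUDNN"
    return None  # e.g. InnerProduct: no engine set here


def conv_layer_prototxt(name, bottom, top, num_output, kernel_size, dilation=1):
    """Return prototxt text for one stride-1 convolution layer."""
    engine = choose_engine("Convolution", dilation)
    # Padding that keeps the spatial size for odd kernels at stride 1.
    pad = dilation * (kernel_size - 1) // 2
    lines = [
        "layer {",
        f'  name: "{name}"',
        '  type: "Convolution"',
        f'  bottom: "{bottom}"',
        f'  top: "{top}"',
        "  convolution_param {",
        f"    num_output: {num_output}",
        f"    kernel_size: {kernel_size}",
        f"    dilation: {dilation}",
        f"    pad: {pad}",
        "    stride: 1",
        f"    engine: {engine}",
        "  }",
        "}",
    ]
    return "\n".join(lines)


# Hypothetical layer names, chosen only for the example.
dilated_layer = conv_layer_prototxt("conv_dilated", "data", "conv_dilated", 512, 3, dilation=2)
regular_layer = conv_layer_prototxt("conv_regular", "data", "conv_regular", 512, 3)
print(dilated_layer)
print(regular_layer)
```

The point is only that the engine is a per-layer setting inside `convolution_param`, so dilated and regular convolutions in the same network can use different back ends.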

thesby commented 7 years ago

@naibaf7 Thank you. I will try it.

shelhamer commented 7 years ago

Note that a future cuDNN release is expected to include dilated convolution.

KeyKy commented 7 years ago

I use the following layer in training. My problem is that training is fast but testing is too slow. Why? Thanks!

layer {
  bottom: "res5a_branch2a"
  top: "res5a_branch2b"
  name: "res5a_branch2b"
  type: "Convolution"
  convolution_param {
    num_output: 512
    kernel_size: 3
    dilation: 2
    pad: 2
    stride: 1
    bias_term: false
  }
}
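For reference, the parameters of the layer above can be checked with a short Python sketch of the usual output-size formula for a dilated convolution. The function name `conv_output_size` and the input size of 14 are assumptions made for illustration; the thread does not state the actual feature-map size.

```python
def conv_output_size(in_size, kernel_size, pad=0, stride=1, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    # A dilated kernel spans dilation * (kernel_size - 1) + 1 input pixels.
    kernel_extent = dilation * (kernel_size - 1) + 1
    return (in_size + 2 * pad - kernel_extent) // stride + 1


# kernel_size: 3, dilation: 2, pad: 2, stride: 1, on a hypothetical 14-pixel input
out_size = conv_output_size(14, kernel_size=3, pad=2, stride=1, dilation=2)
```

With these settings the kernel spans 5 pixels and a padding of 2 on each side compensates exactly, so the layer keeps the spatial size of its input.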

naibaf7 commented 7 years ago

@KeyKy What engine, device and Caffe are you using?

KeyKy commented 7 years ago

@naibaf7 Sorry, it was my counting mistake.

thesby commented 7 years ago

Is dilated convolution now supported with the latest Caffe 1.0 and the latest cuDNN?

lizmcquarrie commented 7 years ago

SpatialDilatedConvolution is in the R6 branch of cudnn.torch: https://github.com/soumith/cudnn.torch/tree/R6