BVLC / caffe

Caffe: a fast open framework for deep learning.
http://caffe.berkeleyvision.org/

Incompatible with current cudnn 8.0.3 ? #6970

Open peijason opened 3 years ago

peijason commented 3 years ago

I am trying to build Caffe 1.0.0, but the build fails against cuDNN.

System configuration

Failed with the following ERROR message:

error: ‘CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT’ was not declared in this scope
error: ‘CUDNN_CONVOLUTION_BWD_FILTER_SPECIFY_WORKSPACE_LIMIT’ was not declared in this scope

and many more errors of the same kind.

gembancud commented 3 years ago

Apparently so. I have a project study that hinges on Caffe, but the repo is stale. The Windows build works, by the way, with CUDA 11 and cuDNN 8.0.3.33.

Qengineering commented 3 years ago

I've got the same problem with Caffe and cuDNN version 8.

As of version 8, NVIDIA has dropped cudnnGetConvolutionBackwardFilterAlgorithm. The other two obsolete API calls, cudnnGetConvolutionForwardAlgorithm and cudnnGetConvolutionBackwardDataAlgorithm, have some form of replacement.

Because there is no replacement for cudnnGetConvolutionBackwardFilterAlgorithm, I've followed the strategy of the PaddlePaddle framework: the outcome is set to the constant CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1, and the workspace is set to twice the memory found earlier with cudnnGetConvolutionForwardAlgorithm.

I could request a merge into this repo, but since I am not sure the solution will work in all cases, I decided to put it in our own GitHub repo first. If it turns out to work fine, I will request the merge.

For now, please use this repo.
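
The fallback described above can be sketched roughly as follows. This is only an illustration of the idea, not the code from the patched repo: `BwdFilterAlgo`, `BwdFilterChoice`, and `FallbackBwdFilterChoice` are hypothetical stand-ins, and no real cuDNN types or calls are used.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical stand-in for cuDNN's backward-filter algorithm enum.
// Only the value named in the comment above is modeled here.
enum class BwdFilterAlgo { kAlgo1 };

// Hypothetical result type: the chosen algorithm plus its workspace size.
struct BwdFilterChoice {
  BwdFilterAlgo algo;
  std::size_t workspace_bytes;
};

// Sketch of the described fallback: instead of asking a heuristic,
// always pick ALGO_1 and reserve twice the forward workspace.
inline BwdFilterChoice FallbackBwdFilterChoice(
    std::size_t forward_workspace_bytes) {
  const std::size_t kSafetyFactor = 2;  // "twice the memory", per the comment
  // Guard against overflow when doubling very large sizes.
  assert(forward_workspace_bytes <=
         std::numeric_limits<std::size_t>::max() / kSafetyFactor);
  return BwdFilterChoice{BwdFilterAlgo::kAlgo1,
                         kSafetyFactor * forward_workspace_bytes};
}
```

The trade-off is visible in the sketch: the choice no longer depends on the layer, so the reserved workspace may be larger than a heuristic would have requested.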

astropiu commented 3 years ago


I have followed their tutorial and used their repo, but I'm having this issue

CXX src/caffe/layers/softmax_layer.cpp
src/caffe/layers/cudnn_conv_layer.cpp: In member function ‘virtual void caffe::CuDNNConvolutionLayer<Dtype>::Reshape(const std::vector<caffe::Blob<Dtype>*>&, const std::vector<caffe::Blob<Dtype>*>&)’:
src/caffe/layers/cudnn_conv_layer.cpp:300:1: error: a template declaration cannot appear at block scope
  300 | template <typename Dtype>
      | ^~~~~~~~
In file included from ./include/caffe/blob.hpp:8,
                 from ./include/caffe/layers/cudnn_conv_layer.hpp:6,
                 from src/caffe/layers/cudnn_conv_layer.cpp:5:
src/caffe/layers/cudnn_conv_layer.cpp:331:1: error: expected primary-expression before ‘template’
  331 | INSTANTIATE_CLASS(CuDNNConvolutionLayer);
      | ^~~~~~~~~~~~~~~~~
src/caffe/layers/cudnn_conv_layer.cpp:331:1: error: expected primary-expression before ‘template’
  331 | INSTANTIATE_CLASS(CuDNNConvolutionLayer);
      | ^~~~~~~~~~~~~~~~~
src/caffe/layers/cudnn_conv_layer.cpp: At global scope:
src/caffe/layers/cudnn_conv_layer.cpp:333:1: error: expected ‘}’ at end of input
  333 | } // namespace caffe
      | ^
src/caffe/layers/cudnn_conv_layer.cpp:7:17: note: to match this ‘{’
    7 | namespace caffe {
      |                 ^
make: *** [Makefile:586: .build_release/src/caffe/layers/cudnn_conv_layer.o] Error 1
make: *** Waiting for unfinished jobs....

Qengineering commented 3 years ago

First guess: you are missing a brace somewhere. The first error only occurs when a template declaration appears inside a function body, for instance when the closing brace of the preceding function is missing before the declaration starts. The later errors point in the same direction: the expected brace is missing. It is best to download the repo again.

mgomez0 commented 3 years ago

Hi @Qengineering, I am having exactly the same issue as @astropiu. I followed your instructions and also cloned the latest version of your repo. It is very strange: I inspected cudnn_conv_layer.cpp myself, and the braces seem to be fine. I'm wondering whether we should continue this discussion here or open a new issue on your repo.

src/caffe/layers/cudnn_conv_layer.cpp: In member function ‘virtual void caffe::CuDNNConvolutionLayer<Dtype>::Reshape(const std::vector<caffe::Blob<Dtype>*>&, const std::vector<caffe::Blob<Dtype>*>&)’:
src/caffe/layers/cudnn_conv_layer.cpp:300:1: error: a template declaration cannot appear at block scope
  300 | template <typename Dtype>
      | ^~~~~~~~
In file included from ./include/caffe/blob.hpp:8,
                 from ./include/caffe/layers/cudnn_conv_layer.hpp:6,
                 from src/caffe/layers/cudnn_conv_layer.cpp:5:
src/caffe/layers/cudnn_conv_layer.cpp:331:1: error: expected primary-expression before ‘template’
  331 | INSTANTIATE_CLASS(CuDNNConvolutionLayer);
      | ^~~~~~~~~~~~~~~~~
src/caffe/layers/cudnn_conv_layer.cpp:331:1: error: expected primary-expression before ‘template’
  331 | INSTANTIATE_CLASS(CuDNNConvolutionLayer);
      | ^~~~~~~~~~~~~~~~~
src/caffe/layers/cudnn_conv_layer.cpp: At global scope:
src/caffe/layers/cudnn_conv_layer.cpp:333:1: error: expected ‘}’ at end of input
  333 | } // namespace caffe
      | ^
src/caffe/layers/cudnn_conv_layer.cpp:7:17: note: to match this ‘{’
    7 | namespace caffe {
      |                 ^
make: *** [Makefile:586: .build_release/src/caffe/layers/cudnn_conv_layer.o] Error 1
make: *** Waiting for unfinished jobs....

Qengineering commented 3 years ago

@mgomez0 You are more than welcome on my repo. I will review the code now and get back to you asap.

Qengineering commented 3 years ago

Solved the problem. In cudnn_conv_layer.cpp line 235:

} 
#endif

should be

#endif
 }
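
The exact code around line 235 is not shown in this thread, so the following is only a generic illustration of why the position of a closing brace relative to #endif matters. If the brace sits inside a conditional branch that the preprocessor drops, the compiler never sees it and every later definition ends up "inside" the unfinished function, which matches the errors reported above. The macros `SKETCH_CUDNN_VERSION` and `SKETCH_CUDNN_VERSION_MIN` and the function `CountFallbackLayers` are hypothetical stand-ins, loosely modeled on the CUDNN_VERSION encoding (8003 for 8.0.3).

```cpp
#include <cassert>

// Hypothetical stand-in for CUDNN_VERSION from cudnn.h (8.0.3 -> 8003).
#define SKETCH_CUDNN_VERSION 8003
// Hypothetical helper, similar in spirit to a "version at least" check.
#define SKETCH_CUDNN_VERSION_MIN(major, minor, patch) \
  (SKETCH_CUDNN_VERSION >= ((major) * 1000 + (minor) * 100 + (patch)))

// Counts how many layers would take the fixed-fallback path.
inline int CountFallbackLayers(int num_layers) {
  assert(num_layers >= 0);
  int fallback = 0;
  for (int i = 0; i < num_layers; ++i) {
#if SKETCH_CUDNN_VERSION_MIN(8, 0, 0)
    ++fallback;  // version 8 and later: heuristic call replaced by a constant
#else
    (void)i;     // older versions: the heuristic call would go here
#endif
  }  // The loop's closing brace comes after #endif, so both branches see it.
  return fallback;
}
```

Moving that closing brace above the #endif would place it inside the #else branch, and with version 8 selected the loop would never be closed.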

borisgribkov commented 2 years ago

@Qengineering Thanks for your Caffe patch! I have applied it, but I sometimes observe strange behavior: for some models, memory usage is about twice as large as in a CUDA 10 / cuDNN 7 environment. Have you observed something like this?

Qengineering commented 2 years ago

Indeed, in certain situations the memory consumption is substantially larger than with cuDNN 7. It all has to do with the removal of cudnnGetConvolutionBackwardDataAlgorithm and cudnnGetConvolutionBackwardFilterAlgorithm in version 8. These heuristic routines look for the best balance between memory use and performance on the GPU. Since cuDNN version 8 no longer supports them, I had to generate some dummy output so that the subsequent routines still work after the heuristic step. There are only four possible outcomes, and I selected the most common one (CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1). The required amount of memory is determined at the same time. For this, I allocate twice the amount found in the previous (forward) determination; see line 169 in src/caffe/layers/cudnn_conv_layer.cpp. The factor of two is there for safety.

borisgribkov commented 2 years ago

I see, thank you!

borisgribkov commented 2 years ago

@Qengineering Thanks again for your answer! I agree about the backward pass, but as far as I can see the forward pass needs more memory too. I tried a model with a single conv layer and a (20 * 3 * 1280 * 720) input; it is the "head" of a ResNet used for a detection task. With CUDA 10 and cuDNN 7.6 I observed about 1.7 GB of usage for a forward pass, and with CUDA 11 and cuDNN 8 about 2.6 GB. This comparison may not be entirely fair, because different GPUs were used: a Titan Xp in the first case and a 3060 in the second.
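
As a rough sanity check on these figures, the size of the input blob itself can be computed directly. The sketch below assumes 32-bit floats (4 bytes per element), which the thread does not state; `BlobBytes` and `ToMiB` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

// Bytes needed for a dense N x C x H x W blob (hypothetical helper).
inline std::size_t BlobBytes(std::size_t n, std::size_t c, std::size_t h,
                             std::size_t w, std::size_t bytes_per_element) {
  assert(bytes_per_element > 0);
  return n * c * h * w * bytes_per_element;
}

// Converts a byte count to mebibytes (1 MiB = 1024 * 1024 bytes).
inline double ToMiB(std::size_t bytes) {
  return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

// Example for the shape quoted above, assuming float32 elements:
//   BlobBytes(20, 3, 720, 1280, 4) is 221184000 bytes, about 211 MiB.
```

Under that assumption the input accounts for only around 211 MiB of either total, so most of the reported usage, and most of the difference, would have to come from elsewhere, such as the layer output, the cuDNN workspace, and the CUDA context.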

Qengineering commented 2 years ago

In the forward pass, too, I had to make an educated guess about memory usage, as cudnnGetConvolutionForwardAlgorithm is also missing in cuDNN 8 (see line 141 of src/caffe/layers/cudnn_conv_layer.cpp).