vlfeat / matconvnet

MatConvNet: CNNs for MATLAB

beta23-compilation error with cudnn #715

Closed zwx8981 closed 8 years ago

zwx8981 commented 8 years ago

Need help for compilation with cudnn enabled using beta23.

My environment: Windows 10, CUDA 7.5, MATLAB R2016a, cuDNN r4, GTX 980M, VS 2013

Compilation command:

vl_compilenn('enableGpu', true, ...
             'cudaMethod', 'nvcc', ...
             'cudaRoot', 'c:\program files\nvidia gpu computing toolkit\cuda\v7.5', ...
             'enableCudnn', true, ...
             'cudnnRoot', 'G:\cudnn_r4') ;

Compilation fails when cuDNN is enabled.

The error output is:

Command "c:\program files\nvidia gpu computing toolkit\cuda\v7.5\bin\nvcc" -c "G:\MATLAB\R2016a\matconvnet-1.0-beta23\matlab\src\bits\impl\nnbilinearsampler_cudnn.cu" -DNDEBUG -DENABLE_GPU -DENABLE_CUDNN -I"G:\cudnn_r4\include" -DENABLE_DOUBLE -DSSSE3 -gencode=arch=compute_52,code=\"sm_52,compute_52\" -I"G:\MATLAB\R2016a\extern\include" -I"G:\MATLAB\R2016a\toolbox\distcomp\gpu\extern\include" -gencode=arch=compute_52,code=\"sm_52,compute_52\" -O3 -Xcompiler /MD --compiler-bindir "C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC..\VC\bin" -o "G:\MATLAB\R2016a\matconvnet-1.0-beta23\matlab\mex.build\bits\impl\nnbilinearsampler_cudnn.obj" failed.

Error in vl_compilenn (line 485) nvcc_compile(opts, srcs{i}, objfile, flags.nvcc) ;

What's the problem?

Thank you for your help.

marksunpeng commented 8 years ago

Try merging the cuDNN files into the CUDA root directory, and compile without the 'cudnnRoot', 'G:\cudnn_r4' option.

zwx8981 commented 8 years ago

@marksunpeng Hi, thank you for your advice. I tried it but it still doesn't work....

zwx8981 commented 8 years ago

It seems the problem lies in nnbilinearsampler_cudnn.cu, because some other cuDNN .cu files compile successfully (e.g. nnconv_cudnn.cu).

marksunpeng commented 8 years ago

Maybe your version of cuDNN doesn't match CUDA 7.5? Try downloading a newer version and overwriting the existing files.
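One way to check which cuDNN version a directory actually contains is to read the version macros from its cudnn.h. The following is a minimal MATLAB sketch under stated assumptions: the header path is taken from the include flag in the error output above and may differ on other machines, and the header is assumed to define `CUDNN_MAJOR` and `CUDNN_MINOR` as plain integer macros.

```matlab
% Location of the cuDNN header (assumption: <cudnnRoot>\include\cudnn.h).
headerFile = fullfile('G:\cudnn_r4', 'include', 'cudnn.h') ;

% Read the header and extract the version macros.
% Assumes lines of the form "#define CUDNN_MAJOR 5".
txt = fileread(headerFile) ;
major = regexp(txt, '#define\s+CUDNN_MAJOR\s+(\d+)', 'tokens', 'once') ;
minor = regexp(txt, '#define\s+CUDNN_MINOR\s+(\d+)', 'tokens', 'once') ;

if isempty(major) || isempty(minor)
  error('Could not find CUDNN_MAJOR/CUDNN_MINOR in %s.', headerFile) ;
end

fprintf('Found cuDNN %s.%s in %s\n', major{1}, minor{1}, headerFile) ;

% nnbilinearsampler_cudnn.cu calls the cudnnSpatialTfSampler* functions.
% Warn if the header is older than cuDNN 5.
if str2double(major{1}) < 5
  warning('cuDNN %s.%s is older than 5; beta23 may fail to compile.', ...
          major{1}, minor{1}) ;
end
```

If an older copy of cudnn.h also sits in the CUDA include directory, running the same check against that copy shows which version the compiler may actually pick up.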

zwx8981 commented 8 years ago

@marksunpeng Yes, you are right! I tried cuDNN 5.1 and the problem above no longer occurs. However, a new problem has appeared:

All .cpp and .cu files compiled successfully with cuDNN enabled, but the mex_link command at line 498 of vl_compilenn() failed. Here is the error output:

Error using mex

Creating library G:\MATLAB\R2016a\matconvnet-1.0-beta23\matlab\mex\vl_nnconv.lib and object G:\MATLAB\R2016a\matconvnet-1.0-beta23\matlab\mex\vl_nnconv.exp

nnbilinearsampler_cudnn.obj : error LNK2019: unresolved external symbol cudnnCreateSpatialTransformerDescriptor referenced in function "public: static enum vl::ErrorCode __cdecl vl::impl::nnbilinearsampler_cudnn<1>::backward(class vl::Context &,class vl::Tensor,class vl::Tensor,class vl::Tensor,class vl::Tensor,class vl::Tensor)" (?backward@?$nnbilinearsampler_cudnn@$00@impl@vl@@SA?AW4ErrorCode@3@AEAVContext@3@VTensor@3@1111@Z)

nnbilinearsampler_cudnn.obj : error LNK2019: unresolved external symbol cudnnSetSpatialTransformerNdDescriptor referenced in function "public: static enum vl::ErrorCode __cdecl vl::impl::nnbilinearsampler_cudnn<1>::backward(class vl::Context &,class vl::Tensor,class vl::Tensor,class vl::Tensor,class vl::Tensor,class vl::Tensor)" (?backward@?$nnbilinearsampler_cudnn@$00@impl@vl@@SA?AW4ErrorCode@3@AEAVContext@3@VTensor@3@1111@Z)

nnbilinearsampler_cudnn.obj : error LNK2019: unresolved external symbol cudnnDestroySpatialTransformerDescriptor referenced in function "public: static enum vl::ErrorCode __cdecl vl::impl::nnbilinearsampler_cudnn<1>::backward(class vl::Context &,class vl::Tensor,class vl::Tensor,class vl::Tensor,class vl::Tensor,class vl::Tensor)" (?backward@?$nnbilinearsampler_cudnn@$00@impl@vl@@SA?AW4ErrorCode@3@AEAVContext@3@VTensor@3@1111@Z)

nnbilinearsampler_cudnn.obj : error LNK2019: unresolved external symbol cudnnSpatialTfSamplerForward referenced in function "public: static enum vl::ErrorCode __cdecl vl::impl::nnbilinearsampler_cudnn<1>::forward(class vl::Context &,class vl::Tensor,class vl::Tensor,class vl::Tensor)" (?forward@?$nnbilinearsampler_cudnn@$00@impl@vl@@SA?AW4ErrorCode@3@AEAVContext@3@VTensor@3@11@Z)

nnbilinearsampler_cudnn.obj : error LNK2019: unresolved external symbol cudnnSpatialTfSamplerBackward referenced in function "public: static enum vl::ErrorCode __cdecl vl::impl::nnbilinearsampler_cudnn<1>::backward(class vl::Context &,class vl::Tensor,class vl::Tensor,class vl::Tensor,class vl::Tensor,class vl::Tensor)" (?backward@?$nnbilinearsampler_cudnn@$00@impl@vl@@SA?AW4ErrorCode@3@AEAVContext@3@VTensor@3@1111@Z)

G:\MATLAB\R2016a\matconvnet-1.0-beta23\matlab\mex\vl_nnconv.mexw64 : fatal error LNK1120: 5 unresolved externals

Error in vl_compilenn>mex_link (line 547) mex(mopts{:}) ;

Error in vl_compilenn (line 498) mex_link(opts, objs, mex_dir, flags.mexlink) ;

zwx8981 commented 8 years ago

@lenck

zwx8981 commented 8 years ago

Problem solved by copying the new version of cuDNN into the CUDA 7.5 directory. Conclusion:

  1. cuDNN 5 or 5.1 is needed.
  2. Don't forget to copy the new cuDNN files into the CUDA directory.
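
The two steps above can be sketched in MATLAB as follows. This is only an illustration under assumptions: `G:\cudnn_v5` is a hypothetical folder where the cuDNN 5.1 archive was unpacked with `bin`, `include` and `lib\x64` subfolders, writing into the CUDA directory usually requires administrator rights, and the thread does not say which `cudnnRoot` value (if any) was used for the final successful build.

```matlab
% cudaRoot is the path used in this thread; cudnnDir is a hypothetical
% folder holding the unpacked cuDNN 5.1 archive.
cudaRoot = 'c:\program files\nvidia gpu computing toolkit\cuda\v7.5' ;
cudnnDir = 'G:\cudnn_v5' ;

% Copy the cuDNN DLL, header and import library over any older copies.
% Assumes the bin / include / lib\x64 layout of the Windows archive.
subdirs = {'bin', 'include', fullfile('lib', 'x64')} ;
for i = 1:numel(subdirs)
  src = fullfile(cudnnDir, subdirs{i}, '*') ;
  dst = fullfile(cudaRoot, subdirs{i}) ;
  [ok, msg] = copyfile(src, dst, 'f') ;
  if ~ok
    error('Copying %s to %s failed: %s', src, dst, msg) ;
  end
end

% Recompile. Pointing cudnnRoot at the CUDA directory is an assumption;
% the thread does not state the value used in the end.
vl_compilenn('enableGpu', true, ...
             'cudaMethod', 'nvcc', ...
             'cudaRoot', cudaRoot, ...
             'enableCudnn', true, ...
             'cudnnRoot', cudaRoot) ;
```

Overwriting the files in place means the header and the library found at link time come from the same cuDNN release, which is the mismatch the unresolved `cudnnSpatialTf*` symbols point to.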