pluskid / Mocha.jl

Deep Learning framework for Julia

CuDNN v5 fails #198

Closed davidparks21 closed 6 years ago

davidparks21 commented 8 years ago

I installed the latest CUDA libraries, including cudnn64_5.dll, on Windows 10. I added the library names to cudnn.jl and cublas.jl as necessary and ran into a bug in the ccall(...) in cudnn.jl.

Error: Bad Param (CUDNN_STATUS_BAD_PARAM specifically)

A few basic print statements in the @cudnncall macro show that the error is in the call to cudnnSetFilter4dDescriptor.

Swapping back to cuDNN v4 seems to work. I'd love to come back to this and figure out how to fix it for v5, but for the moment I'll document it here. If I get the chance, I'll post more.

I would also like to suggest 4 minor @assert statements that would have helped during my debugging. When the system can't find the CUDA libraries, the runtime error is something like: can't load library "", which is cryptic and points you to the wrong place. find_library returns an empty string when it doesn't find the DLLs (a common issue for new users), and the assert statements shown below fail with a message that points the user to the right place for troubleshooting.

Around line 37 in cudnn.jl

@windows? (
begin
  const libcudnn = Libdl.find_library(["cudnn64_70.dll", "cudnn64_65.dll", "cudnn32_70.dll", "cudnn32_65.dll", "cudnn64_4.dll"], [""])
  @assert (libcudnn != "") "Could not find a CUDA neural net DLL [cudnn64_70.dll, cudnn64_65.dll, cudnn32_70.dll, cudnn32_65.dll, cudnn64_4.dll]. See: http://mochajl.readthedocs.io/en/latest/user-guide/backend.html#cuda-backend"
end
: # linux or mac
begin
  const libcudnn = Libdl.find_library(["libcudnn"], [""])
  @assert (libcudnn != "") "Could not find CUDA neural net DLL [libcudnn]. See http://mochajl.readthedocs.io/en/latest/user-guide/backend.html#cuda-backend"
end)

Around line 37 in cublas.jl

@windows? (
begin
  const libcublas = Libdl.find_library(["cublas64_70.dll", "cublas64_65.dll", "cublas32_70.dll", "cublas32_65.dll", "cublas64_75.dll"], [""])
  @assert (libcublas != "") "Could not find CUDA DLL [cublas64_70.dll, cublas64_65.dll, cublas32_70.dll, cublas32_65.dll, cublas64_75.dll]. See: http://mochajl.readthedocs.io/en/latest/user-guide/backend.html#cuda-backend"
end
: # linux or mac
begin
  const libcublas = Libdl.find_library(["libcublas"], [""])
  @assert (libcublas != "") "Could not find CUDA DLL [libcublas]. See http://mochajl.readthedocs.io/en/latest/user-guide/backend.html#cuda-backend"
end)
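
The same check can also be factored into one small helper so that cudnn.jl and cublas.jl share it. This is only a minimal sketch for current Julia versions: require_library and its description argument are hypothetical names, not part of Mocha.jl. It calls error rather than @assert because the Julia documentation notes that asserts may be disabled at some optimization levels.

```julia
using Libdl

# Hypothetical helper, not part of Mocha.jl: resolve a shared library by name
# and raise a descriptive error when none of the candidates can be loaded.
function require_library(names::Vector{String}, description::String)
    # Libdl.find_library returns "" when no candidate is found.
    lib = Libdl.find_library(names)
    if isempty(lib)
        error("Could not find $description; tried: ", join(names, ", "),
              ". See http://mochajl.readthedocs.io/en/latest/user-guide/backend.html#cuda-backend")
    end
    return lib
end

# Example usage (library names taken from the snippets above):
# const libcudnn = require_library(["cudnn64_4.dll"], "the cuDNN library")
```

With a helper like this, a missing DLL fails at load time with the list of names that were tried, instead of surfacing later as can't load library "".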

p.s. Beautiful work on this framework, I'm really really impressed with it!

pluskid commented 8 years ago

@davidparks21 cuDNN v5 changed a lot, and unfortunately cuDNN is quite famous for its backward incompatibility. The situation is worse for us because we call the library functions directly without using a C header file, so a changed function signature is not caught until the call fails at runtime. Some labor is needed to make it work for cuDNN v5, but I have not found time to do it recently.

I will incorporate your asserts into the code. Thanks!

phiber1 commented 7 years ago

It's almost a year later... Any hope we'll ever see cuDNN v5 support?

pluskid commented 6 years ago

cuDNN v5.1 and CUDA 8 are now supported (finally).