denizyuret / Knet.jl

Koç University deep learning framework.
https://denizyuret.github.io/Knet.jl/latest

Latest GPU Installation, Pkg.test("Knet") Failure: lib not defined #593

Open radonnachie opened 4 years ago

radonnachie commented 4 years ago

With Julia v1.5, CUDA#master, Knet#master, CUDA Toolkit v11.0.3, GeForce Driver v451.82, Windows 10.

Before I detail the errors, I should say that I had faced the lib not defined error before 1.3.6 was released, and have read as many of @denizyuret's replies to similar (but now outdated) issues as I could find.

I know that the lib not defined error means that Knet was not able to locate the CUDA installation. I have the environment variable CUDA_PATH set to point at the CUDA installation. This satisfies CUDA, but its documentation mentions that the variable's name could also be CUDA_HOME. I wonder if that name is what Knet requires.
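For reference, here is the quick check I ran to see which of the two variables is actually set. This is just a diagnostic sketch: the variable names come from the CUDA docs, and the function name is my own.

```julia
# Diagnostic sketch: report which CUDA-related environment variables are set.
# Checks both names, since the docs say either may be used.
function cuda_env_vars(env = ENV)
    vars = ("CUDA_PATH", "CUDA_HOME")
    return Dict(v => env[v] for v in vars if haskey(env, v))
end

cuda_env_vars()  # an empty Dict means neither variable is set
```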

Or are the Knet tests outdated, and their failure inconsequential?

I have my Nvidia drivers updated (v451.36) and CUDA Toolkit v11.0.3 installed:

(@v1.5) pkg> activate .
 Activating environment at `D:\Development\Code\Scripts\Julia\CNN\Project.toml`

(CNN) pkg> status
Status `D:\Development\Code\Scripts\Julia\CNN\Project.toml`
  [052768ef] CUDA v1.2.1 `https://github.com/JuliaGPU/CUDA.jl.git#master`
  [1902f260] Knet v1.3.9 `https://github.com/denizyuret/Knet.jl.git#master`

julia> using CUDA

julia> CUDA.version()
v"11.0.0"

julia> CUDA.functional()
true

julia> using Knet
┌ Debug: Knet using GPU 0
└ @ Knet C:\Users\Ross\.julia\packages\Knet\exwCE\src\Knet.jl:167

julia> using Pkg; Pkg.test("Knet")
......
┌ Debug: Knet using GPU 0
└ @ Knet C:\Users\Ross\.julia\packages\Knet\exwCE\src\Knet.jl:167
distributions.jl          2.060208 seconds (9.79 M allocations: 494.692 MiB, 9.74% gc time)
dropout.jl      ┌ Warning: `seed!(seed)` is deprecated, use `CUDA.seed!(seed)` instead.
│   caller = seed!(::Int64) at knetarray.jl:148
└ @ Knet C:\Users\Ross\.julia\packages\Knet\exwCE\src\cuarrays\knetarray.jl:148

Stacktrace:
 [1] dropout!(::Float64, ::KnetArray{Float64,2}, ::KnetArray{Float64,2}) at C:\Users\Ross\.julia\packages\Knet\exwCE\src\dropout.jl:62
 [2] dropout(::KnetArray{Float64,2}, ::Float64; seed::Int64, drop::Bool) at C:\Users\Ross\.julia\packages\Knet\exwCE\src\dropout.jl:25
 [3] forw(::Function, ::Param{KnetArray{Float64,2}}, ::Vararg{Any,N} where N; kwargs::Base.Iterators.Pairs{Symbol,Integer,Tuple{Symbol,Symbol},NamedTuple{(:seed, :drop),Tuple{Int64,Bool}}}) at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\src\core.jl:66
 [4] #dropout#859 at .\none:0 [inlined]
 [5] (::var"#dropout1#1")(::Param{KnetArray{Float64,2}}, ::Float64) at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:4
 [6] gcsum(::Function, ::Param{KnetArray{Float64,2}}, ::Vararg{Any,N} where N; o::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\test\gradcheck.jl:50
 [7] gcsum(::Function, ::Param{KnetArray{Float64,2}}, ::Vararg{Any,N} where N) at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\test\gradcheck.jl:50
 [8] (::AutoGrad.var"#203#205"{Tuple{},var"#dropout1#1",Array{Any,1}})() at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\src\core.jl:205
 [9] differentiate(::Function; o::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\src\core.jl:144
 [10] differentiate at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\src\core.jl:135 [inlined]
 [11] gradcheck(::var"#dropout1#1", ::KnetArray{Float64,2}, ::Vararg{Any,N} where N; kw::Tuple{}, args::Int64, nsample::Int64, verbose::Int64, rtol::Float64, atol::Float64, delta::Float64) at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\test\gradcheck.jl:39
 [12] top-level scope at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:9
 [13] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Test\src\Test.jl:1115
 [14] top-level scope at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:4
 [15] include(::String) at .\client.jl:457
 [16] macro expansion at .\timing.jl:174 [inlined]
 [17] macro expansion at C:\Users\Ross\.julia\packages\Knet\exwCE\test\runtests.jl:3 [inlined]
 [18] macro expansion at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Test\src\Test.jl:1115 [inlined]
 [19] top-level scope at C:\Users\Ross\.julia\packages\Knet\exwCE\test\runtests.jl:12
 [20] include(::String) at .\client.jl:457
 [21] top-level scope at none:6
 [22] eval(::Module, ::Any) at .\boot.jl:331
 [23] exec_options(::Base.JLOptions) at .\client.jl:272
 [24] _start() at .\client.jl:506
dropout: Error During Test at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:9
  Test threw exception
  Expression: gradcheck(dropout1, k, 0.5; args = 1)
  UndefVarError: lib not defined
  Stacktrace:
   [1] differentiate(::Function; o::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\src\core.jl:148
   [2] differentiate at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\src\core.jl:135 [inlined]
   [3] gradcheck(::var"#dropout1#1", ::KnetArray{Float64,2}, ::Vararg{Any,N} where N; kw::Tuple{}, args::Int64, nsample::Int64, verbose::Int64, rtol::Float64, atol::Float64, delta::Float64) at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\test\gradcheck.jl:39
   [4] top-level scope at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:9
   [5] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Test\src\Test.jl:1115
   [6] top-level scope at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:4
  caused by [exception 1]
  UndefVarError: lib not defined
  Stacktrace:
   [1] dropout!(::Float64, ::KnetArray{Float64,2}, ::KnetArray{Float64,2}) at C:\Users\Ross\.julia\packages\Knet\exwCE\src\dropout.jl:62
   [2] dropout(::KnetArray{Float64,2}, ::Float64; seed::Int64, drop::Bool) at C:\Users\Ross\.julia\packages\Knet\exwCE\src\dropout.jl:25
   [3] forw(::Function, ::Param{KnetArray{Float64,2}}, ::Vararg{Any,N} where N; kwargs::Base.Iterators.Pairs{Symbol,Integer,Tuple{Symbol,Symbol},NamedTuple{(:seed, :drop),Tuple{Int64,Bool}}}) at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\src\core.jl:66
   [4] #dropout#859 at .\none:0 [inlined]
   [5] (::var"#dropout1#1")(::Param{KnetArray{Float64,2}}, ::Float64) at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:4
   [6] gcsum(::Function, ::Param{KnetArray{Float64,2}}, ::Vararg{Any,N} where N; o::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\test\gradcheck.jl:50
   [7] gcsum(::Function, ::Param{KnetArray{Float64,2}}, ::Vararg{Any,N} where N) at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\test\gradcheck.jl:50
   [8] (::AutoGrad.var"#203#205"{Tuple{},var"#dropout1#1",Array{Any,1}})() at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\src\core.jl:205
   [9] differentiate(::Function; o::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\src\core.jl:144
   [10] differentiate at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\src\core.jl:135 [inlined]
   [11] gradcheck(::var"#dropout1#1", ::KnetArray{Float64,2}, ::Vararg{Any,N} where N; kw::Tuple{}, args::Int64, nsample::Int64, verbose::Int64, rtol::Float64, atol::Float64, delta::Float64) at C:\Users\Ross\.julia\packages\AutoGrad\VFrAv\test\gradcheck.jl:39
   [12] top-level scope at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:9
   [13] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Test\src\Test.jl:1115
   [14] top-level scope at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:4

dropout: Error During Test at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:12
  Test threw exception
  Expression: isapprox(sum(abs2, dropout1(k, 0.5)), sum(abs2, dropout1(a, 0.5)), rtol = 0.1)
  UndefVarError: lib not defined
  Stacktrace:
   [1] dropout!(::Float64, ::KnetArray{Float64,2}, ::KnetArray{Float64,2}) at C:\Users\Ross\.julia\packages\Knet\exwCE\src\dropout.jl:62
   [2] dropout(::KnetArray{Float64,2}, ::Float64; seed::Int64, drop::Bool) at C:\Users\Ross\.julia\packages\Knet\exwCE\src\dropout.jl:25
   [3] (::var"#dropout1#1")(::KnetArray{Float64,2}, ::Float64) at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:4
   [4] top-level scope at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:12
   [5] top-level scope at D:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.5\Test\src\Test.jl:1115
   [6] top-level scope at C:\Users\Ross\.julia\packages\Knet\exwCE\test\dropout.jl:4
denizyuret commented 4 years ago

Starting at v1.3.9 the libknet8.so library is downloaded as an Artifact and the CUDA toolkit is no longer needed (at least in theory). Could you check if v1.3.9 (latest released version) works?
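For completeness, pinning the released version rather than tracking #master would look something like this. These are standard Pkg commands, nothing Knet-specific:

```julia
using Pkg
# Pin the released version instead of tracking #master:
Pkg.add(Pkg.PackageSpec(name = "Knet", version = "1.3.9"))
Pkg.status("Knet")   # confirm which version is actually in the environment
Pkg.test("Knet")
```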

radonnachie commented 4 years ago

The above was produced with v1.3.9 (I had used the released version before, but what I posted comes from the Knet master branch; see the output of pkg> status above), which did download libknet8.so.

I have everything working now. The versions were:

Not working: Julia v1.5, CUDA#master, Knet#master, CUDA Toolkit v11.0.3, GeForce Driver v451.82, Windows 10.
Working: Julia v1.4, CUDA.jl#master v1.2.1, Knet v1.3.9, CUDA Toolkit 10.2.89 + CUDNN v8.0.2.39, GeForce Driver v451.82, Windows 10.

Sorry that I do not know what was wrong with using CUDA Toolkit 11. I am also not in a position to risk the setup's integrity to test further; I'm even nervous to update Julia to v1.5...

As a heads-up though, Pkg.test("Knet") still fails with a number of errors due to:

Got exception outside of a @test
could not load symbol "cudnnSetRNNDescriptor": The specified procedure could not be found.
denizyuret commented 4 years ago

This latest cudnnSetRNNDescriptor error comes from the API changes in CUDNN v8, which dropped that symbol. The next release of Knet (https://github.com/denizyuret/Knet.jl/pull/596) fixes this and hopefully makes the installation more robust.
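As a quick way to confirm which cuDNN the Julia stack has picked up (assuming the CUDA.jl CUDNN submodule, which exposes a version query):

```julia
using CUDA
# Requires a functional GPU setup; returns a VersionNumber such as v"8.0.2".
# A v8.x result means the old cudnnSetRNNDescriptor entry point is gone.
CUDA.CUDNN.version()
```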

radonnachie commented 4 years ago

Alright, once that release is out I will test the full setup again! (Julia v1.5, CUDA#master, Knet#master, CUDA Toolkit v11.0.3, GeForce Driver v451.82, Windows 10.)

denizyuret commented 4 years ago

Let's try this with Knet v1.4.0, see if it is fixed.