denizyuret / Knet.jl

Koç University deep learning framework.
https://denizyuret.github.io/Knet.jl/latest

abnormal KnetArray error #125

Closed AStupidBear closed 7 years ago

AStupidBear commented 7 years ago

After running Pkg.update() yesterday, Knet.jl no longer works for KnetArray. Pkg.test("Knet") fails, and adding a KnetArray of ones to itself returns garbage:

julia> a = KnetArray(ones(10));                

julia> a + a                                    
10-element Knet.KnetArray{Float64,1}:          
 0.00707721                                    
 0.000453814                                   
 0.000180473                                   
 0.0                                           
 0.0                                           
 0.0                                           
 0.0                                           
 0.0                                           
 0.0                                           
 0.0                                           

My configuration is:
Ubuntu 14.04, Julia 0.5.0, AutoGrad.jl master, Knet.jl master

If you can't reproduce this, it may just be a coincidence that my GPU crashed at the same time as the update.
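For anyone who wants to check whether their setup shows the same behaviour, here is a minimal sketch that compares the GPU result against the CPU result. It assumes a working GPU install of Knet and that Array(k) copies a KnetArray back to host memory; the variable names are only illustrative.

```julia
using Knet

# Build the same data on the CPU and on the GPU.
a = ones(Float64, 10)      # CPU reference array
k = KnetArray(a)           # copy to the GPU

# Add on the GPU, copy the result back, and compare with the CPU sum.
gpu_sum = Array(k + k)     # assumption: Array(::KnetArray) copies to host
cpu_sum = a + a

if gpu_sum == cpu_sum
    println("KnetArray addition matches the CPU result")
else
    println("KnetArray addition differs from the CPU result:")
    println(gpu_sum)
end
```

Changing Float64 to Float32 in the first line repeats the check for single precision.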

AStupidBear commented 7 years ago

I have checked my GPU and there's no problem.

denizyuret commented 7 years ago

Knet 0.8.3 (which works with AutoGrad 0.0.7) was just released. Can you try with these? (If you haven't switched to Knet master, Pkg.update() and Pkg.build("Knet") should be sufficient to upgrade. If you have, you may need to run Pkg.free("AutoGrad") and Pkg.free("Knet") first).
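As a rough sketch, the upgrade steps above could be run from the Julia 0.5 REPL as follows. The Pkg.free calls are only relevant if the packages were checked out to master; the final Pkg.test and Pkg.status calls are an addition here, to confirm which versions ended up installed.

```julia
# Only needed if AutoGrad and Knet were switched to master:
# return them to the registered release versions.
Pkg.free("AutoGrad")
Pkg.free("Knet")

# Fetch the latest registered versions and rebuild Knet's native library.
Pkg.update()
Pkg.build("Knet")

# Not part of the instructions above: verify the result.
Pkg.status()        # should list Knet 0.8.3 and AutoGrad 0.0.7
Pkg.test("Knet")
```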

AStupidBear commented 7 years ago

Pkg.test("Knet") still failed.

AStupidBear commented 7 years ago

That's strange. ArrayFire.jl works fine.

denizyuret commented 7 years ago

Can you send me the error message, versioninfo(), Pkg.status()?

Did you try Pkg.build("Knet") after update?

Did you try a clean install? Just set JULIA_PKGDIR to a temp directory and try Pkg.add("Knet") there.

AStupidBear commented 7 years ago

Some sample errors for a clean install

indexing: Test Failed                                                                                                                           
  Expression: a[i...] == k[i...]                                                                                                                
   Evaluated: [0.958081 0.638335 0.764553 0.548269; 0.921487 0.505448 0.739217 0.166488] == Knet.KnetArray{Float64,2}(Knet.KnetPtr(Ptr{Void} @0x0000002309874c00,64,1,nothing),(2,4))
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:428                                                                       
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281                                                                                      
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:45 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:22 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:14 [inlined]                                                                              
 in anonymous at ./<missing>:?                                                                                                                  
 in include_from_node1(::String) at ./loading.jl:488 (repeats 2 times)                                                                          
 in process_options(::Base.JLOptions) at ./client.jl:262                                                                                        
 in _start() at ./client.jl:318                                                                                                                 
indexing: Test Failed                                                                                                                           
  Expression: a == k                                                                                                                            
   Evaluated: [0.0 0.0 0.0 0.0; 0.160203 0.486415 0.988154 0.918841; 0.0 0.0 0.0 0.0] == Knet.KnetArray{Float64,2}(Knet.KnetPtr(Ptr{Void} @0x000000230985f800,96,1,nothing),(3,4))
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:428                                                                       
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281                                                                                      
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:49 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:22 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:14 [inlined]                                                                              
 in anonymous at ./<missing>:?                                                                                                                  
 in include_from_node1(::String) at ./loading.jl:488 (repeats 2 times)                                                                          
 in process_options(::Base.JLOptions) at ./client.jl:262                                                                                        
 in _start() at ./client.jl:318                                                                                                                 
indexing: Test Failed                                                                                                                           
  Expression: gradcheck(getindex,k,i...)                                                                                                        
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:428                                                                       
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281                                                                                      
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:54 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:22 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:14 [inlined]                                                                              
 in anonymous at ./<missing>:?                                                                                                                  
 in include_from_node1(::String) at ./loading.jl:488 (repeats 2 times)                                                                          
 in process_options(::Base.JLOptions) at ./client.jl:262                                                                                        
 in _start() at ./client.jl:318                                                                                                                 
indexing: Test Failed                                                                                                                           
  Expression: a[i...] == k[i...]                                                                                                                
   Evaluated: [0.958081 0.764553; 0.160203 0.988154; 0.921487 0.739217] == Knet.KnetArray{Float64,2}(Knet.KnetPtr(Ptr{Void} @0x000000230985f600,48,1,nothing),(3,2))
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:428                                                                       
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281                                                                                      
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:45 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:22 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:14 [inlined]                                                                              
 in anonymous at ./<missing>:?                                                                                                                  
 in include_from_node1(::String) at ./loading.jl:488 (repeats 2 times)                                                                          
 in process_options(::Base.JLOptions) at ./client.jl:262                                                                                        
 in _start() at ./client.jl:318                                                                                                                 
indexing: Test Failed                                                                                                                           
  Expression: a == k                                                                                                                            
   Evaluated: [0.0 0.638335 0.0 0.548269; 0.0 0.486415 0.0 0.918841; 0.0 0.505448 0.0 0.166488] == Knet.KnetArray{Float64,2}(Knet.KnetPtr(Ptr{Void} @0x000000230985f800,96,1,nothing),(3,4))
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:428                                                                       
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281                                                                                      
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:49 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:22 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:14 [inlined]                                                                              
 in anonymous at ./<missing>:?                                                                                                                  
 in include_from_node1(::String) at ./loading.jl:488 (repeats 2 times)                                                                          
 in process_options(::Base.JLOptions) at ./client.jl:262                                                                                        
 in _start() at ./client.jl:318                                                                                                                 
indexing: Test Failed                                                                                                                           
  Expression: gradcheck(getindex,k,i...)                                                                                                        
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:428                                                                       
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281                                                                                      
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:54 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:22 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:14 [inlined]                                                                              
 in anonymous at ./<missing>:?                                                                                                                  
 in include_from_node1(::String) at ./loading.jl:488 (repeats 2 times)                                                                          
 in process_options(::Base.JLOptions) at ./client.jl:262                                                                                        
 in _start() at ./client.jl:318                                                                                                                 
indexing: Test Failed                                                                                                                           
  Expression: a[i...] == k[i...]                                                                                                                
   Evaluated: [0.958081,0.921487] == Knet.KnetArray{Float64,1}(Knet.KnetPtr(Ptr{Void} @0x000000230987da00,16,1,nothing),(2,))                   
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:428                                                                       
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281                                                                                      
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:45 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:22 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:14 [inlined]                                                                              
 in anonymous at ./<missing>:?                                                                                                                  
 in include_from_node1(::String) at ./loading.jl:488 (repeats 2 times)                                                                          
 in process_options(::Base.JLOptions) at ./client.jl:262                                                                                        
 in _start() at ./client.jl:318                                                                                                                 
indexing: Test Failed                                                                                                                           
  Expression: a == k                                                                                                                            
   Evaluated: [0.0 0.638335 0.764553 0.548269; 0.160203 0.486415 0.988154 0.918841; 0.0 0.505448 0.739217 0.166488] == Knet.KnetArray{Float64,2}(Knet.KnetPtr(Ptr{Void} @0x000000230985f800,96,1,nothing),(3,4))
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:428                                                                       
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281                                                                                      
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:49 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:22 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:14 [inlined]                                                                              
 in anonymous at ./<missing>:?                                                                                                                  
 in include_from_node1(::String) at ./loading.jl:488 (repeats 2 times)                                                                          
 in process_options(::Base.JLOptions) at ./client.jl:262                                                                                        
 in _start() at ./client.jl:318                                                                                                                 
indexing: Test Failed                                                                                                                           
  Expression: gradcheck(getindex,k,i...)                                                                                                        
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:428                                                                       
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281                                                                                      
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:54 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:22 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:14 [inlined]                                                                              
 in anonymous at ./<missing>:?                                                                                                                  
 in include_from_node1(::String) at ./loading.jl:488 (repeats 2 times)                                                                          
 in process_options(::Base.JLOptions) at ./client.jl:262                                                                                        
 in _start() at ./client.jl:318                                                                                                                 
indexing: Test Failed                                                                                                                           
  Expression: a[i...] == k[i...]                                                                                                                
   Evaluated: [0.160203,0.160203] == Knet.KnetArray{Float64,1}(Knet.KnetPtr(Ptr{Void} @0x0000002309886400,16,1,nothing),(2,))                   
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:428                                                                       
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281                                                                                      
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:45 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:22 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:14 [inlined]                                                                              
 in anonymous at ./<missing>:?                                                                                                                  
 in include_from_node1(::String) at ./loading.jl:488 (repeats 2 times)                                                                          
 in process_options(::Base.JLOptions) at ./client.jl:262                                                                                        
 in _start() at ./client.jl:318                                                                                                                 
indexing: Test Failed                                                                                                                           
  Expression: a == k                                                                                                                            
   Evaluated: [0.958081 0.638335 0.764553 0.548269; 0.0 0.486415 0.988154 0.918841; 0.921487 0.505448 0.739217 0.166488] == Knet.KnetArray{Float64,2}(Knet.KnetPtr(Ptr{Void} @0x000000230985f800,96,1,nothing),(3,4))
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:428                                                                       
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281                                                                                      
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:49 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:22 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:14 [inlined]                                                                              
 in anonymous at ./<missing>:?                                                                                                                  
 in include_from_node1(::String) at ./loading.jl:488 (repeats 2 times)                                                                          
 in process_options(::Base.JLOptions) at ./client.jl:262                                                                                        
 in _start() at ./client.jl:318                                                                                                                 
indexing: Test Failed                                                                                                                           
  Expression: gradcheck(getindex,k,i...)                                                                                                        
 in record(::Base.Test.DefaultTestSet, ::Base.Test.Fail) at ./test.jl:428                                                                       
 in do_test(::Base.Test.Returned, ::Expr) at ./test.jl:281                                                                                      
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:54 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:22 [inlined]                                                                              
 in macro expansion; at ./test.jl:672 [inlined]                                                                                                 
 in macro expansion; at /tmp/v0.5/Knet/test/karray.jl:14 [inlined]                                                                              
 in anonymous at ./<missing>:?                                                                                                                  
 in include_from_node1(::String) at ./loading.jl:488 (repeats 2 times)                                                                          

AStupidBear commented 7 years ago

julia> versioninfo()                                                     
Julia Version 0.5.0                                                      
Commit 3c9d753 (2016-09-19 18:14 UTC)                                    
Platform Info:                                                           
  System: Linux (x86_64-pc-linux-gnu)                                    
  CPU: Intel(R) Xeon(R) CPU E5-2676 v3 @ 2.40GHz                         
  WORD_SIZE: 64                                                          
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)       
  LAPACK: libopenblas64_                                                 
  LIBM: libopenlibm                                                      
  LLVM: libLLVM-3.7.1 (ORCJIT, haswell)                                  

AStupidBear commented 7 years ago

julia> Pkg.status()
1 required packages:
 - Knet                          0.8.3
2 additional packages:
 - AutoGrad                      0.0.7
 - Compat                        0.25.2

denizyuret commented 7 years ago

These look normal. What about the CUDA library versions? You can get them with include(Pkg.dir("Knet/test/gpu.jl")). Also, which GPU model do you have (run nvidia-smi in a shell)? Finally, try "make clean; make" under .julia/v0.5/Knet/src.
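The three checks above can also be run from a single Julia 0.5 session. This is only a sketch: it assumes make and nvidia-smi are on the PATH and that Knet is installed in the default package directory.

```julia
# 1. Print the CUDA driver/runtime, cuBLAS and cuDNN versions seen by Knet.
include(Pkg.dir("Knet/test/gpu.jl"))

# 2. Show the GPU model and driver version.
run(`nvidia-smi`)

# 3. Rebuild Knet's CUDA kernels from scratch.
cd(Pkg.dir("Knet", "src")) do
    run(`make clean`)
    run(`make`)
end
```

Restart Julia after the rebuild so that the freshly compiled library is the one that gets loaded.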

AStupidBear commented 7 years ago

julia> include(Pkg.dir("Knet/test/gpu.jl"))
(Knet.cudaDriverVersion,Knet.cudaRuntimeVersion,Knet.cublasVersion,Knet.cudnnVersion) = (8000,8000,8000,5105)
Test Summary: | Pass  Total
  gpu         |    8      8
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 980 Ti  Off  | 0000:03:00.0      On |                  N/A |
| 26%   38C    P8    21W / 260W |     44MiB /  6075MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 980 Ti  Off  | 0000:81:00.0     Off |                  N/A |
| 20%   44C    P8    16W / 260W |      3MiB /  6078MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1871    G   /usr/lib/xorg/Xorg                              40MiB |
+-----------------------------------------------------------------------------+

AStupidBear commented 7 years ago

After running "make clean; make", the problem is still there.

Have you tested Knet on other computers with GPUs? I cannot access other testing platforms.

denizyuret commented 7 years ago

There is a machine image, Knet-0.8.3, on Amazon AWS (Ohio region). Knet has also been tested on K20, K40, and K80 GPUs, on other GPUs on Amazon, and on CPU-only Linux and Mac machines.

Your errors all seem to involve Float64 arrays. Is it possible that your GPU does not support Float64?

AStupidBear commented 7 years ago

julia> using Knet; a = KnetArray(ones(Float32, 10))               
10-element Knet.KnetArray{Float32,1}:                             
 1.0                                                              
 1.0                                                              
 1.0                                                              
 1.0                                                              
 1.0                                                              
 1.0                                                              
 1.0                                                              
 1.0                                                              
 1.0                                                              
 1.0                                                              

julia> a + a                                                      
10-element Knet.KnetArray{Float32,1}:                             
 2.29589f-41                                                      
 2.3319f-41                                                       
 2.36819f-41                                                      
 2.40477f-41                                                      
 2.44162f-41                                                      
 2.47876f-41                                                      
 2.51617f-41                                                      
 2.55387f-41                                                      
 2.59184f-41                                                      
 2.6301f-41
AStupidBear commented 7 years ago
julia> using ArrayFire; setBackend(AF_BACKEND_CUDA); a = AFArray(ones(Float32, 10)); a + a
10-element ArrayFire.AFArray{Float32,1}:
 2.0
 2.0
 2.0
 2.0
 2.0
 2.0
 2.0
 2.0
 2.0
 2.0

Does this mean my hardware and nvidia driver are OK?

AStupidBear commented 7 years ago

The test (https://github.com/JuliaGPU/CUBLAS.jl/blob/master/test/runtests.jl) for CUBLAS.jl also passed.

julia> Pkg.test("CUBLAS")
INFO: Testing CUBLAS
WARNING: Method definition test_trsm(Any) in module Main at /home/luyao/.julia/v0.5/CUBLAS/test/runtests.jl:1362 overwritten at /home/luyao/.julia/v0.5/CUBLAS/test/runtests.jl:1417.
INFO: CUBLAS tests passed
denizyuret commented 7 years ago

Hi Yao: Thanks for your support debugging this issue. Can you confirm that Knet-0.8.2 and AutoGrad-0.0.6 were working ok before the upgrade?

Enis, I think there is a problem with our new kernels running on the GTX-980 GPU. Are you using any instructions that are not supported on this platform? If so can we make them conditional on the chipset? Yao has found out that a+b is not working for KnetArrays!

best, deniz

EnisBerk commented 7 years ago

Hi all, a+a is a same-size array addition, so it is not computed by the new kernels; it is computed by src/cuda11.jl. Yao, can you try with arrays of different sizes, or share the results of test/broadcast.jl?

Other than that, we have the "-arch=compute_30 -code=sm_30" flags in line 12 of the Makefile, which set the minimum compute architecture used in cuda13.jl, but in the code we handled those cases with conditionals. Also, Yao's architecture is suitable (NVIDIA-SMI 375.39).

Should we remove "-arch=compute_30 -code=sm_30" and test again?

best, Enis Berk

denizyuret commented 7 years ago

I think the problem may not be with the driver version but the compute capability of the chip itself: GTX-980. Do we have any way to access machines with that compute capability for testing?

AStupidBear commented 7 years ago

After using

Pkg.pin("Knet", v"0.8.2")
Pkg.pin("AutoGrad", v"0.0.6")

and

cd ~/.julia/v0.5/Knet/src && make clean && make

the test passed.

julia> Pkg.test("Knet")                                                                      
INFO: Computing test dependencies for Knet...                                                
INFO: Installing BaseTestNext v0.2.2                                                         
INFO: Testing Knet                                                                           
INFO: Knet using GPU 1                                                                       
Test Summary: | Pass  Total                                                                  
  kptr        |    1      1                                                                  
  1.878766 seconds (1.12 M allocations: 49.284 MB, 13.09% gc time)                           
(Knet.cudaDriverVersion,Knet.cudaRuntimeVersion,Knet.cublasVersion,Knet.cudnnVersion) = (8000
Test Summary: | Pass  Total                                                                  
  gpu         |    8      8                                                                  
  0.895661 seconds (207.56 k allocations: 8.691 MB, 0.55% gc time)                           
Test Summary: | Pass  Total                                                                  
  distributions |    3      3                                                                
  1.559528 seconds (1.03 M allocations: 43.159 MB, 1.02% gc time)                            
Test Summary: | Pass  Total                                                                  
  karray      |  122    122                                                                  
 12.671813 seconds (7.98 M allocations: 332.267 MB, 0.86% gc time)                           
Test Summary: | Pass  Total                                                                  
  linalg      |  112    112                                                                  
 18.156013 seconds (9.72 M allocations: 405.184 MB, 0.76% gc time)                           
Test Summary: | Pass  Total                                                                  
  broadcast   | 3624   3624                                                                  
 30.212741 seconds (13.17 M allocations: 575.725 MB, 1.04% gc time)                          
Test Summary: | Pass  Total                                                                  
  reduction   | 1188   1188                                                                  
 25.244122 seconds (13.50 M allocations: 561.333 MB, 0.87% gc time)                          
Test Summary: | Pass  Total                                                                  
  conv        |  139    139                                                                  
 57.717182 seconds (15.43 M allocations: 623.457 MB, 0.56% gc time)                          
Test Summary: | Pass  Total                                                                  
  unary       | 1720   1720                                                                  
 41.423803 seconds (13.89 M allocations: 587.040 MB, 0.71% gc time)                          
denizyuret commented 7 years ago

Hi Yao: could you try Enis' suggestion and recompile after changing the default options in Knet/src/Makefile, i.e. remove "-arch=compute_30 -code=sm_30" and test again?

AStupidBear commented 7 years ago

@denizyuret @EnisBerk Thanks for your suggestions. The test passed after removing "-arch=compute_30 -code=sm_30" and running make again.
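
For readers who want to reproduce this workaround, here is a minimal sketch. It assumes the flags appear in Knet/src/Makefile as quoted in this thread (the spacing in your copy may differ). The helper name strip_arch_flags is hypothetical and is not part of Knet.

```shell
#!/bin/sh
# Hypothetical helper (not part of Knet): remove the architecture flags
# discussed in this thread from a Makefile, keeping a .bak backup copy.
strip_arch_flags() {
    makefile="$1"
    # Delete the flag pair wherever it appears; other options are left untouched.
    sed -i.bak -e 's/-arch=compute_30[[:space:]]*-code=sm_30[[:space:]]*//g' "$makefile"
}

# Example usage (path taken from the thread; adjust to your installation):
#   cd ~/.julia/v0.5/Knet/src
#   strip_arch_flags Makefile
#   make clean && make
```

The original file is kept as Makefile.bak, so the change can be reverted by copying the backup over the edited file.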

denizyuret commented 7 years ago

I am available today between 9 and 11.

AStupidBear commented 7 years ago

This issue also occurs for Titan X Pascal.

klowrey commented 7 years ago

If it helps, I was having a similar issue on a Maxwell Titan X after updating Knet.

Ubuntu 16.04, CUDA 8.0, Nvidia driver 375.66

Packages CUDABLAS and CUDAdrv were failing due to not having gcc-4.7 and g++-4.7 installed on my system, so this probably affected Knet in some way.

apt install gcc-4.7 g++-4.7 # and recommended packages

Then running make clean; make in the Knet/src directory fixed things for me (but Pkg.rm, Pkg.add, and Pkg.build of "Knet" didn't).

Thanks, Knet team!

denizyuret commented 7 years ago

The problematic compiler options have been removed in the latest release.