floydhub / dl-docker

An all-in-one Docker image for deep learning. Contains all the popular DL frameworks (TensorFlow, Theano, Torch, Caffe, etc.)
https://www.floydhub.com

Error with torch + cudnn #33

Open bindatype opened 7 years ago

bindatype commented 7 years ago

Hi Sai, when I run the torch/test.sh script, Torch complains:

    /root/torch/install/bin/luajit: /root/torch/install/share/lua/5.1/cudnn/ffi.lua:1278: 'libcudnn (R4) not found in library path.
    Please install CuDNN from https://developer.nvidia.com/cuDNN
    Then make sure files named as libcudnn.so.4 or libcudnn.4.dylib are placed in your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)
    stack traceback:
        [C]: in function 'error'
        /root/torch/install/share/lua/5.1/cudnn/ffi.lua:1278: in main chunk
        [C]: in function 'require'
        /root/torch/install/share/lua/5.1/cudnn/init.lua:4: in main chunk
        [C]: at 0x0046b7a0
        [C]: at 0x00406670

I look for libcudnn4 but instead I find libcudnn5:

    root@eaf76f3:~/torch# find / -name "libcudnn"
    /var/lib/dpkg/info/libcudnn5-dev.postinst
    /var/lib/dpkg/info/libcudnn5-dev.md5sums
    /var/lib/dpkg/info/libcudnn5-dev.prerm
    /var/lib/dpkg/info/libcudnn5-dev.list
    /usr/share/lintian/overrides/libcudnn5-dev
    /usr/share/doc/libcudnn5-dev
    /usr/lib/x86_64-linux-gnu/libcudnn_static_v5.a
    /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.5.1.5
    /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so
    /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn_static.a
    /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.5

Maybe this isn't a conflict, but when I update my LD_LIBRARY_PATH I still get errors:

    cunn loaded succesfully
    cudnn loaded succesfully
    seed: 1484708833
    Running 172 tests
    1/172 tan2 ............................................................ [WAIT]
    THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-5894/cutorch/lib/THC/generated/../generic/THCTensorMathPointwise.cu line=54 error=8 : invalid device function
    1/172 tan2 ............................................................ [ERROR]
    2/172 cat ............................................................. [ERROR]
    3/172 neg1 ............................................................ [ERROR]
    4/172 atan2 ........................................................... [ERROR]
    5/172 acos1 ........................................................... [ERROR]
    6/172 restore_rng ..................................................... [FAIL]
    7/172 streamWaitFor ................................................... [ERROR]
    8/172 acos2 ........................................................... [ERROR]
    9/172 zero ............................................................ [PASS]
    10/172 logNormal ....................................................... [ERROR]
    11/172 sqrt2 ........................................................... [ERROR]
    12/172 pow1 ............................................................ [ERROR]
    13/172 permute ......................................................... [PASS]
    14/172 cross ........................................................... [PASS]
    15/172 tensorToTable ................................................... [PASS]
    16/172 inverse ......................................................... [ERROR]
    17/172 sinh1 ........................................................... [ERROR]
    18/172 sign1 ........................................................... [ERROR]
    19/172 min ............................................................. [ERROR]
    20/172 clamp4 .......................................................... [ERROR]
    21/172 ger ............................................................. [PASS]
    22/172 addmm ........................................................... [ERROR]
    23/172 addmv ........................................................... [PASS]
    24/172 random_seed ..................................................... [FAIL]
    25/172 lerp ............................................................ [ERROR]
    26/172 sigmoid2 ........................................................ [ERROR]
    27/172 log1p1 .......................................................... [ERROR]
    28/172 sinh2 ........................................................... [ERROR]
    29/172 bernoulli ....................................................... [ERROR]
    30/172 cremainder ...................................................... [PASS]
    31/172 round1 .......................................................... [ERROR]
    32/172 geometric ....................................................... [ERROR]
    33/172 maskedCopy ...................................................... [ERROR]
    34/172 indexCopy ....................................................... [ERROR]
    35/172 frac2 ........................................................... [ERROR]
    36/172 baddbmm ......................................................... [ERROR]
    37/172 floor2 .......................................................... [ERROR]
    38/172 sum ............................................................. [ERROR]
    39/172 indexCopy2 ...................................................... [ERROR]
    40/172 cudaTypeCopy .................................................... [ERROR]
    41/172 cmul ............................................................ [ERROR]
    42/172 streamBarrier ................................................... [ERROR]
    43/172 cos1 ............................................................ [ERROR]
    44/172 indexAdd ........................................................ [ERROR]
    45/172 isSetTo ......................................................... [PASS]
    46/172 mean ............................................................ [ERROR]
    47/172 multinomial_without_replacement_gets_all ........................ [ERROR]
    48/172 logicalTensor ................................................... [ERROR]
    49/172 cmin ............................................................ [ERROR]
    50/172 sin1 ............................................................ [ERROR]
    51/172 mm .............................................................. [ERROR]
    52/172 ceil2 ........................................................... [ERROR]
    53/172 cinv1 ........................................................... [ERROR]
    54/172 multinomial_without_replacement ................................. [ERROR]
    55/172 triu ............................................................ [ERROR]
    56/172 repeatTensor .................................................... [ERROR]
    57/172 round2 .......................................................... [ERROR]
    58/172 get_device ...................................................... [PASS]
    59/172 elementSize ..................................................... [PASS]
    60/172 log2 ............................................................ [ERROR]
    61/172 cosh1 ........................................................... [ERROR]
    62/172 csub ............................................................ [ERROR]
    63/172 index ........................................................... [ERROR]
    64/172 log1p2 .......................................................... [ERROR]
    65/172 viewAs .......................................................... [ERROR]
    66/172 add ............................................................. [ERROR]
    67/172 std ............................................................. [ERROR]
    68/172 log1 ............................................................ [ERROR]
    69/172 kernelP2PAccess ................................................. [PASS]
    70/172 tril ............................................................ [ERROR]
    71/172 maskedSelect .................................................... [ERROR]
    72/172 renorm .......................................................... [ERROR]
    73/172 rsqrt ........................................................... [PASS]
    74/172 scatter ......................................................... [ERROR]
    75/172 addbmm .......................................................... [ERROR]
    76/172 remainder ....................................................... [ERROR]
    77/172 cinv2 ........................................................... [ERROR]
    78/172 cudaHostTensor .................................................. [ERROR]
    79/172 tanh1 ........................................................... [ERROR]
    80/172 zeros ........................................................... [ERROR]
    81/172 indexFill2 ...................................................... [ERROR]
    82/172 copyNoncontiguous ............................................... [ERROR]
    83/172 copyRandomizedTest .............................................. [ERROR]
    84/172 streamWaitForMultiDevice ........................................ [PASS]
    85/172 exp1 ............................................................ [ERROR]
    86/172 copyAsync ....................................................... [ERROR]
    87/172 trunc2 .......................................................... [ERROR]
    88/172 clamp2 .......................................................... [ERROR]
    89/172 dist ............................................................ [ERROR]
    90/172 atan1 ........................................................... [ERROR]
    91/172 multinomial_vector .............................................. [ERROR]
    92/172 cudaStorageTypeCopy ............................................. [ERROR]
    93/172 cdiv ............................................................ [ERROR]
    94/172 trunc1 .......................................................... [ERROR]
    95/172 catArrayBatched ................................................. [ERROR]
    96/172 sqrt1 ........................................................... [ERROR]
    97/172 clamp1 .......................................................... [ERROR]
    98/172 cosh2 ........................................................... [ERROR]
    99/172 cudaEvent ....................................................... [ERROR]
    100/172 multi_gpu_copy_noncontig ........................................ [ERROR]
    101/172 sign3 ........................................................... [ERROR]
    102/172 equal ........................................................... [ERROR]
    103/172 bmm ............................................................. [ERROR]
    104/172 scatterFill ..................................................... [ERROR]
    105/172 storageToTable .................................................. [ERROR]
    106/172 pow2 ............................................................ [ERROR]
    107/172 bmmTransposed ................................................... [ERROR]
    108/172 sigmoid1 ........................................................ [ERROR]
    109/172 normal .......................................................... [ERROR]
    110/172 maskedFill ...................................................... [ERROR]
    111/172 exp2 ............................................................ [ERROR]
    112/172 topk ............................................................ [ERROR]
    113/172 allAndAny ....................................................... [ERROR]
    114/172 var ............................................................. [ERROR]
    115/172 trace ........................................................... [ERROR]
    116/172 catArray ........................................................ [ERROR]
    117/172 cumsum .......................................................... [ERROR]
    118/172 cauchy .......................................................... [ERROR]
    119/172 chunk ........................................................... [ERROR]
    120/172 abs2 ............................................................ [ERROR]
    121/172 multi_gpu_random ................................................ [ERROR]
    122/172 indexFill ....................................................... [ERROR]
    123/172 sin2 ............................................................ [ERROR]
    124/172 squeeze ......................................................... [ERROR]
    125/172 abs1 ............................................................ [ERROR]
    126/172 floor1 .......................................................... [ERROR]
    127/172 cdiv3 ........................................................... [ERROR]
    128/172 reshape ......................................................... [ERROR]
    129/172 addcmul ......................................................... [ERROR]
    130/172 uniform ......................................................... [ERROR]
    131/172 isSize .......................................................... [PASS]
    132/172 baddbmmTransposed ............................................... [ERROR]
    133/172 logicalValue .................................................... [ERROR]
    134/172 tan1 ............................................................ [ERROR]
    135/172 asin2 ........................................................... [ERROR]
    136/172 norm ............................................................ [ERROR]
    137/172 prod ............................................................ [ERROR]
    138/172 largeNoncontiguous .............................................. [ERROR]
    139/172 isSameSizeAs .................................................... [PASS]
    140/172 powExponentTensor ............................................... [ERROR]
    141/172 gather .......................................................... [ERROR]
    142/172 cumprod ......................................................... [ERROR]
    143/172 cpow ............................................................ [ERROR]
    144/172 clamp3 .......................................................... [ERROR]
    145/172 max ............................................................. [ERROR]
    146/172 nonzero ......................................................... [ERROR]
    147/172 cmax ............................................................ [ERROR]
    148/172 fmod ............................................................ [ERROR]
    149/172 expand .......................................................... [ERROR]
    150/172 asin1 ........................................................... [ERROR]
    151/172 neg2 ............................................................ [ERROR]
    152/172 cos2 ............................................................ [ERROR]
    153/172 frac1 ........................................................... [ERROR]
    154/172 sign2 ........................................................... [ERROR]
    155/172 view ............................................................ [ERROR]
    156/172 diag ............................................................ [ERROR]
    157/172 fill ............................................................ [ERROR]
    158/172 split ........................................................... [ERROR]
    159/172 indexSelect2 .................................................... [ERROR]
    160/172 addcdiv ......................................................... [ERROR]
    161/172 streamBarrierMultiDevice ........................................ [PASS]
    162/172 mv .............................................................. [ERROR]
    163/172 cfmod ........................................................... [ERROR]
    164/172 addr ............................................................ [ERROR]
    165/172 tanh2 ........................................................... [ERROR]
    166/172 exponential ..................................................... [ERROR]
    167/172 ones ............................................................ [ERROR]
    168/172 indexAddHalf .................................................... [ERROR]
    169/172 sort ............................................................ [ERROR]
    170/172 indexAdd2 ....................................................... [ERROR]
    171/172 ceil1 ........................................................... [ERROR]
    172/172 multinomial_with_replacement .................................... [ERROR]
    Completed 344 asserts in 172 tests with 2 failures and 154 errors

Any idea what the problem could be? Your help is greatly appreciated. Thanks.
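
For anyone comparing their own container against this report: the first error asks for libcudnn.so.4 (or libcudnn.4.dylib), while the find output above lists only version 5 shared libraries. The following is a minimal Python sketch for checking which cuDNN major versions a list of file paths provides. The function name `cudnn_major_versions` and the variable `found` are made up for illustration, and the paths are copied from the find output in this issue. It only inspects file names; it does not load the library or change anything in the image.

```python
import re

# Versioned shared-library names: libcudnn.so.5, libcudnn.so.5.1.5 (Linux)
# or libcudnn.5.dylib (macOS). Exactly one of the two groups captures the
# major version. Unversioned or static files (libcudnn.so, *.a) do not match.
_CUDNN_NAME = re.compile(r"libcudnn(?:\.so\.(\d+)(?:\.\d+)*|\.(\d+)\.dylib)")


def cudnn_major_versions(paths):
    """Return the set of cuDNN major versions found among the given paths.

    Hypothetical helper for illustration; it looks at file names only.
    """
    versions = set()
    for path in paths:
        name = path.rsplit("/", 1)[-1]  # keep the file name, drop directories
        match = _CUDNN_NAME.fullmatch(name)
        if match:
            versions.add(int(match.group(1) or match.group(2)))
    return versions


# Paths taken from the find output reported in this issue.
found = [
    "/usr/lib/x86_64-linux-gnu/libcudnn_static_v5.a",
    "/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.5.1.5",
    "/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so",
    "/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn_static.a",
    "/usr/local/cuda-8.0/targets/x86_64-linux/lib/libcudnn.so.5",
]

print(cudnn_major_versions(found))  # {5}
```

With the paths from this report the result contains 5 but not 4, which matches the "libcudnn (R4) not found" message. Note that the later "invalid device function" errors appear after cudnn is reported as loaded, so this check speaks only to the first error.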

ShanLuo commented 7 years ago

Hi @bindatype, have you solved this issue? I ran into the same problem and have been stuck for a while.

manneshiva commented 5 years ago

Bumped into this same issue. Were you able to find a fix, @ShanLuo?