hughperkins / distro-cl

OpenCL Torch
core dump on clnn.test 31/74 SpatialConvolutionMM_forward_single_vgglayer13 #38

Closed: sawtl closed this issue 7 years ago

sawtl commented 7 years ago

Hello, clnn worked once, but I had to rebuild my Ubuntu system. I have reinstalled all the GPU drivers and OpenCL, but at clnn test 31 I get this:

31/74 SpatialConvolutionMM_forward_single_vgglayer13 "Segmentation error core dump"

I am at a loss as to how to solve this. Any help or ideas?

Thanks in advance,

Steve

hughperkins commented 7 years ago

Hmmm. Nothing springs to mind just like that. Perhaps you can paste the entire output into https://gist.github.com ?

sawtl commented 7 years ago

Hi,

Here is the complete output:

stl@stl-lab2:~$ luajit -l clnn -e 'clnn.test()'
libthclnn_searchpath /home/stl/torch-cl/install/lib/lua/5.1/libTHCLNN.so
Running 74 tests
1/74 Square_transposed ................................................. [WAIT]Using Advanced Micro Devices, Inc. , OpenCL platform: AMD Accelerated Parallel Processing
Using OpenCL device: Tahiti
1/74 Square_transposed ................................................. [PASS]
2/74 TemporalConvolution2_forward ...................................... [PASS]
3/74 SpatialMaxPooling_forward ......................................... [PASS]
4/74 SoftMax_forward_batch ............................................. [PASS]
5/74 Sigmoid_forward ................................................... [PASS]
6/74 ELU_backward ...................................................... [PASS]
7/74 Threshold_forward ................................................. [PASS]
8/74 Threshold_backward_inplace ........................................ [PASS]
9/74 Tanh_transposed ................................................... [PASS]
10/74 SpatialUpSamplingNearest_forward_batch ............................ [PASS]
11/74 Sigmoid_transposed ................................................ [PASS]
12/74 ClassNLLCriterionSingleTarget ..................................... [PASS]
13/74 mse_variablebatchsize ............................................. [WAIT]THClReduceAll.cl build log:
"/tmp/OCLmNo2Jt.cl", line 51: warning: function "IndexToOffset_999_get" was declared but never referenced
static inline unsigned int IndexToOffset_999_get(unsigned int linearId, global const TensorInfoCl *info) {
^

"/tmp/OCLmNo2Jt.cl", line 66: warning: function "getLinearBlockId" was declared but never referenced
static inline unsigned int getLinearBlockId() {
^

13/74 mse_variablebatchsize ............................................. [PASS]
14/74 LogSigmoid_transposed ............................................. [PASS]
15/74 ClassNLLCriterionMultipleTarget ................................... [WAIT]THClReduceAll.cl build log:
"/tmp/OCL4QUoFd.cl", line 51: warning: function "IndexToOffset_999_get" was declared but never referenced
static inline unsigned int IndexToOffset_999_get(unsigned int linearId, global const TensorInfoCl *info) {
^

"/tmp/OCL4QUoFd.cl", line 66: warning: function "getLinearBlockId" was declared but never referenced
static inline unsigned int getLinearBlockId() {
^

THClReduceAll.cl build log:
"/tmp/OCLATvaVr.cl", line 9: warning: variable "in1" was declared but never referenced
float *in1 = &_in1;
^

"/tmp/OCLATvaVr.cl", line 10: warning: variable "out" was declared but never referenced
float *out = &_out;
^

"/tmp/OCLATvaVr.cl", line 51: warning: function "IndexToOffset_999_get" was declared but never referenced
static inline unsigned int IndexToOffset_999_get(unsigned int linearId, global const TensorInfoCl *info) {
^

"/tmp/OCLATvaVr.cl", line 66: warning: function "getLinearBlockId" was declared but never referenced
static inline unsigned int getLinearBlockId() {
^

15/74 ClassNLLCriterionMultipleTarget ................................... [PASS]
16/74 SoftMax_forward ................................................... [PASS]
17/74 LogSoftMax_forward ................................................ [WAIT]THClReduce.cl build log:
"/tmp/OCLwBOghI.cl", line 48: warning: function "IndexToOffset_999_get" was declared but never referenced
static inline int IndexToOffset_999_get(int linearId, global const TensorInfoCl *info) {
^

17/74 LogSoftMax_forward ................................................ [PASS]
18/74 Tanh_forward ...................................................... [PASS]
19/74 CMul_forward_batch ................................................ [PASS]
20/74 Threshold_backward ................................................ [PASS]
21/74 mse ............................................................... [PASS]
22/74 SpatialAveragePooling_backward_batch .............................. [PASS]
23/74 ELU_forward ....................................................... [PASS]
24/74 Square_backward ................................................... [PASS]
25/74 SpatialMaxPooling_forward_batch_ceil .............................. [PASS]
26/74 LogSigmoid_backward ............................................... [PASS]
27/74 SpatialMaxPooling_backward_batch_ceil ............................. [PASS]
28/74 Sqrt_transposed ................................................... [PASS]
29/74 LookupTable_forward ............................................... [PASS]
30/74 ClassNLLCriterionSingleTargetScalar ............................... [PASS]
31/74 SpatialConvolutionMM_forward_single_vgglayer13 .................... [WAIT]Segmentation fault (core dumped)
stl@stl-lab2:~$

Some of the other tests produced warnings, but none of them failed.
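In a long log like this, the compiler warnings can hide the test that actually crashed. As a rough sketch only (not part of clnn or Torch), a small Python script can list the tests that printed `[WAIT]` but never reported a final status. The function name `unfinished_tests` is hypothetical, and the script assumes status lines shaped like the ones in the output above: `N/TOTAL name ..... [STATUS]`.

```python
import re

# Matches a status line such as
# "31/74 SpatialConvolutionMM_forward_single_vgglayer13 .... [WAIT]".
# Assumption: any bracketed upper-case word other than WAIT is a final status.
TEST_LINE = re.compile(r"(\d+)/(\d+)\s+(\S+)\s+\.+\s+\[([A-Z]+)\]")


def unfinished_tests(log_text):
    """Return (index, name) for tests that printed [WAIT] but no final status."""
    status = {}
    for match in TEST_LINE.finditer(log_text):
        index, _total, name, state = match.groups()
        key = (int(index), name)
        if state == "WAIT":
            # Record the test as started unless it already has a final status.
            status.setdefault(key, "WAIT")
        else:
            status[key] = state
    return [key for key, state in status.items() if state == "WAIT"]


# Made-up excerpt in the same shape as the log above.
sample = (
    "13/74 mse_variablebatchsize ... [WAIT]THClReduceAll.cl build log:\n"
    "13/74 mse_variablebatchsize ... [PASS]\n"
    "31/74 SpatialConvolutionMM_forward_single_vgglayer13 ... [WAIT]"
    "Segmentation fault (core dumped)\n"
)
print(unfinished_tests(sample))
```

Test 13 is not reported because its `[WAIT]` is followed by a `[PASS]`; only test 31, which never got past `[WAIT]`, is returned.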

Best regards,

Steve

sawtl commented 7 years ago

Hi,

I redid the install the same way and retested with my old Radeon 5970 card, and all tests pass! Maybe it was a tricky configuration issue with the R9 280X I tested with at first. I am closing the issue. Thanks for everything, Torch-cl is great!

hughperkins commented 7 years ago

Great! :)