Open tylerlindell opened 7 years ago
I don't remember this symbol; it might not be implemented. Can you grep through the cltorch
source code and see if it exists? (grep -r stdall *)
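A recursive grep from the top of the checkout also matches object files and build directories, which makes the result hard to read. The following is a minimal sketch of a narrower search that only looks at C/C++ sources and headers and prints line numbers. The helper name find_in_sources is hypothetical, and the directory in the usage comment is an assumption about where the cltorch sources sit in the checkout.

```shell
# find_in_sources (hypothetical helper): recursive search restricted to
# C/C++ sources and headers, printing file name and line number per match.
find_in_sources() {
    pattern=$1
    dir=$2
    grep -rn --include='*.c' --include='*.cpp' --include='*.h' -- "$pattern" "$dir"
}

# Usage from the top of the torch-cl checkout (directory is an assumption):
#   find_in_sources THClTensor_stdall opencl/cltorch/src
```

With line numbers in the output, it is easier to open each hit and see whether it is a declaration, a call, or a definition.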
On 10 April 2017 07:10:51 CEST, TylerLindell notifications@github.com wrote:
i'm getting the following error when using
trainset.data[{ {}, {i}, {}, {} }]:div(stdv[i]) -- std scaling
dyld: lazy symbol binding failed: Symbol not found: _THClTensor_stdall
  Referenced from: ~/torch-cl/install/lib/lua/5.1/libcltorch.so
  Expected in: flat namespace
dyld: Symbol not found: _THClTensor_stdall
  Referenced from: ~/torch-cl/install/lib/lua/5.1/libcltorch.so
  Expected in: flat namespace
Trace/BPT trap: 5
the code i'm using is here:
--/////////////////////////////////////////////////////////////////////////////
require 'torch'
require 'nn'
--/////////////////////////////////////////////////////////////////////////////
require 'cltorch'
require 'clnn'
-- require 'cunn';
--/////////////////////////////////////////////////////////////////////////////
require 'paths'
if (not paths.filep("cifar10torchsmall.zip")) then
    os.execute('wget -c https://s3.amazonaws.com/torch7/data/cifar10torchsmall.zip')
    os.execute('unzip cifar10torchsmall.zip')
end
trainset = torch.load('cifar10-train.t7')
testset = torch.load('cifar10-test.t7')
classes = {'airplane', 'automobile', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck'}
--/////////////////////////////////////////////////////////////////////////////
print(trainset)
print(#trainset.data)
--/////////////////////////////////////////////////////////////////////////////
-- itorch.image(trainset.data[100]) -- display the 100-th image in dataset
print(classes[trainset.label[100]])
--/////////////////////////////////////////////////////////////////////////////
-- ignore setmetatable for now, it is a feature beyond the scope of this tutorial. It sets the index operator.
setmetatable(trainset,
    {__index = function(t, i)
                    return {t.data[i], t.label[i]}
                end}
);
trainset.data = trainset.data:double() -- convert the data from a ByteTensor to a DoubleTensor.
trainset.data = trainset.data:cl()
trainset.label = trainset.label:cl()
-- trainset.data = trainset.data:cuda()
-- trainset.label = trainset.label:cuda()

function trainset:size()
    return self.data:size(1)
end
--/////////////////////////////////////////////////////////////////////////////
print(trainset:size()) -- just to test
--/////////////////////////////////////////////////////////////////////////////
print(trainset[33]) -- load sample number 33.
-- itorch.image(trainset[33][1])
--/////////////////////////////////////////////////////////////////////////////
redChannel = trainset.data[{ {}, {1}, {}, {} }] -- this picks {all images, 1st channel, all vertical pixels, all horizontal pixels}
--/////////////////////////////////////////////////////////////////////////////
print(#redChannel)
--/////////////////////////////////////////////////////////////////////////////
mean = {} -- store the mean, to normalize the test set in the future
stdv = {} -- store the standard-deviation for the future
for i=1,3 do -- over each image channel
    mean[i] = trainset.data[{ {}, {i}, {}, {} }]:mean() -- mean estimation
    print('Channel ' .. i .. ', Mean: ' .. mean[i])
    trainset.data[{ {}, {i}, {}, {} }]:add(-mean[i]) -- mean subtraction

    stdv[i] = trainset.data[{ {}, {i}, {}, {} }]:std() -- std estimation
    print('Channel ' .. i .. ', Standard Deviation: ' .. stdv[i])
    trainset.data[{ {}, {i}, {}, {} }]:div(stdv[i]) -- std scaling
end
--/////////////////////////////////////////////////////////////////////////////
net = nn.Sequential()
net = net:cl()
-- net = net:cuda()
net:add(nn.SpatialConvolution(3, 6, 5, 5)) -- 3 input image channels, 6 output channels, 5x5 convolution kernel
net:add(nn.ReLU()) -- non-linearity
net:add(nn.SpatialMaxPooling(2,2,2,2)) -- A max-pooling operation that looks at 2x2 windows and finds the max.
net:add(nn.SpatialConvolution(6, 16, 5, 5))
net:add(nn.ReLU()) -- non-linearity
net:add(nn.SpatialMaxPooling(2,2,2,2))
net:add(nn.View(16*5*5)) -- reshapes from a 3D tensor of 16x5x5 into 1D tensor of 16*5*5
net:add(nn.Linear(16*5*5, 120)) -- fully connected layer (matrix multiplication between input and weights)
net:add(nn.ReLU()) -- non-linearity
net:add(nn.Linear(120, 84))
net:add(nn.ReLU()) -- non-linearity
net:add(nn.Linear(84, 10)) -- 10 is the number of outputs of the network (in this case, 10 digits)
net:add(nn.LogSoftMax()) -- converts the output to a log-probability. Useful for classification problems
--/////////////////////////////////////////////////////////////////////////////
criterion = nn.ClassNLLCriterion()
criterion = criterion:cl()
-- criterion = criterion:cuda()
--/////////////////////////////////////////////////////////////////////////////
trainer = nn.StochasticGradient(net, criterion)
trainer.learningRate = 0.001
trainer.maxIteration = 5 -- just do 5 epochs of training.
--/////////////////////////////////////////////////////////////////////////////
trainer:train(trainset)
--/////////////////////////////////////////////////////////////////////////////
print(classes[testset.label[100]])
-- itorch.image(testset.data[100])
--/////////////////////////////////////////////////////////////////////////////
testset.data = testset.data:double() -- convert from Byte tensor to Double tensor
for i=1,3 do -- over each image channel
    testset.data[{ {}, {i}, {}, {} }]:add(-mean[i]) -- mean subtraction
    testset.data[{ {}, {i}, {}, {} }]:div(stdv[i]) -- std scaling
end
--/////////////////////////////////////////////////////////////////////////////
-- for fun, print the mean and standard-deviation of example-100
horse = testset.data[100]
print(horse:mean(), horse:std())
--/////////////////////////////////////////////////////////////////////////////
print(classes[testset.label[100]])
-- itorch.image(testset.data[100])
predicted = net:forward(testset.data[100])
--/////////////////////////////////////////////////////////////////////////////
-- the output of the network is Log-Probabilities. To convert them to probabilities, you have to take e^x
print(predicted:exp())
--/////////////////////////////////////////////////////////////////////////////
for i=1,predicted:size(1) do
    print(classes[i], predicted[i])
end
--/////////////////////////////////////////////////////////////////////////////
correct = 0
for i=1,10000 do
    local groundtruth = testset.label[i]
    local prediction = net:forward(testset.data[i])
    local confidences, indices = torch.sort(prediction, true) -- true means sort in descending order
    if groundtruth == indices[1] then
        correct = correct + 1
    end
end
--/////////////////////////////////////////////////////////////////////////////
print(correct, 100*correct/10000 .. ' % ')
--/////////////////////////////////////////////////////////////////////////////
class_performance = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}
for i=1,10000 do
    local groundtruth = testset.label[i]
    local prediction = net:forward(testset.data[i])
    local confidences, indices = torch.sort(prediction, true) -- true means sort in descending order
    if groundtruth == indices[1] then
        class_performance[groundtruth] = class_performance[groundtruth] + 1
    end
end
--/////////////////////////////////////////////////////////////////////////////
for i=1,#classes do
    print(classes[i], 100*class_performance[i]/1000 .. ' %')
end
-- You are receiving this because you are subscribed to this thread. Reply to this email directly or view it on GitHub: https://github.com/hughperkins/distro-cl/issues/27
Thank you @hughperkins, here is what was returned after running that command
Binary file extra/cutorch/build/CMakeFiles/cutorch.dir/TensorMath.c.o matches
Binary file extra/cutorch/build/lib/THC/CMakeFiles/THC.dir/THC_generated_THCTensorMath2.cu.o matches
Binary file extra/cutorch/build/lib/THC/libTHC.dylib matches
Binary file extra/cutorch/build/libcutorch.so matches
extra/cutorch/build/TensorMath.c:arg2 = THCudaTensor_stdall(default_arg1,arg1);
extra/cutorch/build/TensorMath.c:arg2 = THCudaTensor_stdall(default_arg1,arg1);
extra/cutorch/lib/THC/THCTensorMath.h:THC_API float THCudaTensor_stdall(THCState *state, THCudaTensor *self);
extra/cutorch/lib/THC/THCTensorMath2.cu:float THCudaTensor_stdall(THCState *state, THCudaTensor *self)
install/include/TH/generic/THTensorMath.c:accreal THTensor_(stdall)(THTensor *tensor)
install/include/TH/generic/THTensorMath.h:TH_API accreal THTensor_(stdall)(THTensor *self);
install/include/THC/THCTensorMath.h:THC_API float THCudaTensor_stdall(THCState *state, THCudaTensor *self);
install/include/THCl/THClTensorMath.h:THCL_API float THClTensor_stdall(THClState *state, THClTensor *self);
Binary file install/lib/libTH.dylib matches
Binary file install/lib/libTHC.dylib matches
Binary file install/lib/lua/5.1/libcltorch.so matches
Binary file install/lib/lua/5.1/libcutorch.so matches
Binary file install/lib/lua/5.1/libtorch.so matches
Binary file opencl/cltorch/build/CMakeFiles/cltorch.dir/TensorMath.c.o matches
Binary file opencl/cltorch/build/libcltorch.so matches
opencl/cltorch/build/TensorMath.c:arg2 = THClTensor_stdall(default_arg1,arg1);
opencl/cltorch/build/TensorMath.c:arg2 = THClTensor_stdall(default_arg1,arg1);
opencl/cltorch/src/lib/THClTensorMath.h:THCL_API float THClTensor_stdall(THClState *state, THClTensor *self);
opencl/cltorch/src/lib/THClTensorMath2.cpp:float THClTensor_stdall(THClState *state, THClTensor *self)
Binary file pkg/torch/build/CMakeFiles/torch.dir/TensorMath.c.o matches
Binary file pkg/torch/build/lib/TH/CMakeFiles/TH.dir/THTensor.c.o matches
Binary file pkg/torch/build/lib/TH/libTH.dylib matches
Binary file pkg/torch/build/libtorch.so matches
pkg/torch/build/TensorMath.c:arg2 = THFloatTensor_stdall(arg1);
pkg/torch/build/TensorMath.c:arg2 = THFloatTensor_stdall(arg1);
pkg/torch/build/TensorMath.c:arg2 = THDoubleTensor_stdall(arg1);
pkg/torch/build/TensorMath.c:arg2 = THDoubleTensor_stdall(arg1);
pkg/torch/lib/TH/generic/THTensorMath.c:accreal THTensor_(stdall)(THTensor *tensor)
pkg/torch/lib/TH/generic/THTensorMath.h:TH_API accreal THTensor_(stdall)(THTensor *self);
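The line "Binary file install/lib/lua/5.1/libcltorch.so matches" only says that the name occurs somewhere in the binary; it does not say whether the library defines the function or merely references it. The following is a rough sketch of one way to tell the two apart from the symbol table. The helper name symbol_status is hypothetical, the library path in the usage comment is taken from the error message, and the sketch assumes nm prints its usual layout with a one-letter type column (U meaning undefined) just before the symbol name.

```shell
# symbol_status (hypothetical helper): reads `nm` output on stdin and reports
# whether the given symbol is defined, only referenced (undefined), or absent.
symbol_status() {
    awk -v sym="$1" '
        # The symbol name is the last field and the type letter precedes it.
        # macOS prefixes C symbols with an underscore, so accept both spellings.
        $NF == sym || $NF == ("_" sym) {
            if ($(NF - 1) == "U") is_undef = 1
            else                  is_def = 1
        }
        END {
            if (is_def)        print "defined"
            else if (is_undef) print "undefined"
            else               print "absent"
        }'
}

# Usage against the library named in the error (path assumed from the report):
#   nm -g ~/torch-cl/install/lib/lua/5.1/libcltorch.so | symbol_status THClTensor_stdall
```

A result of "undefined" would mean the library calls the function without providing it, which would be consistent with the dyld message in the report. Since the grep output also lists a CPU implementation (THDoubleTensor_stdall), computing the statistics before calling :cl() may be one way around the problem, though that is untested here.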
OK, looks like there might be an implementation in opencl/cltorch/src/lib/THClTensorMath2.cpp. Can you see if that contains any definitions of THClTensor_stdall? Also, if you run the unit tests, do they work OK for you?
It is commented out in that file, but here is an image of all the places it shows up in ~/torch-cl/
Here are the test results for unit tests:
$ luajit -l torch -e 'torch.test()'
Running 145 tests
1/145 tanh ............................................................ [PASS]
2/145 testCholesky .................................................... [PASS]
3/145 multinomialvector ............................................... [PASS]
4/145 log ............................................................. [PASS]
5/145 sigmoid ......................................................... [PASS]
6/145 permute ......................................................... [PASS]
7/145 cross ........................................................... [PASS]
8/145 gels_reuse ...................................................... [PASS]
9/145 inverse ......................................................... [PASS]
10/145 rangeequalbounds ................................................ [PASS]
11/145 min ............................................................. [PASS]
12/145 prod ............................................................ [PASS]
13/145 gesv_reuse ...................................................... [PASS]
14/145 atan ............................................................ [PASS]
15/145 rangefloat ...................................................... [PASS]
16/145 isSize .......................................................... [PASS]
17/145 eig_noncontig ................................................... [PASS]
18/145 gatherMax ....................................................... [PASS]
19/145 histc ........................................................... [PASS]
20/145 pstrf ........................................................... [PASS]
21/145 kthvalue ........................................................ [PASS]
22/145 multinomialwithreplacement ...................................... [PASS]
23/145 maskedCopy ...................................................... [PASS]
24/145 maskedFill ...................................................... [PASS]
25/145 pow ............................................................. [PASS]
26/145 fxcorr3_fxcorr2_eq .............................................. [PASS]
27/145 isTypeOfInheritance ............................................. [PASS]
28/145 linspace ........................................................ [PASS]
29/145 testheaptracking ................................................ [PASS]
30/145 sum ............................................................. [PASS]
31/145 gels_uniquely_determined ........................................ [PASS]
32/145 allAndAny1 ...................................................... [PASS]
33/145 cmul ............................................................ [PASS]
34/145 trtrs_reuse ..................................................... [PASS]
35/145 topK ............................................................ [PASS]
36/145 newIndex ........................................................ [PASS]
37/145 exp ............................................................. [PASS]
38/145 multinomialwithoutreplacement ................................... [PASS]
39/145 mm .............................................................. [PASS]
40/145 sortDescending .................................................. [PASS]
41/145 triu ............................................................ [PASS]
42/145 repeatTensor .................................................... [PASS]
43/145 isTensor ........................................................ [PASS]
44/145 mul ............................................................. [PASS]
45/145 sqrt ............................................................ [PASS]
46/145 floor ........................................................... [PASS]
47/145 elementSize ..................................................... [PASS]
48/145 rangedouble ..................................................... [PASS]
49/145 csub ............................................................ [PASS]
50/145 gesv ............................................................ [PASS]
51/145 cos ............................................................. [PASS]
52/145 index ........................................................... [PASS]
53/145 gels_underdetermined ............................................ [PASS]
54/145 add ............................................................. [PASS]
55/145 conv3 ........................................................... [PASS]
56/145 gels_overdetermined ............................................. [PASS]
57/145 tril ............................................................ [PASS]
58/145 maskedSelect .................................................... [PASS]
59/145 renorm .......................................................... [PASS]
60/145 eig_reuse ....................................................... [PASS]
61/145 addbmm .......................................................... [PASS]
62/145 sin_2 ........................................................... [PASS]
63/145 symeig_noncontig ................................................ [PASS]
64/145 clamp ........................................................... [PASS]
65/145 logical ......................................................... [PASS]
66/145 cmax ............................................................ [PASS]
67/145 median .......................................................... [PASS]
68/145 cosh ............................................................ [PASS]
69/145 max ............................................................. [PASS]
70/145 csub_scalar ..................................................... [PASS]
71/145 xcorr3_xcorr2_eq ................................................ [PASS]
72/145 scatterFill ..................................................... [PASS]
73/145 eig ............................................................. [PASS]
74/145 classNoModule ................................................... [PASS]
75/145 mod ............................................................. [PASS]
76/145 bmm ............................................................. [PASS]
77/145 svd_reuse ....................................................... [PASS]
78/145 randperm ........................................................ [PASS]
79/145 classInModule ................................................... [PASS]
80/145 nonzero ......................................................... [PASS]
81/145 testBoxMullerState .............................................. [PASS]
82/145 dot ............................................................. [PASS]
83/145 allAndAny2 ...................................................... [PASS]
84/145 trtrs ........................................................... [PASS]
85/145 storageview ..................................................... [PASS]
86/145 rand ............................................................ [PASS]
87/145 zeros ........................................................... [PASS]
88/145 potrs ........................................................... [PASS]
89/145 randn ........................................................... [PASS]
90/145 sinh ............................................................ [PASS]
91/145 abs ............................................................. [PASS]
92/145 sortAscending ................................................... [PASS]
93/145 cinv ............................................................ [PASS]
94/145 indexCopy ....................................................... [PASS]
95/145 cpow ............................................................ [PASS]
96/145 neg ............................................................. [PASS]
97/145 scatter ......................................................... [PASS]
98/145 asin ............................................................ [PASS]
99/145 catArray ........................................................ [PASS]
100/145 RNGStateAliasing ................................................ [PASS]
101/145 ones ............................................................ [PASS]
102/145 div ............................................................. [PASS]
103/145 sin ............................................................. [PASS]
104/145 type ............................................................ [PASS]
105/145 baddbmm ......................................................... [PASS]
106/145 conv2 ........................................................... [PASS]
107/145 mode ............................................................ [PASS]
108/145 svd_noncontig ................................................... [PASS]
109/145 isSameSizeAs .................................................... [PASS]
110/145 ceil ............................................................ [PASS]
111/145 conv3_conv2_eq .................................................. [PASS]
112/145 isTypeOfComposite ............................................... [PASS]
113/145 totable ......................................................... [PASS]
114/145 svd ............................................................. [PASS]
115/145 isStorage ....................................................... [PASS]
116/145 logspace ........................................................ [PASS]
117/145 isTypeOfPartial ................................................. [PASS]
118/145 isSetTo ......................................................... [PASS]
119/145 tan ............................................................. [PASS]
120/145 serialize ....................................................... [PASS]
121/145 RNGState ........................................................ [PASS]
122/145 cumprod ......................................................... [PASS]
123/145 potri ........................................................... [PASS]
124/145 eye ............................................................. [PASS]
125/145 chunk ........................................................... [PASS]
126/145 split ........................................................... [PASS]
127/145 gather .......................................................... [PASS]
128/145 acos ............................................................ [PASS]
129/145 cmin ............................................................ [PASS]
130/145 testNumel ....................................................... [PASS]
131/145 expand .......................................................... [PASS]
132/145 indexAdd ........................................................ [PASS]
133/145 view ............................................................ [PASS]
134/145 reshape ......................................................... [PASS]
135/145 mv .............................................................. [PASS]
136/145 cumsum .......................................................... [PASS]
137/145 diag ............................................................ [PASS]
138/145 cat ............................................................. [PASS]
139/145 round ........................................................... [PASS]
140/145 range ........................................................... [PASS]
141/145 cdiv ............................................................ [PASS]
142/145 fconv3_fconv2_eq ................................................ [PASS]
143/145 test_symeig ..................................................... [PASS]
144/145 cmod ............................................................ [PASS]
145/145 rangenegative ................................................... [PASS]
Completed 1120 asserts in 145 tests with 0 failures and 0 errors
$ luajit -l nn -e 'nn.test()'
Seed: 1491880542
Running 145 tests
1/145 VolumetricMaxUnpooling .......................................... [PASS]
2/145 ConcatTable ..................................................... [PASS]
3/145 SpatialAveragePooling ........................................... [PASS]
4/145 Module_getParameters_8 .......................................... [PASS]
5/145 tostringnnSpatialZeroPadding .................................... [PASS]
6/145 BCECriterion .................................................... [PASS]
7/145 ELUIP ........................................................... [PASS]
8/145 SparseLinear .................................................... [PASS]
9/145 SpatialCrossMapLRN .............................................. [PASS]
10/145 VolumetricConvolutionBatchCompare ............................... [PASS]
11/145 PairwiseDistance ................................................ [PASS]
12/145 WeightedMSECriterion ............................................ [PASS]
13/145 SelectTable ..................................................... [PASS]
14/145 SpatialLPPooling ................................................ [PASS]
15/145 SpatialDropoutBatch ............................................. [PASS]
16/145 MixtureTable .................................................... [PASS]
17/145 SpatialFullConvolutionMap ....................................... [PASS]
18/145 Module_getParameters_5 .......................................... [PASS]
19/145 Min ............................................................. [PASS]
20/145 Exp ............................................................. [PASS]
21/145 Add ............................................................. [PASS]
22/145 Module_listModules .............................................. [PASS]
23/145 SpatialConvolutionLocal ......................................... [PASS]
24/145 BatchNormalization .............................................. [PASS]
25/145 MultiCriterion .................................................. [PASS]
26/145 Module_apply .................................................... [PASS]
27/145 Max ............................................................. [PASS]
28/145 MulConstant ..................................................... [PASS]
29/145 NarrowTable ..................................................... [PASS]
30/145 View ............................................................ [PASS]
31/145 VolumetricConvolution ........................................... [PASS]
32/145 tostringnnReshape ............................................... [PASS]
33/145 SpatialSubSampling .............................................. [PASS]
34/145 HardTanh ........................................................ [PASS]
35/145 DistKLDivCriterion .............................................. [PASS]
36/145 SplitTable ...................................................... [PASS]
37/145 DotProduct ...................................................... [PASS]
38/145 HingeEmbeddingCriterion ......................................... [PASS]
39/145 SpatialBatchNormalization ....................................... [PASS]
40/145 DepthConcat ..................................................... [PASS]
41/145 Sigmoid ......................................................... [PASS]
42/145 SpatialAdaptiveMaxPooling ....................................... [PASS]
43/145 Parallel ........................................................ [PASS]
44/145 SoftShrink ...................................................... [PASS]
45/145 Module_getParameters_1 .......................................... [PASS]
46/145 Log ............................................................. [PASS]
47/145 SpatialDropout .................................................. [PASS]
48/145 LeakyReLU ....................................................... [PASS]
49/145 VolumetricMaxPooling ............................................ [PASS]
50/145 Linear .......................................................... [PASS]
51/145 Module_getParameters_12 ......................................... [PASS]
52/145 Euclidean ....................................................... [PASS]
53/145 SpatialMaxPooling ............................................... [PASS]
54/145 MultiMarginCriterion ............................................ [PASS]
55/145 LogSoftmax ...................................................... [PASS]
56/145 ELU ............................................................. [PASS]
57/145 Softmax ......................................................... [PASS]
58/145 LogSigmoid ...................................................... [PASS]
59/145 Copy ............................................................ [PASS]
60/145 VolumetricAveragePooling ........................................ [PASS]
61/145 SpatialContrastiveNormalization ................................. [PASS]
62/145 Bilinear ........................................................ [PASS]
63/145 Softmin ......................................................... [PASS]
64/145 Padding ......................................................... [PASS]
65/145 Module_getParameters_2 .......................................... [PASS]
66/145 VolumetricFullConvolution_simple_test ........................... [PASS]
67/145 MarginRankingCriterion .......................................... [PASS]
68/145 VolumetricFullConvolution ....................................... [PASS]
69/145 CrossEntropyCriterion ........................................... [PASS]
70/145 SpatialSubtractiveNormalization_1dkernel ........................ [PASS]
71/145 SpatialSoftMax .................................................. [PASS]
72/145 HardShrink ...................................................... [PASS]
73/145 SpatialSubSamplingBatchCompare .................................. [PASS]
74/145 Abs ............................................................. [PASS]
75/145 Softsign ........................................................ [PASS]
76/145 WeightedEuclidean ............................................... [PASS]
77/145 addSingletonDimension ........................................... [PASS]
78/145 Module_getParameters_10 ......................................... [PASS]
79/145 L1Cost .......................................................... [PASS]
80/145 PReLU ........................................................... [PASS]
81/145 JoinTable ....................................................... [PASS]
82/145 SpatialFullConvolutionCompare ................................... [PASS]
83/145 CMul ............................................................ [PASS]
84/145 CosineDistance .................................................. [PASS]
85/145 Index ........................................................... [PASS]
86/145 Mean ............................................................ [PASS]
87/145 SpatialConvolutionMM ............................................ [PASS]
88/145 Dropout ......................................................... [PASS]
89/145 BatchMMTransposeA ............................................... [PASS]
90/145 SoftPlus ........................................................ [PASS]
91/145 TemporalConvolution ............................................. [PASS]
92/145 Module_getParameters_11 ......................................... [PASS]
93/145 ParallelCriterion ............................................... [PASS]
94/145 SmoothL1Criterion ............................................... [PASS]
95/145 L1Penalty ....................................................... [PASS]
96/145 LookupTable ..................................................... [PASS]
97/145 SpatialMaxUnpooling ............................................. [PASS]
98/145 Sqrt ............................................................ [PASS]
99/145 LeakyReLUIP ..................................................... [PASS]
100/145 Module_getParameters_6 .......................................... [PASS]
101/145 FlattenTable .................................................... [PASS]
102/145 Square .......................................................... [PASS]
103/145 Module_getParameters_4 .......................................... [PASS]
104/145 SpatialDivisiveNormalization_1dkernel ........................... [PASS]
105/145 AddConstant ..................................................... [PASS]
106/145 BatchMMTransposeB ............................................... [PASS]
107/145 BatchMMNoTranspose .............................................. [PASS]
108/145 SpatialConvolutionBatchCompare .................................. [PASS]
109/145 Cosine .......................................................... [PASS]
110/145 Clamp ........................................................... [PASS]
111/145 VolumetricMaxPooling_boundary ................................... [PASS]
112/145 Power ........................................................... [PASS]
113/145 tostringnnLinear ................................................ [PASS]
114/145 TemporalMaxPooling .............................................. [PASS]
115/145 SpatialUpSamplingNearest ........................................ [PASS]
116/145 Sum ............................................................. [PASS]
117/145 Typecast ........................................................ [PASS]
118/145 Tanh ............................................................ [PASS]
119/145 Module_getParameters_3 .......................................... [PASS]
120/145 Threshold ....................................................... [PASS]
121/145 ParallelTable ................................................... [PASS]
122/145 SpatialFractionalMaxPooling_Ratio ............................... [PASS]
123/145 Module_getParameters_7 .......................................... [PASS]
124/145 ClassNLLCriterion ............................................... [PASS]
125/145 Select .......................................................... [PASS]
126/145 BatchMMTransposeBoth ............................................ [PASS]
127/145 SpatialFullConvolutionBatchCompare .............................. [PASS]
128/145 Normalize ....................................................... [PASS]
129/145 SpatialConvolution .............................................. [PASS]
130/145 GradientReversal ................................................ [PASS]
131/145 SpatialConvolutionMap ........................................... [PASS]
132/145 SpatialDivisiveNormalization_2dkernel ........................... [PASS]
133/145 Replicate ....................................................... [PASS]
134/145 CosineEmbeddingCriterion ........................................ [PASS]
135/145 MM .............................................................. [PASS]
136/145 SpatialFullConvolution .......................................... [PASS]
137/145 ReLU ............................................................ [PASS]
138/145 RReLU ........................................................... [PASS]
139/145 Reshape ......................................................... [PASS]
140/145 SpatialSubtractiveNormalization_2dkernel ........................ [PASS]
141/145 MSECriterion .................................................... [PASS]
142/145 MarginCriterion ................................................. [PASS]
143/145 Mul ............................................................. [PASS]
144/145 TemporalSubSampling ............................................. [PASS]
145/145 SpatialFractionalMaxPooling ..................................... [PASS]
Completed 2476 asserts in 145 tests with 0 failures and 0 errors and 1 warning
--------------------------------------------------------------------------------
Should use TestSuite rather than plain lua table
$ luajit -l cltorch -e 'cltorch.test()'
running tests...
aftter requiring cltorch.unit_storage
Running 2 tests
1/2 test_get ............................................................ [WAIT]
Using Apple , OpenCL platform: Apple
Using OpenCL device: Iris
1/2 test_get ............................................................ [PASS]
2/2 test_basic .......................................................... [WAIT]
2/2 test_basic .......................................................... [PASS]
Completed 15 asserts in 2 tests with 0 failures and 0 errors
#tester.errors 0
res true
aftter requiring cltorch.unit_tensor
Running 117 tests
1/117 outplace_div .................................................... [WAIT]
1/117 outplace_div .................................................... [PASS]
2/117 test_addcmul .................................................... [WAIT]
2/117 test_addcmul .................................................... [PASS]
3/117 outplace_tanh ................................................... [WAIT]
3/117 outplace_tanh ................................................... [PASS]
4/117 outplace_pow .................................................... [WAIT]
4/117 outplace_pow .................................................... [PASS]
5/117 inplace_tanh .................................................... [WAIT]
5/117 inplace_tanh .................................................... [PASS]
6/117 test_scatterFill ................................................ [WAIT]
6/117 test_scatterFill ................................................ [PASS]
7/117 inplace_acos .................................................... [WAIT]
7/117 inplace_acos .................................................... [PASS]
8/117 outplace_cpow ................................................... [WAIT]
8/117 outplace_cpow ................................................... [PASS]
9/117 inplace_atan .................................................... [WAIT]
9/117 inplace_atan .................................................... [PASS]
10/117 inplace_le ...................................................... [WAIT]
10/117 inplace_le ...................................................... [PASS]
11/117 test_equals ..................................................... [WAIT]
11/117 test_equals ..................................................... [PASS]
12/117 self_lt ......................................................... [WAIT]
12/117 self_lt ......................................................... [PASS]
13/117 inplace_round ................................................... [WAIT]
13/117 inplace_round ................................................... [PASS]
14/117 test_matrixwide ................................................. [WAIT]
14/117 test_matrixwide ................................................. [PASS]
15/117 inplace_sqrt .................................................... [WAIT]
15/117 inplace_sqrt .................................................... [PASS]
16/117 test_max2 ....................................................... [WAIT]
16/117 test_max2 ....................................................... [PASS]
17/117 test_prod ....................................................... [WAIT]
17/117 test_prod ....................................................... [PASS]
18/117 test_scatter .................................................... [WAIT]
18/117 test_scatter .................................................... [PASS]
19/117 inplace_cinv .................................................... [WAIT]
19/117 inplace_cinv .................................................... [PASS]
20/117 outplace_sin .................................................... [WAIT]
20/117 outplace_sin .................................................... [PASS]
21/117 outplace_ge ..................................................... [WAIT]
21/117 outplace_ge ..................................................... [PASS]
22/117 outplace_add .................................................... [WAIT]
22/117 outplace_add .................................................... [PASS]
23/117 test_basic ...................................................... [WAIT]
23/117 test_basic ...................................................... [PASS]
24/117 test_sub ........................................................ [WAIT]
24/117 test_sub ........................................................ [PASS]
25/117 outplace_cdiv ................................................... [WAIT]
25/117 outplace_cdiv ................................................... [PASS]
26/117 inplace_log ..................................................... [WAIT]
26/117 inplace_log ..................................................... [PASS]
27/117 test_reduceAll .................................................. [WAIT]
THClReduceAll.cl build log:
<program source>:9:10: warning: unused variable 'in1'
float *in1 = &_in1;
^
<program source>:10:10: warning: unused variable 'out'
float *out = &_out;
^
27/117 test_reduceAll .................................................. [PASS]
28/117 inplace_atan2 ................................................... [WAIT]
28/117 inplace_atan2 ................................................... [PASS]
29/117 test_intpower ................................................... [WAIT]
29/117 test_intpower ................................................... [PASS]
30/117 outplace_mul .................................................... [WAIT]
30/117 outplace_mul .................................................... [PASS]
31/117 operator_div_scalar ............................................. [WAIT]
31/117 operator_div_scalar ............................................. [PASS]
32/117 test_addcdivshape ............................................... [WAIT]
32/117 test_addcdivshape ............................................... [PASS]
33/117 test_min1 ....................................................... [WAIT]
33/117 test_min1 ....................................................... [PASS]
34/117 test_norm ....................................................... [WAIT]
34/117 test_norm ....................................................... [PASS]
35/117 self_eq ......................................................... [WAIT]
35/117 self_eq ......................................................... [PASS]
36/117 operator_plus ................................................... [WAIT]
36/117 operator_plus ................................................... [PASS]
37/117 inplace_cos ..................................................... [WAIT]
37/117 inplace_cos ..................................................... [PASS]
38/117 outplace_log .................................................... [WAIT]
38/117 outplace_log .................................................... [PASS]
39/117 outplace_asin ................................................... [WAIT]
39/117 outplace_asin ................................................... [PASS]
40/117 outplace_eq ..................................................... [WAIT]
40/117 outplace_eq ..................................................... [PASS]
41/117 outplace_gt ..................................................... [WAIT]
41/117 outplace_gt ..................................................... [PASS]
42/117 inplace_exp ..................................................... [WAIT]
42/117 inplace_exp ..................................................... [PASS]
43/117 test_gather_t ................................................... [WAIT]
43/117 test_gather_t ................................................... [PASS]
44/117 test_apply_on_gpu ............................................... [WAIT]
44/117 test_apply_on_gpu ............................................... [PASS]
45/117 operator_sub_scalar ............................................. [WAIT]
45/117 operator_sub_scalar ............................................. [PASS]
46/117 inplace_lt ...................................................... [WAIT]
46/117 inplace_lt ...................................................... [PASS]
47/117 test_get ........................................................ [WAIT]
47/117 test_get ........................................................ [PASS]
48/117 operator_plus_scalar ............................................ [WAIT]
48/117 operator_plus_scalar ............................................ [PASS]
49/117 inplace_cdiv .................................................... [WAIT]
49/117 inplace_cdiv .................................................... [PASS]
50/117 inplace_sin ..................................................... [WAIT]
50/117 inplace_sin ..................................................... [PASS]
51/117 test_sum_t ...................................................... [WAIT]
51/117 test_sum_t ...................................................... [PASS]
52/117 test_sumall ..................................................... [WAIT]
52/117 test_sumall ..................................................... [PASS]
53/117 test_gather_narrowed ............................................ [WAIT]
new wrapper, size 4
new wrapper, size 4
53/117 test_gather_narrowed ............................................ [PASS]
54/117 self_ge ......................................................... [WAIT]
54/117 self_ge ......................................................... [PASS]
55/117 operator_mul_scalar ............................................. [WAIT]
55/117 operator_mul_scalar ............................................. [PASS]
56/117 outplace_sigmoid ................................................ [WAIT]
56/117 outplace_sigmoid ................................................ [PASS]
57/117 test_indexfill .................................................. [WAIT]
57/117 test_indexfill .................................................. [PASS]
58/117 outplace_sign ................................................... [WAIT]
58/117 outplace_sign ................................................... [PASS]
59/117 test_cumprod .................................................... [WAIT]
59/117 test_cumprod .................................................... [PASS]
60/117 test_neg ........................................................ [WAIT]
60/117 test_neg ........................................................ [PASS]
61/117 test_mean ....................................................... [WAIT]
61/117 test_mean ....................................................... [PASS]
62/117 test_gather ..................................................... [WAIT]
62/117 test_gather ..................................................... [PASS]
63/117 test_sum ........................................................ [WAIT]
63/117 test_sum ........................................................ [PASS]
64/117 inplace_gt ...................................................... [WAIT]
64/117 inplace_gt ...................................................... [PASS]
65/117 test_cmin ....................................................... [WAIT]
65/117 test_cmin ....................................................... [PASS]
66/117 test_perelement ................................................. [WAIT]
66/117 test_perelement ................................................. [PASS]
67/117 test_min2 ....................................................... [WAIT]
67/117 test_min2 ....................................................... [PASS]
68/117 test_max1 ....................................................... [WAIT]
68/117 test_max1 ....................................................... [PASS]
69/117 self_ne ......................................................... [WAIT]
69/117 self_ne ......................................................... [PASS]
70/117 outplace_cos .................................................... [WAIT]
70/117 outplace_cos .................................................... [PASS]
71/117 inplace_ge ...................................................... [WAIT]
71/117 inplace_ge ...................................................... [PASS]
72/117 test_indexselect ................................................ [WAIT]
72/117 test_indexselect ................................................ [PASS]
73/117 inplace_add ..................................................... [WAIT]
73/117 inplace_add ..................................................... [PASS]
74/117 test_reshape .................................................... [WAIT]
74/117 test_reshape .................................................... [PASS]
75/117 test_addcdiv .................................................... [WAIT]
75/117 test_addcdiv .................................................... [PASS]
76/117 test_cmul ....................................................... [WAIT]
76/117 test_cmul ....................................................... [PASS]
77/117 test_fills ...................................................... [WAIT]
77/117 test_fills ...................................................... [PASS]
78/117 outplace_acos ................................................... [WAIT]
78/117 outplace_acos ................................................... [PASS]
79/117 inplace_floor ................................................... [WAIT]
79/117 inplace_floor ................................................... [PASS]
80/117 test_maskedSelect ............................................... [WAIT]
80/117 test_maskedSelect ............................................... [PASS]
81/117 test_blas ....................................................... [WAIT]
81/117 test_blas ....................................................... [PASS]
82/117 self_gt ......................................................... [WAIT]
82/117 self_gt ......................................................... [PASS]
83/117 outplace_ceil ................................................... [WAIT]
83/117 outplace_ceil ................................................... [PASS]
84/117 inplace_asin .................................................... [WAIT]
84/117 inplace_asin .................................................... [PASS]
85/117 inplace_sign .................................................... [WAIT]
85/117 inplace_sign .................................................... [PASS]
86/117 operator_sub .................................................... [WAIT]
86/117 operator_sub .................................................... [PASS]
87/117 outplace_abs .................................................... [WAIT]
87/117 outplace_abs .................................................... [PASS]
88/117 test_indexcopy .................................................. [WAIT]
88/117 test_indexcopy .................................................. [PASS]
89/117 outplace_round .................................................. [WAIT]
89/117 outplace_round .................................................. [PASS]
90/117 test_meanall .................................................... [WAIT]
90/117 test_meanall .................................................... [PASS]
91/117 test_cumsum ..................................................... [WAIT]
91/117 test_cumsum ..................................................... [PASS]
92/117 inplace_abs ..................................................... [WAIT]
92/117 inplace_abs ..................................................... [PASS]
93/117 outplace_le ..................................................... [WAIT]
93/117 outplace_le ..................................................... [PASS]
94/117 test_clone ...................................................... [WAIT]
94/117 test_clone ...................................................... [PASS]
95/117 test_map_on_gpu ................................................. [WAIT]
95/117 test_map_on_gpu ................................................. [PASS]
96/117 test_powerofneg ................................................. [WAIT]
96/117 test_powerofneg ................................................. [PASS]
97/117 inplace_cpow .................................................... [WAIT]
97/117 inplace_cpow .................................................... [PASS]
98/117 outplace_exp .................................................... [WAIT]
98/117 outplace_exp .................................................... [PASS]
99/117 outplace_floor .................................................. [WAIT]
99/117 outplace_floor .................................................. [PASS]
100/117 inplace_eq ...................................................... [WAIT]
100/117 inplace_eq ...................................................... [PASS]
101/117 outplace_sqrt ................................................... [WAIT]
101/117 outplace_sqrt ................................................... [PASS]
102/117 outplace_cinv ................................................... [WAIT]
102/117 outplace_cinv ................................................... [PASS]
103/117 test_sumallt .................................................... [WAIT]
103/117 test_sumallt .................................................... [PASS]
104/117 test_sum_t_offset ............................................... [WAIT]
104/117 test_sum_t_offset ............................................... [PASS]
105/117 test_map2_on_gpu ................................................ [WAIT]
105/117 test_map2_on_gpu ................................................ [PASS]
106/117 inplace_ceil .................................................... [WAIT]
106/117 inplace_ceil .................................................... [PASS]
107/117 outplace_ne ..................................................... [WAIT]
107/117 outplace_ne ..................................................... [PASS]
108/117 test_add ........................................................ [WAIT]
108/117 test_add ........................................................ [PASS]
109/117 test_prodall .................................................... [WAIT]
THClReduceAll.cl build log:
<program source>:9:10: warning: unused variable 'in1'
float *in1 = &_in1;
^
<program source>:10:10: warning: unused variable 'out'
float *out = &_out;
^
109/117 test_prodall .................................................... [PASS]
110/117 inplace_cmul .................................................... [WAIT]
110/117 inplace_cmul .................................................... [PASS]
111/117 outplace_lt ..................................................... [WAIT]
111/117 outplace_lt ..................................................... [PASS]
112/117 outplace_atan ................................................... [WAIT]
112/117 outplace_atan ................................................... [PASS]
113/117 inplace_ne ...................................................... [WAIT]
113/117 inplace_ne ...................................................... [PASS]
114/117 inplace_sigmoid ................................................. [WAIT]
114/117 inplace_sigmoid ................................................. [PASS]
115/117 self_le ......................................................... [WAIT]
115/117 self_le ......................................................... [PASS]
116/117 outplace_cmul ................................................... [WAIT]
116/117 outplace_cmul ................................................... [PASS]
117/117 test_save ....................................................... [WAIT]
117/117 test_save ....................................................... [PASS]
Completed 233 asserts in 117 tests with 0 failures and 0 errors and 1 warning
--------------------------------------------------------------------------------
Should use TestSuite rather than plain lua table
--------------------------------------------------------------------------------
all tests finished
$ luajit -l clnn -e 'clnn.test()'
libthclnn_searchpath /Users/tylerlindell/torch-cl/install/lib/lua/5.1/libTHCLNN.so
Running 74 tests
1/74 Square_transposed ................................................. [WAIT]Using Apple , OpenCL platform: Apple
Using OpenCL device: Iris
1/74 Square_transposed ................................................. [PASS]
2/74 TemporalConvolution2_forward ...................................... [PASS]
3/74 SpatialMaxPooling_forward ......................................... [PASS]
4/74 SoftMax_forward_batch ............................................. [PASS]
5/74 Sigmoid_forward ................................................... [PASS]
6/74 ELU_backward ...................................................... [PASS]
7/74 Threshold_forward ................................................. [PASS]
8/74 Threshold_backward_inplace ........................................ [PASS]
9/74 Tanh_transposed ................................................... [PASS]
10/74 SpatialUpSamplingNearest_forward_batch ............................ [WAIT]SpatialUpSamplingNearest.cl build log:
<program source>:3:20: warning: no previous prototype for function 'translate_idx'
/*__device__*/ int translate_idx(int ii, int d1, int d2, int d3, int scale_factor)
^
<program source>:20:20: warning: no previous prototype for function 'translate_idx_inv'
/*__device__*/ int translate_idx_inv(int ii, int d1, int d2, int d3, int scale_factor, int off_x, int off_y)
^
10/74 SpatialUpSamplingNearest_forward_batch ............................ [PASS]
11/74 Sigmoid_transposed ................................................ [PASS]
12/74 ClassNLLCriterionSingleTarget ..................................... [PASS]
13/74 mse_variablebatchsize ............................................. [PASS]
14/74 LogSigmoid_transposed ............................................. [PASS]
15/74 ClassNLLCriterionMultipleTarget ................................... [WAIT]THClReduceAll.cl build log:
<program source>:9:10: warning: unused variable 'in1'
float *in1 = &_in1;
^
<program source>:10:10: warning: unused variable 'out'
float *out = &_out;
^
15/74 ClassNLLCriterionMultipleTarget ................................... [PASS]
16/74 SoftMax_forward ................................................... [PASS]
17/74 LogSoftMax_forward ................................................ [PASS]
18/74 Tanh_forward ...................................................... [PASS]
19/74 CMul_forward_batch ................................................ [PASS]
20/74 Threshold_backward ................................................ [PASS]
21/74 mse ............................................................... [WAIT]Apply_3t_0s_0pt_-2_-2_-2_*out = 0.00043487714720591 * (*in1 - *in2) build log:
<program source>:37:12: warning: double precision constant requires cl_khr_fp64, casting to single precision
*out = 0.00043487714720591 * (*in1 - *in2);
^
21/74 mse ............................................................... [PASS]
22/74 SpatialAveragePooling_backward_batch .............................. [PASS]
23/74 ELU_forward ....................................................... [PASS]
24/74 Square_backward ................................................... [PASS]
25/74 SpatialMaxPooling_forward_batch_ceil .............................. [PASS]
26/74 LogSigmoid_backward ............................................... [PASS]
27/74 SpatialMaxPooling_backward_batch_ceil ............................. [PASS]
28/74 Sqrt_transposed ................................................... [PASS]
29/74 LookupTable_forward ............................................... [PASS]
30/74 ClassNLLCriterionSingleTargetScalar ............................... [PASS]
31/74 SpatialConvolutionMM_forward_single_vgglayer13 .................... [PASS]
32/74 SpatialAveragePooling_backward .................................... [PASS]
33/74 ELU_transposed .................................................... [PASS]
34/74 SpatialConvolutionMM_forward_batch ................................ [PASS]
35/74 Abs_backward ...................................................... [PASS]
36/74 mse_nosizeaverage ................................................. [WAIT]Apply_3t_0s_0pt_-2_-2_-2_*out = 0.00040675208460443 * (*in1 - *in2) build log:
<program source>:37:12: warning: double precision constant requires cl_khr_fp64, casting to single precision
*out = 0.00040675208460443 * (*in1 - *in2);
^
36/74 mse_nosizeaverage ................................................. [PASS]
37/74 Abs_forward ....................................................... [PASS]
38/74 SpatialConvolutionMM_forward_single_padded ........................ [PASS]
39/74 Threshold_transposed .............................................. [PASS]
40/74 LogSoftMax_backward ............................................... [PASS]
41/74 SpatialMaxPooling_backward_ceil ................................... [PASS]
42/74 Sum_backward ...................................................... [PASS]
43/74 Sqrt_backward ..................................................... [PASS]
44/74 Sum_forward ....................................................... [PASS]
45/74 Sqrt_zero ......................................................... [PASS]
46/74 SpatialConvolutionMM_forward_1d_byhand ............................ [PASS]
47/74 LogSigmoid_forward ................................................ [PASS]
48/74 Tanh_backward ..................................................... [PASS]
49/74 Square_forward .................................................... [PASS]
50/74 SpatialAveragePooling_forward_batch_ceil .......................... [PASS]
51/74 SpatialAveragePooling_backward_batch_ceil ......................... [PASS]
52/74 Abs_transposed .................................................... [PASS]
53/74 SoftMax_backward .................................................. [PASS]
54/74 LogSoftMax_backward_batch ......................................... [PASS]
55/74 SpatialUpSamplingNearest_backward ................................. [WAIT]SpatialUpSamplingNearest.cl build log:
<program source>:3:20: warning: no previous prototype for function 'translate_idx'
/*__device__*/ int translate_idx(int ii, int d1, int d2, int d3, int scale_factor)
^
<program source>:20:20: warning: no previous prototype for function 'translate_idx_inv'
/*__device__*/ int translate_idx_inv(int ii, int d1, int d2, int d3, int scale_factor, int off_x, int off_y)
^
55/74 SpatialUpSamplingNearest_backward ................................. [PASS]
56/74 SpatialConvolutionMM_backward_single .............................. [PASS]
57/74 SpatialAveragePooling_forward ..................................... [PASS]
58/74 TemporalConvolution2_backward_gradParams .......................... [WAIT]Apply_3t_0s_0pt_-2_-2_-2_*out = 0.0071428571428571 * (*in1 - *in2) build log:
<program source>:37:12: warning: double precision constant requires cl_khr_fp64, casting to single precision
*out = 0.0071428571428571 * (*in1 - *in2);
^
58/74 TemporalConvolution2_backward_gradParams .......................... [PASS]
59/74 Sigmoid_backward .................................................. [PASS]
60/74 SpatialAveragePooling_backward_ceil ............................... [PASS]
61/74 SpatialMaxPooling_forward_ceil .................................... [PASS]
62/74 Threshold_forward_inplace ......................................... [PASS]
63/74 SpatialMaxPooling_backward_batch .................................. [PASS]
64/74 SoftMax_backward_batch ............................................ [PASS]
65/74 SpatialAveragePooling_forward_batch ............................... [PASS]
66/74 TemporalConvolution2_backward_gradInput ........................... [PASS]
67/74 SpatialMaxPooling_backward ........................................ [PASS]
68/74 SpatialConvolutionMM_backward_batch ............................... [PASS]
69/74 LookupTable_backward .............................................. [WAIT]nDim 97 nInput 10 batch false error 0
nDim 97 nInput 10 batch true error 0
nDim 97 nInput 101 batch false error 0
nDim 97 nInput 101 batch true error 0
nDim 255 nInput 10 batch false error 0
nDim 255 nInput 10 batch true error 0
nDim 255 nInput 101 batch false error 0
nDim 255 nInput 101 batch true error 0
69/74 LookupTable_backward .............................................. [PASS]
70/74 SpatialAveragePooling_forward_ceil ................................ [PASS]
71/74 SpatialMaxPooling_forward_batch ................................... [PASS]
72/74 Sqrt_forward ...................................................... [PASS]
73/74 SpatialConvolutionMM_forward_single ............................... [PASS]
74/74 LogSoftMax_forward_batch .......................................... [PASS]
Completed 122 asserts in 74 tests with 0 failures and 0 errors
Yeah, if it's commented out, then it's not implemented, and someone would need to implement it. The file you took a screenshot of is a header file, with declarations, not the implementation.
On 11 April 2017 05:02:47 CEST, TylerLindell notifications@github.com wrote:
It is commented out in that file, but here is an image of all the places it shows up in ~/torch-cl/:
<img width="1280" alt="screen shot 2017-04-10 at 8 00 13 pm" src="https://cloud.githubusercontent.com/assets/5748461/24891130/a5e278b6-1e28-11e7-9dc0-69de3bc6b14a.png">
Hey Hugh, cltorch is a great piece of software,
but does this missing symbol mean that any standard deviation calculation on torch.ClTensor will fail?
I currently have the same problem when running inputs:std():
luajit: symbol lookup error: /home/user/torch-cl/install/lib/lua/5.1/libcltorch.so: undefined symbol: THClTensor_stdall
Is there any workaround? - standard deviation for me is fundamental :)
Could you copy to the cpu-side, and do standard deviation there?
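For readers who land here, the suggestion above can be sketched roughly as follows. This is a minimal sketch, not cltorch's own API: `cpuStd` is a hypothetical helper name, and it assumes `:float()` is available on the tensor (on a ClTensor it would copy the data from the device to the host). A plain CPU tensor stands in for `inputs` so the snippet runs without an OpenCL device.

```lua
require 'torch'

-- Hypothetical helper (not part of cltorch): compute the standard deviation
-- on the CPU side instead of calling :std() on the ClTensor directly.
local function cpuStd(t)
  local host = t:float()   -- for a ClTensor, this copies device -> host
  return host:std()        -- CPU implementation of std(), which exists
end

-- Stand-in for `inputs`; with cltorch loaded this could be a torch.ClTensor.
local inputs = torch.FloatTensor({1, 2, 3, 4})
print(cpuStd(inputs))
```

The copy costs a device-to-host transfer on every call, so it is probably best kept out of inner training loops, for example by computing the statistics once during preprocessing.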
I did, then this appeared:
libthclnn_searchpath /home/alex/torch-cl/install/lib/lua/5.1/libTHCLNN.so
Using Intel , OpenCL platform: Intel Gen OCL Driver
Using OpenCL device: Intel(R) HD Graphics IvyBridge M GT2
inputs : ClTensor - size: 100x30
targets : ClTensor - size: 100
Apply_3t_0s_0pt_-2_-2_-2_*out = 0.002 * (*in1 - *in2) build log:
stringInput.cl:37:12: warning: double precision constant requires cl_khr_fp64, casting to single precision
/home/alex/torch-cl/install/bin/luajit: /home/alex/torch-cl/install/share/lua/5.1/nn/Linear.lua:75: invalid arguments: ClTensor number number ClTensor ClTensor
expected arguments: *ClTensor~2D* [ClTensor~2D] [float] ClTensor~2D ClTensor~2D | *ClTensor~2D* float [ClTensor~2D] float ClTensor~2D ClTensor~2D
stack traceback:
[C]: in function 'addmm'
/home/alex/torch-cl/install/share/lua/5.1/nn/Linear.lua:75: in function 'updateGradInput'
/home/alex/torch-cl/install/share/lua/5.1/nn/Module.lua:30: in function 'backward'
/home/alex/torch-cl/install/share/lua/5.1/nn/Sequential.lua:84: in function 'backward'
test.lua:190: in function 'opfunc'
/home/alex/torch-cl/install/share/lua/5.1/optim/adam.lua:33: in function 'adam'
When I switch to CPU only, everything runs fine, so maybe installing clblas and recompiling torch against libclblas clcblas would work as a workaround?
Well, you need to divide your network into one part that is on the gpu and one part that is on the cpu. I forget how to do this. I think there should be some module to handle this?
(You might need to make your own module that takes a ClTensor as input and gives a float tensor as output, and vice versa for backprop)
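A bridge module along those lines might look like the sketch below. The class name `nn.ClToFloat` is made up for illustration and is not part of nn or clnn. The forward pass assumes `:float()` works on the incoming tensor, and the backward pass assumes cltorch is loaded so that `:cl()` exists on the gradient.

```lua
require 'nn'

-- Hypothetical bridge module: the name nn.ClToFloat is an assumption of this
-- sketch, not an existing nn or clnn class.
local ClToFloat, parent = torch.class('nn.ClToFloat', 'nn.Module')

function ClToFloat:__init()
  parent.__init(self)
end

-- Forward: copy the incoming (device) tensor to a CPU FloatTensor.
function ClToFloat:updateOutput(input)
  self.output = input:float()
  return self.output
end

-- Backward: copy the CPU gradient back to the device.
-- Requires cltorch to be loaded so that :cl() is defined.
function ClToFloat:updateGradInput(input, gradOutput)
  self.gradInput = gradOutput:cl()
  return self.gradInput
end
```

It would sit between the part of the network kept on the GPU and the part kept on the CPU, for example as one entry in an `nn.Sequential`. nn also ships `nn.Copy(inputType, outputType)`, which may serve the same purpose, though whether it handles `torch.ClTensor` is untested here.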
Thank you. Switching to traininputs:double() and nnoutputs:double() in a few places did help. I'm now testing whether the performance is better than CPU only.
cool :-)
i'm getting the following error when using
trainset.data[{ {}, {i}, {}, {} }]:div(stdv[i]) -- std scaling
the code i'm using is here: