Closed napsternxg closed 8 years ago
Here is the output:
$ th neural_style_opencl.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 0 -backend 'clnn' -output_image profile.png -image_size 25 -model_file models/vgg_normalised.caffemodel -optimizer adam
In Function main
Starting load model
In loadcaffe_load
Successfully loaded models/vgg_normalised.caffemodel
Finished proto to lua
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
Finished iterations clnn
Finished network setup
Using Advanced Micro Devices, Inc. , OpenCL platform: AMD Accelerated Parallel Processing
Using OpenCL device: Turks
Finished content Image preprocess
Finished style Image preprocess
Finished caffe variables
Starting network setup
input:size()
3
25
19
[torch.LongStorage of size 3]
currentOutput:size()
3
25
19
[torch.LongStorage of size 3]
self.modules[ 1 ]= nn.TVLoss
currentOutput:size()
3
25
19
[torch.LongStorage of size 3]
self.modules[ 2 ]= nn.SpatialConvolutionMM(3 -> 64, 3x3, 1,1, 1,1)
Apply_1t_1s_0pt_-2_*out = val1 build log:
"/tmp/OCL19013T5.cl", line 53: warning: variable "thisLinearId" was declared
but never referenced
int thisLinearId;
^
currentOutput:size()
64
25
19
[torch.LongStorage of size 3]
self.modules[ 3 ]= nn.ReLU
Apply_1t_0s_0pt_-2_*out = (*out > 0) ? *out : 0 build log:
"/tmp/OCL19013T19.cl", line 49: warning: variable "thisLinearId" was declared
but never referenced
int thisLinearId;
^
input:size()
64
25
19
[torch.LongStorage of size 3]
currentOutput:size()
64
25
19
[torch.LongStorage of size 3]
self.modules[ 1 ]= nn.View
currentOutput:size()
64
475
[torch.LongStorage of size 2]
self.modules[ 2 ]= nn.ConcatTable {
input
|`-> (1): nn.Identity
|`-> (2): nn.Identity
... -> output
}
/home/username/Downloads/torch/install/bin/luajit: .../Downloads/torch/install/share/lua/5.1/nn/Sequential.lua:45: attempt to call method 'size' (a nil value)
stack traceback:
.../Downloads/torch/install/share/lua/5.1/nn/Sequential.lua:45: in function 'forward'
neural_style_opencl.lua:150: in function 'main'
neural_style_opencl.lua:424: in main chunk
[C]: in function 'dofile'
...oads/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x00406670
(I'm trying to install neural-style by the way, hence the pause in my replies :-P )
Cool. I can replicate the problem on my machine:
self.modules[ 2 ]= nn.ConcatTable {
input
|`-> (1): nn.Identity
|`-> (2): nn.Identity
... -> output
}
/home/user/torch/install/bin/luajit: /home/user/torch/install/share/lua/5.1/nn/Sequential.lua:45: attempt to call method 'size' (a nil value)
stack traceback:
Ah, might be missing the nn.MM module :-P
(Edit: seems like ConcatTable works ok:
a = nn.ConcatTable()
a:add(nn.Linear(3,2))
a:add(nn.Linear(3,2))
A = torch.Tensor(3):uniform()
-- a:forward(A)[1]
-- -0.2971
-- 0.1385
-- [torch.DoubleTensor of size 2]
a:forward(A)[2]
-- similar output
acl = a:clone():cl()
torch.type(acl.modules[1].weight)
-- torch.ClTensor
Acl = A:cl()
acl:forward(Acl)[1]
-- -0.2971
-- 0.1385
-- [torch.ClTensor of size 2]
) (Edit 2: and GramMatrix seems to work ok actually. First revert the Sequential.lua changes, then do:
th
require 'nn'
function GramMatrix()
  local net = nn.Sequential()
  net:add(nn.View(-1):setNumInputDims(2))
  local concat = nn.ConcatTable()
  concat:add(nn.Identity())
  concat:add(nn.Identity())
  net:add(concat)
  net:add(nn.MM(false, true))
  return net
end
g = GramMatrix()
g:forward(torch.Tensor(3,2,4):uniform())
-- works ok
require 'clnn'
gcl = g:clone():cl()
gcl:forward(torch.ClTensor(3,2,4):uniform())
-- works ok
)
Am I missing the nn.MM module?
Ok, the problem is right at the start of the network. Basically, if you put the following at line 261, you can see the network:
print('net', net)
Then, there are lots of layers, but first two are:
(1): nn.TVLoss
(2): nn.SpatialConvolutionMM(3 -> 64, 3x3, 1,1, 1,1)
Now, if you hack the Sequential.lua file with the following:
function Sequential:updateOutput(input)
  if input == nil then
    print('input nil')
  else
    if input.size ~= nil then
      print('input:size()', input:size())
    else
      print('input.size nil')
    end
  end
  local currentOutput = input
  for i=1,#self.modules do
    print('self.modules[', i , ']=', self.modules[i])
    if currentOutput == nil then
      print('currentoutput nil')
    else
      if currentOutput.size ~= nil then
        print('currentOutput:size()', currentOutput:size())
      else
        print('currentoutput.size is nil')
      end
    end
    currentOutput = self.modules[i]:updateOutput(currentOutput)
  end
  self.output = currentOutput
  return currentOutput
end
... then you will get the following output:
self.modules[ 1 ]= nn.TVLoss
currentOutput:size()
1425
[torch.LongStorage of size 1]
self.modules[ 2 ]= nn.SpatialConvolutionMM(3 -> 64, 3x3, 1,1, 1,1)
currentOutput:size()
1425
[torch.LongStorage of size 1]
The output of TVLoss is a 1-dimensional tensor of length 1425, but SpatialConvolutionMM (at least the opencl version, for now....) expects a 3 or 4 dimensional tensor. Now, in theory, we can add a Reshape layer into the network. At line 108, add something like:
net:add(nn.Reshape(3, 25, 25))
... but strangely 3 * 25 * 25 == 1875 != 1425, so that's a bit odd. I'm not sure why these lengths mismatch yet, but I'm pretty sure that the problem is a tensor size/shape mismatch between the TVLoss output and the following SpatialConvolutionMM layer input.
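For what it's worth, the length 1425 does line up with the 3 x 25 x 19 input size printed in the log further up, so the mismatch may simply come from assuming a square 25 x 25 image. A quick plain-Lua check of the arithmetic (the helper name numel is made up for illustration, it is not a Torch function):

```lua
-- Hypothetical helper: number of elements in a tensor of the given shape.
local function numel(shape)
  local n = 1
  for _, d in ipairs(shape) do
    n = n * d
  end
  return n
end

print(numel({3, 25, 25}))  -- 1875, the shape assumed for the Reshape
print(numel({3, 25, 19}))  -- 1425, the shape printed in the log
```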
(Edit: Hmmm, actually, not quite this, since just after printing the network, this runs ok:
Running optimization with ADAM
input:size()
3
25
19
[torch.LongStorage of size 3]
self.modules[ 1 ]= nn.TVLoss
currentOutput:size()
3
25
19
[torch.LongStorage of size 3]
self.modules[ 2 ]= nn.SpatialConvolutionMM(3 -> 64, 3x3, 1,1, 1,1)
currentOutput:size()
3
25
19
[torch.LongStorage of size 3]
The crash comes later. Maybe the incoming image is too small, and then after a few poolings it is 1x1? Kind of a mystery :-P )
I reckon we should try with a smaller model first. Any suggestions for a suitably small model to try? The goal is not to get good image output, just to check it runs ok; then we can try a larger model later.
(eg maybe a mnist lenet-5 or something like that perhaps?)
Try with the vgg_normalised.caffemodel, that is the small model I tried working with. I don't know of any smaller model.
vgg normalized is physically small(er), but it's still got 19 layers. lenet-5 has like 5 layers or so.
Don't think we need pretrained weights for now. Sufficient just to leave the weights initialized with random numbers.
ok, I hacked lines 62 or so of the neural_style_opencl.lua script, to have a single max pooling layer:
-- local cnn = loadcaffe_wrap.load(params.proto_file, params.model_file, params.backend):float()
cnn = nn.Sequential()
cnn:add(nn.SpatialMaxPooling(2,2,2,2,0,0))
cnn:float()
self.modules[ 2 ]= nn.SpatialMaxPooling(2,2,2,2)
currentOutput:size()
36
[torch.LongStorage of size 1]
/home/user/torch/install/bin/luajit: ...ser/torch/install/share/lua/5.1/nn/SpatialMaxPooling.lua:36: bad argument #2 to 'SpatialMaxPooling_updateOutput' (3D or 4D (batch) tensor expected)
stack traceback:
Running like this:
th neural_style_opencl.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 0 -backend 'clnn' -output_image profile.png -image_size 4 -model_file models/vgg_normalised.caffemodel -optimizer adam
Edit: full output in failed case is now relatively short, so easy to compare with non-cl output. cl output is: http://pastebin.com/XYximE1f (and for cpu, is http://pastebin.com/pPx23RBr )
(Edit 2: if you modify the feval function as follows:
local function feval(x)
  print('feval x:size()', x:size())
  num_calls = num_calls + 1
  net:forward(x)
  local grad = net:backward(x, dy)
  local loss = 0
  for _, mod in ipairs(content_losses) do
    loss = loss + mod.loss
  end
  for _, mod in ipairs(style_losses) do
    loss = loss + mod.loss
  end
  maybe_print(num_calls, loss)
  maybe_save(num_calls)
  collectgarbage()
  -- optim.lbfgs expects a vector for gradients
  print('loss', loss)
  print('grad:size()', grad:size())
  return loss, grad:view(grad:nElement())
end
... then you will notice that the x tensor is 3d, and the grad tensor is 3d, correctly.
I was briefly concerned that the :view() function was broken in cl, but it seems not to be:
a = torch.ClTensor(3,4,5):uniform()
a:view(3*4*5)
-- shows a 1d tensor
a
-- continues to show a 3d tensor, ie hasnt unintentionally modified the original a tensor
)
Ah, looks like :addcdiv in cltorch reshapes the tensor, but in torch and cutorch does not. In adam.lua, line 63:
x:addcdiv(-stepSize, state.m, state.denom)
... causes the tensor to suddenly change from 3d to 1d. I need to look into this.
Hmmm... but ... cutorch does the same thing actually:
require 'cutorch'
a = torch.CudaTensor(3,2,4):uniform()
b = torch.CudaTensor(3*2*4):uniform()
a:addcdiv(1,b,b)
a:size()
-- 1 dimension...
Edit: ah, most recent cutorch fixes this :-)
Ok. Please update to latest cltorch, ie luarocks install cltorch, and then retry. For me, the following command runs ok to completion:
th neural_style_opencl.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 0 -output_image profile.png -image_size 32 -model_file models/vgg_normalised.caffemodel -optimizer adam -num_iterations 3 -backend clnn
@hughperkins Same here. :+1:
Cool :-)
Hi. I've updated cltorch to allow get/set on individual elements. So, lbfgs might work now. Please feel free to luarocks install cltorch, and see to what extent lbfgs works for you.
Hi Shubhanshu, I've added your OpenCL port to the clnn readme by the way :-) https://github.com/hughperkins/clnn#example-networks
@hughperkins thanks a lot. Yes this runs and generates images. Really appreciate adding my example on the link.
Although, I think because of the memory of my GPU, I can't generate any reasonable output even after using -image_size 150. This was the full command:
th neural_style_opencl.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 0 -output_image profile.png -image_size 150 -model_file models/vgg_normalised.caffemodel -backend clnn
But I am glad that it will run for someone with a better GPU.
Here are some of the images I got.
I believe using the larger model is the best bet. Maybe @jcjohnson can elaborate on this.
I am trying to run it with the nin_imagenet_conv model and I am getting an error about SpatialAveragePooling_updateOutput not being implemented.
Here is the command:
$ th neural_style_opencl.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 0 -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/solver.prototxt -backend clnn
Here is the error:
/home/username/Downloads/torch/install/bin/luajit: ...torch/install/share/lua/5.1/nn/SpatialAveragePooling.lua:14: Not implemented at /tmp/luarocks_clnn-scm-1-2534/clnn/SpatialAveragePooling.cpp:59
stack traceback:
[C]: in function 'SpatialAveragePooling_updateOutput'
...torch/install/share/lua/5.1/nn/SpatialAveragePooling.lua:14: in function 'updateOutput'
.../Downloads/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
neural_style_opencl.lua:149: in function 'main'
neural_style_opencl.lua:424: in main chunk
[C]: in function 'dofile'
...oads/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x00406670
For the images, per the FAQ, I reckon that big blocks of continuous color means your tv is too high. @jcjohnson Is that right?
For the nin_imagenet_conv, can you state the parameters of the averagepooling layer? ie, what is the pool size, the input size, and the stride?
Hi Shubhanshu, for nin_imagenet_conv, I've updated clnn to handle a very specific averagepooling non-batched geometry. Can you luarocks install clnn, and retry please? If it still fails, then I need to know the exact geometry you are using, ie input size, pool size, and stride.
@napsternxg @hughperkins It looks like it's working! When you use the normalized network the default values for content weight, style weight, and TV weight won't give good results; in particular you should reduce the TV weight by an order of magnitude or more.
Also if you use a network other than VGG-19 or its normalized variety, you'll need to change the layers used for style and content reconstruction. At master you can select these with the -style_layers and -content_layers flags, but it looks like you forked before those were added; you'll instead want to change the indices of the style and content layers here https://github.com/napsternxg/neural-style/blob/opencl/neural_style_opencl.lua#L90
Could someone test this on proper hardware w/ the default settings?
I've been playing with the normalized model at 256px size (on a 7750 w/ just 1GB GPU RAM), and while it shows some interesting results, one problem I've found is that it transplants more than just textures onto the target image. For example, here's a somewhat creepy Picasso/Pitt hybrid:
(Command line: th neural_style_opencl.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 0 -output_image profile.png -image_size 256 -model_file models/vgg_normalised.caffemodel -backend clnn -num_iterations 1000 -save_iter 50 -normalize_gradients -content_weight 50000 -style_weight 90000)
Would be good to make sure it's the hyperparameter choice or the model as opposed to the port.
How long does this take to train approximately?
@hughperkins w/ the command line I've given it's maybe 10 minutes or so on my hardware. Broad features will become clear after ~200 iterations (2-3 minutes?).
Hmmm, I get an error -4, memory object allocation failure, just at end of second block of 50 iterations. I have a 1GB card too (GeForce 940M). If we can modify the commandline to use a little bit less memory, I should probably be able to run both CUDA and OpenCL on it.
The most straightforward thing is to lower image size. -image_size 200 perhaps?
Ok. Trying to brush the cobwebs off my cunn at the moment. Giving me some odd error about /home/user/torch/install/share/lua/5.1/cunn/init.lua:9: attempt to index field '_flattenTensorBuffer' (a nil value). Digging...
edit: hmmm, needs a new module called inn. Installing...
Ok, cunn runs now. Out of memory using cunn with imagesize 256. Trying 200...
after 750, using cunn:
(Edit: the commandline used: th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 0 -output_image profile.png -image_size 200 -model_file models/vgg_normalised.caffemodel -num_iterations 1000 -save_iter 50 -normalize_gradients -content_weight 50000 -style_weight 90000)
Thank you! Looks quite similar indeed.
Does seem fairly convincing. We should probably fix the random seed, to be sure.
Hmmm, if I put torch.manualSeed(123) at line 12, then each clnn run is identical to the others, and each cunn run is identical to the others, but the clnn and cunn outputs are slightly different. After 50 iterations:
cunn:
clnn:
Hmmm, and what is more, cpu gives same results as cuda. So I probably need to dig a bit. For 10 iterations:
cpu:
cuda:
cl:
Edit: hmmm, but -gpu -1 with neural_style_opencl.lua gives similar results to -gpu 0 with neural_style_opencl.lua, so might just be slightly different forks:
Edit2: ok, looks like if the manualSeed is at line 186 or so, just after the line -- initialize the image, then cpu, cuda, cl all almost agree, except that cl has a fairly blank margin down the right hand side. So I reckon one of the paddings in one of the layers has an issue somehow, somewhere. Will continue digging...
So, I've written the following script, to compare between cl, cuda, cpu: http://pastebin.com/jRGhyPij
It compares the outputs after numlayers layers, between cuda and cpu, and between cl and cpu:
sumabsdiffcl 0.00019182558025932
maxabsdiffcl 4.6193599700928e-07
sumabsdiffcu 0.00019182558025932
maxabsdiffcu 4.6193599700928e-07
sumabsdiffcl 0.0003578155010473
maxabsdiffcl 3.5762786865234e-07
sumabsdiffcu 0.00027025085728383
maxabsdiffcu 2.9802322387695e-07
... so I probably need to check what is happening on layer 13 (which is: (13): nn.SpatialConvolutionMM(256 -> 256, 3x3, 1,1, 1,1))
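To make the two statistics above concrete, here is a minimal plain-Lua sketch of how a sum and a max of absolute differences can be computed. It is not the pastebin script; the helper name and the sample values are invented for illustration, and flat Lua tables stand in for flattened tensors.

```lua
-- Hypothetical helper: sum and max of |a[i] - b[i]| over two equal-length arrays.
local function absdiff_stats(a, b)
  assert(#a == #b, 'inputs must have the same length')
  local sum, max = 0, 0
  for i = 1, #a do
    local d = math.abs(a[i] - b[i])
    sum = sum + d
    if d > max then max = d end
  end
  return sum, max
end

-- Made-up sample data; the element-wise differences are 0, 0.5 and 1.
local cpu = {0.5, 0.25, 1.0}
local cl  = {0.5, 0.75, 0.0}
local sumabsdiff, maxabsdiff = absdiff_stats(cpu, cl)
print('sumabsdiff', sumabsdiff)
print('maxabsdiff', maxabsdiff)
```

With Torch tensors of matching size, the same quantities should be obtainable as (a - b):abs():sum() and (a - b):abs():max() once both tensors are on the CPU.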
Edit2: fairly sure the vgg forwards/backwards is correct. Using this script: http://pastebin.com/1d73iQWK It does full forwards/backwards pass through vgg, for imagesize of 128. It does this 3 times: for cpu, for cl, for cuda. Then it compares the results, normalizes to 0-1 range, and saves to pngs. The results for cl-vs-cu, cl-vs-cpu, cu-vs-cpu are below. The cl and cuda vs cpu plots are comparable. There is no artifact down the right hand margin.
cl vs cu:
cl vs cpu:
cu vs cpu:
@hughperkins I updated clnn and ran using the nin model. I am getting an error in the SpatialAveragePooling_updateOutput function:
allocate workbuffer
/home/username/Downloads/torch/install/bin/luajit: ...torch/install/share/lua/5.1/nn/SpatialAveragePooling.lua:14: bad argument #2 to 'SpatialAveragePooling_updateOutput' (input image smaller than kernel size)
stack traceback:
[C]: in function 'SpatialAveragePooling_updateOutput'
...torch/install/share/lua/5.1/nn/SpatialAveragePooling.lua:14: in function 'updateOutput'
.../Downloads/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
neural_style_opencl.lua:143: in function 'main'
neural_style_opencl.lua:418: in main chunk
[C]: in function 'dofile'
...oads/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x00406670
Hi Shubhanshu, the error 'input image smaller than kernel size' normally means the image size is too small. It starts large, but each pooling layer reduces it. Can you try a larger image size please? (By the way, can you paste appropriate wget commands, or similar, so I can try the nin model too please? I downloaded some kind of nin model, but it doesn't have the '_conv' suffix, so I'm not sure if it is the same one.)
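As a rough illustration of how repeated pooling can shrink an image below a later kernel size, here is a plain-Lua sketch. The layer stack is made up (three 2x2 stride-2 poolings followed by a 6x6 pooling) and is not the actual nin_imagenet_conv geometry, which isn't given in this thread; floor rounding and no padding are assumed.

```lua
-- Output length of one pooling stage along one dimension (floor mode, no padding).
-- Returns nil when the input is smaller than the kernel, which is the situation
-- the 'input image smaller than kernel size' error reports.
local function pooled(size, kernel, stride)
  if size < kernel then
    return nil
  end
  return math.floor((size - kernel) / stride) + 1
end

-- Made-up stack for illustration: {kernel, stride} per pooling stage.
local stages = { {2, 2}, {2, 2}, {2, 2}, {6, 1} }

-- Returns the final size, or nil plus the index of the stage that fails.
local function trace(size)
  for i, s in ipairs(stages) do
    local out = pooled(size, s[1], s[2])
    if not out then
      return nil, i
    end
    size = out
  end
  return size
end

print(trace(32))  -- 32 -> 16 -> 8 -> 4, then stage 4 fails because 4 < 6
print(trace(64))  -- 64 -> 32 -> 16 -> 8 -> 3
```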
@hughperkins I increased the image size to the default 512. Now I get another error. This is possibly some issue in nn module.
$ th neural_style_opencl.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 0 -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/solver.prototxt -backend clnn -num_iterations 3
/home/username/Downloads/torch/install/bin/luajit: ...ity/Downloads/torch/install/share/lua/5.1/nn/SoftMax.lua:4: attempt to call field 'SoftMax_updateOutput' (a nil value)
stack traceback:
...ity/Downloads/torch/install/share/lua/5.1/nn/SoftMax.lua:4: in function 'updateOutput'
.../Downloads/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
neural_style_opencl.lua:143: in function 'main'
neural_style_opencl.lua:418: in main chunk
[C]: in function 'dofile'
...oads/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x00406670
Also, I downloaded the nin model from https://github.com/BVLC/caffe/wiki/Model-Zoo#network-in-network-model All the required files are in the google drive link. https://drive.google.com/folderview?id=0B0IedYUunOQINEFtUi1QNWVhVVU&usp=drive_web
Hi Shubhanshu, yes: luarocks install clnn, and try again please?
(oh, for the disparity between cl and cu output, I think it's because 'ceil' actually changes the output size of the max pooling. So, I probably need to implement ceil, if we want the output to be the same between cl and cu)
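A small plain-Lua sketch of why the rounding mode matters, assuming the usual pooling size formula round((size - kernel) / stride) + 1 with no padding, a 2x2 stride-2 pooling like the one used earlier in this thread, and the 25 x 19 input from the logs. The function name is made up for illustration.

```lua
-- Hypothetical helper: pooled length along one dimension, floor or ceil mode.
local function pool_out(size, kernel, stride, use_ceil)
  local round = use_ceil and math.ceil or math.floor
  return round((size - kernel) / stride) + 1
end

for _, size in ipairs({25, 19}) do
  print(size, pool_out(size, 2, 2, false), pool_out(size, 2, 2, true))
end
-- 25: 12 with floor, 13 with ceil
-- 19:  9 with floor, 10 with ceil
```

For odd input sizes the two modes differ by one row or column per pooling layer, so the feature maps after each pooling would have different sizes in the two backends.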
For neural-style, I don't think exact binary compatibility between cuda and opencl is a strict requirement; tiny differences should be fine as long as the same hyperparameters and inputs produce similar outputs. Of course, exactly matching the cuda outputs would be better.
For other applications though, ceil would be a great addition to cltorch since all of the caffe pretrained models rely on it.
Ok, good info. Thanks! :-)
Hi guys, please note that :ceil() mode is implemented for the clnn SpatialMaxPooling layer now. If you luarocks install clnn, you should have access.
With :ceil() implemented, I think the results now are much more similar, between cl and cu. I can't quite decide whether the residual differences are because of rounding, or if there is still some small fundamental difference. For 100 iterations, image size 100:
cu:
cl:
Edit: here are the same settings as the earlier images, ie size=200, its=10. no longer an artefact down the right hand side, images look almost identical:
cu:
cl:
(To repeat these, just put torch.manualSeed(123) just after the comment -- Initialize the image, and use geometry and commandline something like:
th neural_style_opencl.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 0 -output_image cl_$name.png -image_size $size -model_file models/vgg_normalised.caffemodel -num_iterations $its -save_iter $its -normalize_gradients -content_weight 50000 -style_weight 90000 -backend clnn -optimizer lbfgs
)
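A minimal sketch of what fixing the seed buys you, assuming Torch is installed. It only exercises the CPU generator via torch.randn; whether the image initialization in the script draws from the same generator at that point is an assumption here.

```lua
require 'torch'

-- Seed, draw, re-seed, draw again.
torch.manualSeed(123)
local first = torch.randn(3, 4, 4)

torch.manualSeed(123)
local second = torch.randn(3, 4, 4)

-- Same seed and same generator: the largest element-wise difference should be zero.
print((first - second):abs():max())
```

Placing the seed call immediately before the random draw that matters (here, the image initialization) avoids depending on how many random numbers earlier setup code has consumed.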
Edit 3: using size=200, iterations=1000: cu: cl:
Not quite the same, but fairly close, I think?
(Edit 4: Hmmm, I suppose an interesting question is: if I take the cu image, and give it to cl, is it a local minimum for cl too? and similarly for cl image giving to cu)
Looks pretty good to me! If you wanted to track down the difference, I'd run for one iteration and dump all activations and gradients to a file and compare between clnn and cunn.
However it looks close enough, so if you want to rebase and clean up for a PR I'm happy to merge.
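The dump-and-compare idea could be sketched roughly as follows, assuming Torch and nn are installed. The tiny network, the file name and the table layout are all made up for illustration; in practice the same loop would run over the style-transfer network once per backend, with identical weights and input.

```lua
require 'torch'
require 'nn'

torch.manualSeed(123)

-- Tiny stand-in network (hypothetical); the real one would be the neural-style net.
local net = nn.Sequential()
net:add(nn.Linear(4, 3))
net:add(nn.ReLU())
net:add(nn.Linear(3, 2))

local input = torch.randn(4)
local gradOutput = torch.ones(2)

-- One forward and one backward pass populate .output and .gradInput on each layer.
net:forward(input)
net:backward(input, gradOutput)

-- Collect per-layer activations and input gradients as float CPU tensors.
local dump = {}
for i, m in ipairs(net.modules) do
  dump[i] = {
    name = torch.type(m),
    output = m.output:float():clone(),
    gradInput = m.gradInput:float():clone(),
  }
end

torch.save('activations_cpu.t7', dump)  -- hypothetical file name
```

Loading two such files with torch.load and printing (a.output - b.output):abs():max() per layer would show the first layer at which the backends diverge.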
Using the command shown below and the vgg_normalised.caffemodel file I am getting similar output as above. However, the results are not as good as the ones using the full model.
$ th neural_style_opencl.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/vgg_normalised.caffemodel -gpu 0 -backend clnn -image_size 150 -num_iterations 1000 -normalize_gradients -content_weight 50000 -style_weight 90000
@hughperkins could you get your code to work with the nin_imagenet_conv.caffemodel? I couldn't get it to run. I am still getting the following errors:
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 1:4: Message type "caffe.NetParameter" has no field named "net".
Successfully loaded models/nin_imagenet_conv.caffemodel
MODULE data UNDEFINED
warning: module 'data [type 5]' not found
As well as the following:
/home/username/Downloads/torch/install/bin/luajit: ...torch/install/share/lua/5.1/nn/SpatialAveragePooling.lua:14: bad argument #2 to 'SpatialAveragePooling_updateOutput' (input image smaller than kernel size)
stack traceback:
[C]: in function 'SpatialAveragePooling_updateOutput'
...torch/install/share/lua/5.1/nn/SpatialAveragePooling.lua:14: in function 'updateOutput'
.../Downloads/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
neural_style_opencl.lua:143: in function 'main'
neural_style_opencl.lua:418: in main chunk
[C]: in function 'dofile'
...oads/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x00406670
Here is the full command I used and the corresponding processing log and errors.
$ th neural_style_opencl.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/solver.prototxt -gpu 0 -backend clnn -image_size 150 -num_iterations 1000 -normalize_gradients -content_weight 50000 -style_weight 90000
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 1:4: Message type "caffe.NetParameter" has no field named "net".
Successfully loaded models/nin_imagenet_conv.caffemodel
MODULE data UNDEFINED
warning: module 'data [type 5]' not found
conv1: 96 3 11 11
cccp1: 96 96 1 1
cccp2: 96 96 1 1
conv2: 256 96 5 5
cccp3: 256 256 1 1
cccp4: 256 256 1 1
conv3: 384 256 3 3
cccp5: 384 384 1 1
cccp6: 384 384 1 1
conv4-1024: 1024 384 3 3
cccp7-1024: 1024 1024 1 1
cccp8-1024: 1000 1024 1 1
Using Advanced Micro Devices, Inc. , OpenCL platform: AMD Accelerated Parallel Processing
Using OpenCL device: Turks
Apply_1t_1s_0pt_-2_*out = val1 build log:
"/tmp/OCL14862T5.cl", line 53: warning: variable "thisLinearId" was declared
but never referenced
int thisLinearId;
^
Apply_1t_0s_0pt_-2_*out = (*out > 0) ? *out : 0 build log:
"/tmp/OCL14862T19.cl", line 49: warning: variable "thisLinearId" was declared
but never referenced
int thisLinearId;
^
Apply_1t_1s_0pt_-2_*out *= val1 build log:
"/tmp/OCL14862T26.cl", line 53: warning: variable "thisLinearId" was declared
but never referenced
int thisLinearId;
^
Apply_2t_0s_0pt_-2_-2_*out -= *in1 build log:
"/tmp/OCL14862T29.cl", line 56: warning: variable "thisLinearId" was declared
but never referenced
int thisLinearId;
^
Apply_1t_1s_0pt_-2_*out = pown(*out, val1) build log:
"/tmp/OCL14862T32.cl", line 53: warning: variable "thisLinearId" was declared
but never referenced
int thisLinearId;
^
THClReduceAll.cl build log:
"/tmp/OCL14862T38.cl", line 9: warning: variable "in1" was declared but never
referenced
float *in1 = &_in1;
^
"/tmp/OCL14862T38.cl", line 10: warning: variable "out" was declared but never
referenced
float *out = &_out;
^
/tmp/luarocks_clnn-scm-1-9416/clnn/SpatialMaxPooling.cpp build log:
"/tmp/OCL14862T46.cl", line 24: warning: a value of type
"const __global float *" cannot be used to initialize an entity of
type "__global float *"
global Dtype *bottom_data = bottom_data_data + bottom_data_offset;
^
Apply_2t_0s_0pt_-2_-2_*out *= *in1 build log:
"/tmp/OCL14862T61.cl", line 56: warning: variable "thisLinearId" was declared
but never referenced
int thisLinearId;
^
/home/username/Downloads/torch/install/bin/luajit: ...torch/install/share/lua/5.1/nn/SpatialAveragePooling.lua:14: bad argument #2 to 'SpatialAveragePooling_updateOutput' (input image smaller than kernel size)
stack traceback:
[C]: in function 'SpatialAveragePooling_updateOutput'
...torch/install/share/lua/5.1/nn/SpatialAveragePooling.lua:14: in function 'updateOutput'
.../Downloads/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
neural_style_opencl.lua:143: in function 'main'
neural_style_opencl.lua:418: in main chunk
[C]: in function 'dofile'
...oads/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x00406670
I tried implementing OpenCL support and the code is at: https://github.com/napsternxg/neural-style/tree/opencl
However I get the following error when running the code:
I believe the issue is because of the SpatialConvolutionMM which is implemented in ccn2 module.