Closed — napsternxg closed this issue 8 years ago
@napsternxg Ok, the error you are seeing is because nn.SpatialAveragePooling is missing :ceil() mode. This is not just in clnn, but in standard nn too. So Sergey, @szagoruyko, is using a separate version of SpatialAveragePooling, which he has created at https://github.com/szagoruyko/imagine-nn . imagine-nn is CUDA-only. So it seems like option 1 would be to port Sergey's AveragePooling to non-gpu torch, cuda torch, and cl torch (which sounds like a lot of work, and might take a while...), or option 2, make an OpenCL port of inn. As you can tell, I'm leaning more towards the second option right now, but it's still a fair amount of work.
Note that this problem is specific to average pooling; nn.SpatialMaxPooling had a :ceil() method implemented in https://github.com/torch/nn/commit/929cfc57c88952b597bec77046582b90d1122380. Many common networks (AlexNet, CaffeNet, GoogLeNet, VGG-16, VGG-19) use max pooling rather than average pooling, but I guess the nin_imagenet_conv model uses average pooling.
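As a side note on why :ceil() matters: Caffe rounds pooled output sizes up, while the stock nn modules round down, so the two can disagree on layer shapes. A quick illustrative Python sketch (the kernel/stride/input numbers here are made up, not taken from the NIN model):

```python
import math

def pool_out_size(in_size, kernel, stride, pad=0, ceil_mode=False):
    # Caffe rounds up (ceil) when computing pooled output sizes;
    # Torch's nn historically rounded down (floor).
    rounding = math.ceil if ceil_mode else math.floor
    return int(rounding((in_size + 2 * pad - kernel) / stride)) + 1

# With floor the last partial window is dropped; with ceil it is kept:
print(pool_out_size(6, kernel=3, stride=2, ceil_mode=False))  # 2
print(pool_out_size(6, kernel=3, stride=2, ceil_mode=True))   # 3
```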
Oh, thanks for that elaboration. So I believe we can't use the nin_imagenet_conv model. @hughperkins I think your solution of having an OpenCL port of inn is best in the context of people using the nin model.
Also, any idea why I am getting this error message for the nin model?
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 1:4: Message type "caffe.NetParameter" has no field named "net".
Successfully loaded models/nin_imagenet_conv.caffemodel
MODULE data UNDEFINED
warning: module 'data [type 5]' not found
Per the original paper, they replaced the maxpooling layers in VGG with average pooling, and got better results. Therefore:
@hughperkins Actually, replacing max pooling with average pooling is already implemented; just pass the flag -pooling avg. I found that this gave worse results than max pooling, so I left max as the default.
However it's very possible either that I have a bug in the way I swap max pooling for average pooling, or that I just didn't try the right combinations of hyperparameters for average pooling.
@Justin Ah, interesting :-)
Sorry, I didn't notice the image kernel size in my last post. For the nin model, using the default image size gives the following error:
$ th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -backend clnn -num_iterations 1000 -normalize_gradients -content_weight 50000 -style_weight 90000
Successfully loaded models/nin_imagenet_conv.caffemodel
MODULE data UNDEFINED
warning: module 'data [type 5]' not found
Changing line: require 'inn'
To line: require 'nn'
Changing line: table.insert(model, {'pool4', inn.SpatialAveragePooling(6, 6, 1, 1)})
To line: table.insert(model, {'pool4', nn.SpatialAveragePooling(6, 6, 1, 1)})
conv1: 96 3 11 11
cccp1: 96 96 1 1
cccp2: 96 96 1 1
conv2: 256 96 5 5
cccp3: 256 256 1 1
cccp4: 256 256 1 1
conv3: 384 256 3 3
cccp5: 384 384 1 1
cccp6: 384 384 1 1
conv4-1024: 1024 384 3 3
cccp7-1024: 1024 1024 1 1
cccp8-1024: 1000 1024 1 1
Using Advanced Micro Devices, Inc. , OpenCL platform: AMD Accelerated Parallel Processing
Using OpenCL device: Turks
/home/username/Downloads/torch/install/bin/luajit: ...y/Downloads/torch/install/share/lua/5.1/clnn/SoftMax.lua:31: SoftMax expects 1-d or 2-d tensor currently
stack traceback:
[C]: in function 'error'
...y/Downloads/torch/install/share/lua/5.1/clnn/SoftMax.lua:31: in function 'updateOutput'
.../Downloads/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
neural_style.lua:250: in function 'main'
neural_style.lua:484: in main chunk
[C]: in function 'dofile'
...oads/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x00406670
The error which bothers me the most is the following:
Successfully loaded models/nin_imagenet_conv.caffemodel
MODULE data UNDEFINED
warning: module 'data [type 5]' not found
I don't know what module data is.
Another important thing to note: if you check the models/train_val.prototxt.opencl.lua file, you will see that the model has SoftMax as its last layer, rather than ReLU, which is the case for the VGG models.
I tried replacing the last SpatialAveragePooling layer with SpatialMaxPooling, but got the same error as above.
I don't think the 'MODULE data UNDEFINED' message matters. Can you try it with the GPU turned off, and see what it says?
For the error about SoftMax, it looks like the cpu nn has been updated to handle 3-d and 4-d input, as well as 1-d/2-d. I need to take a look at that.
Still getting an error with the nin model, even when running on the CPU. The code works fine with the VGG normalized model.
$ th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu -1 -num_iterations 1000 -normalize_gradients -content_weight 50000 -style_weight 90000
Successfully loaded models/nin_imagenet_conv.caffemodel
MODULE data UNDEFINED
warning: module 'data [type 5]' not found
conv1: 96 3 11 11
cccp1: 96 96 1 1
cccp2: 96 96 1 1
conv2: 256 96 5 5
cccp3: 256 256 1 1
cccp4: 256 256 1 1
conv3: 384 256 3 3
cccp5: 384 384 1 1
cccp6: 384 384 1 1
conv4-1024: 1024 384 3 3
cccp7-1024: 1024 1024 1 1
cccp8-1024: 1000 1024 1 1
/home/username/Downloads/torch/install/bin/luajit: .../Downloads/torch/install/share/lua/5.1/nn/Sequential.lua:44: bad argument #1 to 'updateOutput' (vector or matrix expected at /tmp/luarocks_nn-scm-1-1415/nn/generic/SoftMax.c:24)
stack traceback:
[C]: in function 'updateOutput'
.../Downloads/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
neural_style.lua:250: in function 'main'
neural_style.lua:484: in main chunk
[C]: in function 'dofile'
...oads/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
I think you have an old version of nn. Can you luarocks install nn please?
(Well, when I say 'old', the merge was at 7am this morning :-P )
An orthogonal point: I don't see much reason to compute softmax layers for style transfer in general. If I counted correctly, the softmax is layer 30 in the NIN model; you can try deleting 30 from the style_layers array and you will never try to forward / backward through the softmax.
@Justin, excellent point! Shubhanshu, can you follow Justin's idea of removing 30 from the style_layers array?
Tried that. Still doesn't work. Here is the new error I am getting:
$ th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -backend clnn -num_iterations 1000 -normalize_gradients -content_weight 50000 -style_weight 90000
Successfully loaded models/nin_imagenet_conv.caffemodel
MODULE data UNDEFINED
warning: module 'data [type 5]' not found
Changing line: require 'inn'
To line: require 'nn'
Changing line: table.insert(model, {'pool4', inn.SpatialAveragePooling(6, 6, 1, 1)})
To line: table.insert(model, {'pool4', nn.SpatialAveragePooling(6, 6, 1, 1)})
Changing line: table.insert(model, {'loss', nn.SoftMax()})
To line:
conv1: 96 3 11 11
cccp1: 96 96 1 1
cccp2: 96 96 1 1
conv2: 256 96 5 5
cccp3: 256 256 1 1
cccp4: 256 256 1 1
conv3: 384 256 3 3
cccp5: 384 384 1 1
cccp6: 384 384 1 1
conv4-1024: 1024 384 3 3
cccp7-1024: 1024 1024 1 1
cccp8-1024: 1000 1024 1 1
Using Advanced Micro Devices, Inc. , OpenCL platform: AMD Accelerated Parallel Processing
Using OpenCL device: Turks
Apply_1t_1s_0pt_-2_*out = val1 build log:
"/tmp/OCL20600T5.cl", line 53: warning: variable "thisLinearId" was declared
but never referenced
int thisLinearId;
^
Apply_1t_0s_0pt_-2_*out = (*out > 0) ? *out : 0 build log:
"/tmp/OCL20600T19.cl", line 49: warning: variable "thisLinearId" was declared
but never referenced
int thisLinearId;
^
/tmp/luarocks_clnn-scm-1-9416/clnn/SpatialMaxPooling.cpp build log:
"/tmp/OCL20600T26.cl", line 24: warning: a value of type
"const __global float *" cannot be used to initialize an entity of
type "__global float *"
global Dtype *bottom_data = bottom_data_data + bottom_data_offset;
^
Apply_1t_1s_0pt_-2_*out *= val1 build log:
"/tmp/OCL20600T45.cl", line 53: warning: variable "thisLinearId" was declared
but never referenced
int thisLinearId;
^
Apply_2t_0s_0pt_-2_-2_*out *= *in1 build log:
"/tmp/OCL20600T48.cl", line 56: warning: variable "thisLinearId" was declared
but never referenced
int thisLinearId;
^
Running optimization with L-BFGS
/home/username/Downloads/torch/install/bin/luajit: ...torch/install/share/lua/5.1/nn/SpatialAveragePooling.lua:25: Not implemented at /tmp/luarocks_clnn-scm-1-9416/clnn/SpatialAveragePooling.cpp:185
stack traceback:
[C]: in function 'SpatialAveragePooling_updateGradInput'
...torch/install/share/lua/5.1/nn/SpatialAveragePooling.lua:25: in function 'updateGradInput'
...tity/Downloads/torch/install/share/lua/5.1/nn/Module.lua:30: in function 'backward'
.../Downloads/torch/install/share/lua/5.1/nn/Sequential.lua:84: in function 'backward'
neural_style.lua:305: in function 'opfunc'
...ty/Downloads/torch/install/share/lua/5.1/optim/lbfgs.lua:66: in function 'lbfgs'
neural_style.lua:324: in function 'main'
neural_style.lua:484: in main chunk
[C]: in function 'dofile'
...oads/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x00406670
I have also updated the code on the master branch of my repo with the loadcaffe_wrapper.lua changes.
Hi Shubhanshu, yes, that's normal. I'm working on implementing that. Actually, it's not in cunn yet either; there is a PR in progress at https://github.com/torch/cunn/pull/134 . However, it is in Sergey's inn.
Can you update loadcaffe_wrapper to convert SpatialAveragePooling layers into SpatialMaxPooling layers, and try again?
(Edit: hmmm, a 6 by 6 max pooling :-P Anyway, it's less than that to be honest; it's more like 4 x 3, I think.)
Ok progress here. The program finishes with the following message:
<optim.lbfgs> function value changing less than tolX
And no output image is generated.
Hmmm. Can you run it a few times perhaps? :-P Otherwise I guess you will need to use VGG for now, whilst I look at getting AveragePooling ported across.
Ok, using -optimizer adam starts computing the losses. Yay, more progress. Let me see how the output comes out.
Although the loss is coming out as 0.00 so far.
Iteration 50 / 1000
Total loss: 0.000000
Iteration 100 / 1000
Total loss: 0.000000
Iteration 150 / 1000
Total loss: 0.000000
Maybe I need to play with the content_weight and style_weight as well.
So the program finishes using -optimizer adam, but the image generated is completely blank.
Loss of 0 doesn't sound good. If the loss is 0, it's not going to try to learn anything.
What should happen is:
That's what it does for content. It will do something similar for texture, and mix those error signals together.
If the loss is zero, then no training will take place.
If the loss is zero, it means that the output from the white noise and the output from the photo are the same. I suppose this could happen if the output is a 0 by 0 tensor, for example, or if the output from both is all zeros for some reason.
Edit: Hmmmm... isn't this kind of the point of the SoftMax? To generate the error signal? Justin, are you sure we don't need the SoftMax? (Edit 2: hmmm, I guess the softmax is just rescaling the error signal actually? Normalizing the output of the network to be a probability distribution? Actually, if the output is zero everywhere, I guess softmax might just give NaNs anyway :-P)
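Since the question of what the SoftMax contributes keeps coming up: softmax just rescales a vector into a probability distribution. A minimal illustrative Python sketch (not the Torch implementation); for what it's worth, an all-zero input comes out uniform rather than NaN:

```python
import math

def softmax(xs):
    # Subtracting the max is the standard numerical-stability trick;
    # it does not change the result.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

print(sum(softmax([1.0, 2.0, 3.0])))  # ~1.0: outputs sum to one
print(softmax([0.0] * 4))             # uniform: [0.25, 0.25, 0.25, 0.25]
```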
Hi, AveragePooling is ported to clnn (edit: with :ceil). You have to live on the bleeding edge though :-P To use AveragePooling with clnn, you need to do:
git clone https://github.com/hughperkins/nn.git -b avepool_plus_master nn-avepool
cd nn-avepool
luarocks make rocks/nn-scm-1.rockspec
cd ..
git clone https://github.com/hughperkins/clnn.git -b avgpool clnn-avgpool
cd clnn-avgpool
luarocks make rocks/clnn-scm-1.rockspec
cd ..
luajit -l clnn -e 'clnn.test()'
You will then be able to use SpatialAveragePooling, and call :ceil() on it, as per SpatialMaxPooling. You will need to modify one or both of loadcaffe and loadcaffe_wrapper in order to use this.
Edit: output, for vgg, using both cu and cl, with averagepooling option enabled:
common vars:
its=1000
saveits=50
size=200
cu, th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 0 -output_image cu_$name.png -image_size $size -model_file models/vgg_normalised.caffemodel -num_iterations $its -save_iter $saveits -normalize_gradients -content_weight 50000 -style_weight 90000 -seed 123 -pooling avg
cl, th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 0 -output_image cl_$name.png -image_size $size -model_file models/vgg_normalised.caffemodel -num_iterations $its -save_iter $saveits -normalize_gradients -content_weight 50000 -style_weight 90000 -backend clnn -optimizer lbfgs -seed 123 -pooling avg
(And using https://github.com/hughperkins/neural-style.git branch master-nap)
Edit3: @vkorablin maybe using average pooling gives images closer to what you are looking for?
(Note that SoftMax has been re-ported from cunn to clnn just now, so if you luarocks install clnn, you should have 3-d/4-d SoftMax available now.) (Edit: and merged into branch avgpool too, so if you use avgpool, you should have both the latest SoftMax and the latest SpatialAveragePooling.)
@napsternxg You're getting close! My guess is that you're getting a blank image because the TV regularization strength is too high; at this point you should just turn it off with -tv_weight 0.
Also usually when I see
<optim.lbfgs> function value changing less than tolX
I increase the content weight and style weight.
Ok, so I tried both @hughperkins' suggestion of updating via luarocks install clnn, and @jcjohnson's suggestion of increasing the content and style weights as well as setting -tv_weight 0.
For the first case, when I use the nin model as is, with the SoftMax as well as the SpatialAveragePooling layers, I get an error saying SpatialAveragePooling_updateGradInput is not implemented. The full error is as follows:
$ th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -backend clnn -num_iterations 1000 -content_weight 100 -style_weight 100 -tv_weight 0
Running optimization with L-BFGS
/home/username/Downloads/torch/install/bin/luajit: ...torch/install/share/lua/5.1/nn/SpatialAveragePooling.lua:25: Not implemented at /tmp/luarocks_clnn-scm-1-9261/clnn/SpatialAveragePooling.cpp:185
stack traceback:
[C]: in function 'SpatialAveragePooling_updateGradInput'
...torch/install/share/lua/5.1/nn/SpatialAveragePooling.lua:25: in function 'updateGradInput'
...tity/Downloads/torch/install/share/lua/5.1/nn/Module.lua:30: in function 'backward'
.../Downloads/torch/install/share/lua/5.1/nn/Sequential.lua:84: in function 'backward'
neural_style.lua:305: in function 'opfunc'
...ty/Downloads/torch/install/share/lua/5.1/optim/lbfgs.lua:66: in function 'lbfgs'
neural_style.lua:324: in function 'main'
neural_style.lua:484: in main chunk
[C]: in function 'dofile'
...oads/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x00406670
Replacing the SpatialAveragePooling with SpatialMaxPooling, I get the following error:
<optim.lbfgs> optimality condition below tolFun
For the above error I tried both of the following commands:
# Very low style and content weights
$ th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -backend clnn -num_iterations 1000 -content_weight 100 -style_weight 100 -tv_weight 0
# Very high content and style weights
$ th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -backend clnn -num_iterations 1000 -content_weight 10000000 -style_weight 1000000 -tv_weight 0
Hi, did you make sure to install the special branch of nn too? (Edit: actually, I think you're not using the avgpool branch of clnn either. You need to run the following before using average pooling:
git clone https://github.com/hughperkins/nn.git -b avepool_plus_master nn-avepool
cd nn-avepool
luarocks make rocks/nn-scm-1.rockspec
cd ..
git clone https://github.com/hughperkins/clnn.git -b avgpool clnn-avgpool
cd clnn-avgpool
luarocks make rocks/clnn-scm-1.rockspec
cd ..
@hughperkins oh yeah, I thought in your latest post you said you had the new features in the official clnn distribution. Anyway, I tried installing it with the commands you suggested: I got an error when using the lbfgs optimizer, but I got the program to finish successfully with the adam optimizer. However, when using adam I got losses of 0 for all iterations.
@jcjohnson I have used -tv_weight 0 -optimizer adam -normalize_gradients -content_weight 50000 -style_weight 90000 -seed 123 -pooling avg
This was my full command using adam:
$ th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -backend clnn -num_iterations 1000 -tv_weight 0 -optimizer adam -normalize_gradients -content_weight 50000 -style_weight 90000 -seed 123 -pooling avg
This was using lbfgs
:
$ th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -backend clnn -num_iterations 1000 -tv_weight 0 -normalize_gradients -content_weight 50000 -style_weight 90000 -seed 123 -pooling avg
Successfully loaded models/nin_imagenet_conv.caffemodel
MODULE data UNDEFINED
warning: module 'data [type 5]' not found
Changing line: require 'inn'
To line: require 'nn'
Changing line: table.insert(model, {'pool4', inn.SpatialAveragePooling(6, 6, 1, 1)})
To line: table.insert(model, {'pool4', nn.SpatialAveragePooling(6, 6, 1, 1)})
conv1: 96 3 11 11
cccp1: 96 96 1 1
cccp2: 96 96 1 1
conv2: 256 96 5 5
cccp3: 256 256 1 1
cccp4: 256 256 1 1
conv3: 384 256 3 3
cccp5: 384 384 1 1
cccp6: 384 384 1 1
conv4-1024: 1024 384 3 3
cccp7-1024: 1024 1024 1 1
cccp8-1024: 1000 1024 1 1
Using Advanced Micro Devices, Inc. , OpenCL platform: AMD Accelerated Parallel Processing
Using OpenCL device: Turks
Replacing max pooling at layer 7 with average pooling
Replacing max pooling at layer 14 with average pooling
Replacing max pooling at layer 21 with average pooling
Running optimization with L-BFGS
<optim.lbfgs> optimality condition below tolFun
Even when I remove the -tv_weight 0 option and increase the content and style weights a lot, I keep getting the lbfgs messages:
$ th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -backend clnn -num_iterations 1000 -normalize_gradients -content_weight 5000000 -style_weight 9000000 -seed 123 -pooling avg
<optim.lbfgs> function value changing less than tolX
I think you might need to modify loadcaffe_wrapper to add :ceil() to the average pooling. You can have a look at https://github.com/hughperkins/neural-style/blob/master-nap-plushacks/loadcaffe_wrapper.lua#L68-L70 :
if line:find("SpatialAveragePooling") then
line = line:gsub("%}%)", ":ceil()})")
end
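(For reference, the transformation this gsub performs, mirrored as an illustrative Python sketch:)

```python
def add_ceil(line):
    # Same effect as the Lua gsub("%}%)", ":ceil()})") guarded by the
    # SpatialAveragePooling check: tack :ceil() onto the constructor call.
    if "SpatialAveragePooling" in line:
        line = line.replace("})", ":ceil()})")
    return line

print(add_ceil("table.insert(model, {'pool4', nn.SpatialAveragePooling(6, 6, 1, 1)})"))
# table.insert(model, {'pool4', nn.SpatialAveragePooling(6, 6, 1, 1):ceil()})
```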
Hi @napsternxg, it occurs to me: are you making sure to add some style and content loss layers? I mean, when you are using nin. E.g., in Justin's current master, you have to specify some layers in the options content_layers and style_layers. If you put names there that don't match the nin layers, then maybe there will be no content/style loss layers, and that would explain the zero loss you are seeing?
@hughperkins thanks a lot for the suggestion about changing the content_layers and style_layers. So, finally, with your version of clnn I was able to run it to completion and got some useful results.
So with the following command I am getting the following results. Still not the best one, but I will have to play with the hyperparameters to get to the best results.
$ th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -backend clnn -num_iterations 1000 -seed 123 -content_layers relu6 -style_layers relu0,relu3,relu7,relu10 -content_weight 2 -style_weight 100 -image_size 320 -optimizer lbfgs
I think this solves all the issues we have had with porting to OpenCL. Once the exact settings are finalized, we can close this issue. @hughperkins thanks a lot for being patient and helping through all the steps. @jcjohnson and @vkorablin thanks for your support.
I have also updated my code to work with the nin_imagenet_conv.caffemodel file.
https://github.com/napsternxg/neural-style
@napsternxg Cooollll :-) Once all the details are ironed out, please consider submitting a pull request to Justin for the OpenCL changes. I'd love to see OpenCL working in the Core neural-style repo :-)
@hughperkins Yeah will play with some of the parameters and update it. It would be great if you can also have your average pooling changes in the final clnn package.
Yes, waiting for averagepooling to be merged into clnn master is not a bad idea. I need to wait for averagepooling changes to be merged into torch7 master, and then I will merge into clnn master.
By the way, @vkorablin , thank you for drawing my attention to this issue, and thus to this project. This project, and this paper, is fascinating. I'm really happy to have found this.
@hughperkins have you seen this issue before? I am playing with multiple layers and parameters to get this working with the nin model, and this is one error I am getting. I believe it is coming from cltorch.
$ th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -backend clnn -num_iterations 1000 -seed 123 -content_layers relu0,relu3,relu7,relu12 -style_layers relu0,relu3,relu7,relu12 -content_weight 2 -style_weight 100 -image_size 320 -optimizer lbfgs
Iteration 50 / 1000
Content 1 loss: 17073.860726
Content 2 loss: 44512.276923
Content 3 loss: 19215.190643
Content 4 loss: 173.713984
Style 1 loss: 16628.293186
Style 2 loss: 19602.140808
Style 3 loss: 10859.522163
Style 4 loss: 0.628987
Total loss: 128065.627420
/home/username/Downloads/torch/install/bin/luajit: ...ty/Downloads/torch/install/share/lua/5.1/optim/lbfgs.lua:165: clblasSdot() failed with -4 at /tmp/luarocks_cltorch-scm-1-159/cltorch/cltorch/src/lib/THClBlas.cpp:186
stack traceback:
[C]: in function 'dot'
...ty/Downloads/torch/install/share/lua/5.1/optim/lbfgs.lua:165: in function 'lbfgs'
neural_style.lua:324: in function 'main'
neural_style.lua:484: in main chunk
[C]: in function 'dofile'
...oads/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x00406670
PS: Note that I am using multiple layers for both content_layer and style_layers.
-4 means out of memory. From cl.h:
#define CL_MEM_OBJECT_ALLOCATION_FAILURE -4
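(A few neighbouring codes from the same header, as a small lookup sketch; the values are copied from cl.h:)

```python
# OpenCL error codes from cl.h, for decoding clBLAS/cltorch failures;
# only a handful of common ones are listed here.
CL_ERRORS = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",  # what clblasSdot returned above
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}

print(CL_ERRORS.get(-4, "unknown error"))  # CL_MEM_OBJECT_ALLOCATION_FAILURE
```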
Ok got it.
I think you can remove accGradParameters. That cuts down on memory slightly. Have a look at https://github.com/hughperkins/neural-style/commit/d9e4dd43677ee766ad39f2745e6cab7f8210ae7d
Got it. Changed that and now I am able to run it using multiple content and style layers. And I am getting pretty promising results.
If you run it with the following command:
$ th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -backend clnn -num_iterations 1000 -seed 123 -content_layers relu0,relu3,relu7,relu12 -style_layers relu0,relu3,relu7,relu12 -content_weight 10 -style_weight 1000 -image_size 320 -optimizer adam
You will get image like the following:
@jcjohnson what do you think of this output? It was constructed using the nin_imagenet_conv.caffemodel file.
Got it. Changed that and now I am able to run it using multiple content and style layers.
Cool :-)
@napsternxg Looks pretty good to me - better than the results I got with CaffeNet, but not quite as nice as the VGG-19 results.
At any rate it looks like the OpenCL port is pretty much working as intended at this point; I'm happy to merge into master if you send me a PR.
@jcjohnson I can send the pull request, but it will not work out of the box. @hughperkins has made some changes to the torch code as well as the clnn code for average pooling, which may cause an issue. I will clean up some things on my side and update the code on my repo for now. I think once the clnn issue is fixed, we can merge it into your repo.
Sounds good to me.
Hi. I've created a new version of clnn which uses less memory. Comparing with other versions, on my 1GB NVIDIA card:
- master branch: image size 200 works, 256 fails
- multi-conv branch: image_size 300 works, 320 fails
You need to install branch multi-conv of https://github.com/hughperkins/clnn , and then I tested it as follows:
- using branch master-nap-plushacks of https://github.com/hughperkins/neural-style
- th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 0 -output_image profile.png -image_size 300 -model_file models/vgg_normalised.caffemodel -backend clnn -num_iterations 1000 -save_iter 50 -normalize_gradients -content_weight 50000 -style_weight 90000
th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -gpu 1 -output_image profile.png -image_size 300 -model_file models/vgg_normalised.caffemodel -backend clnn -num_iterations 1000 -save_iter 50 -normalize_gradients -content_weight 50000 -style_weight 90000
/home/ceperez/torch/install/bin/luajit: /home/ceperez/torch/install/share/lua/5.1/trepl/init.lua:363: module 'cutorch' not found:
No LuaRocks module found for cutorch
no field package.preload['cutorch']
no file '/home/ceperez/.luarocks/share/lua/5.1/cutorch.lua'
no file '/home/ceperez/.luarocks/share/lua/5.1/cutorch/init.lua'
no file '/home/ceperez/torch/install/share/lua/5.1/cutorch.lua'
no file '/home/ceperez/torch/install/share/lua/5.1/cutorch/init.lua'
no file './cutorch.lua'
no file '/home/ceperez/torch/install/share/luajit-2.1.0-alpha/cutorch.lua'
no file '/usr/local/share/lua/5.1/cutorch.lua'
no file '/usr/local/share/lua/5.1/cutorch/init.lua'
no file '/home/ceperez/.luarocks/lib/lua/5.1/cutorch.so'
no file '/home/ceperez/torch/install/lib/lua/5.1/cutorch.so'
no file './cutorch.so'
no file '/usr/local/lib/lua/5.1/cutorch.so'
no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
[C]: in function 'error'
/home/ceperez/torch/install/share/lua/5.1/trepl/init.lua:363: in function 'require'
neural_style.lua:48: in function 'main'
neural_style.lua:437: in main chunk
[C]: in function 'dofile'
...erez/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:133: in main chunk
[C]: at 0x004064d0
Hi codeaudit, this thread is getting a bit crazy long :-P Do you mind opening a new issue in https://github.com/hughperkins/clnn/issues please? Also, please provide the exact commit, branch, and repository that you are running from. It looks like you are using a branch/commit that is importing cutorch for some reason, but I need to know the exact branch etc. to check more closely.
It gives me this:
Successfully loaded models/nin_imagenet_conv.caffemodel
MODULE data UNDEFINED
warning: module 'data [type 5]' not found
/home/rob/torch/install/bin/luajit: models/train_val.prototxt.opencl.lua:4: bad argument #1 to 'insert' (table expected, got nil)
stack traceback:
[C]: in function 'insert'
models/train_val.prototxt.opencl.lua:4: in main chunk
[C]: in function 'dofile'
./loadcaffe_wrapper.lua:77: in function 'load'
neural_style.lua:66: in function 'main'
neural_style.lua:484: in main chunk
[C]: in function 'dofile'
.../rob/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670
Any clues?
I tried implementing OpenCL support, and the code is at: https://github.com/napsternxg/neural-style/tree/opencl
However, I get the following error when running the code:
I believe the issue is because of SpatialConvolutionMM, which is implemented in the ccn2 module.