jcjohnson / neural-style

Torch implementation of neural style algorithm
MIT License

THCudaCheck FAIL file=/home/mike/torch/extra/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory #421

Open wungky opened 6 years ago

wungky commented 6 years ago

Dear neural-style enthusiasts:

PREFACE

I am completely new to the world of linux and computing, better known in these parts as a "noob." I hope you all can be patient with my questions.

I'm dual-booting Ubuntu 16.04.3 LTS with Windows 10 on a 1 TB hard drive, with 2x Nvidia GTX 1050s, in the hopes of generating neural-style images.

With much luck and poking around this repository, I finally have all dependencies installed and running.

//

I found that I am only able to run the basic neural_style.lua script with an absolute path; it doesn't work with a relative path. So, I modified my command from

th neural_style.lua -style_image my_style.jpg -content_image my_content.jpg

to

th neural_style.lua -style_image /home/mike/neural-style/examples/inputs/brad_pitt.jpg -content_image /home/mike/neural-style/examples/inputs/picasso_selfport1907.jpg
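(A side note on the path issue: `th` resolves relative paths against the shell's current working directory, not against the script's location, so the relative form should also work when launched from the repo root. A quick demonstration with throwaway files, using hypothetical /tmp paths:)

```shell
# Simulate the repo layout with throwaway files (paths are hypothetical):
mkdir -p /tmp/repo-demo/examples/inputs
touch /tmp/repo-demo/examples/inputs/brad_pitt.jpg

# From the "repo root", the relative path resolves:
cd /tmp/repo-demo
ls examples/inputs/brad_pitt.jpg

# From anywhere else, the same relative path finds nothing:
cd /tmp
ls examples/inputs/brad_pitt.jpg 2>/dev/null || echo "not found from /tmp"
```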

This yields what seems to be a series of issues:

[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message. If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Setting up style layer 2 : relu1_1
Setting up style layer 7 : relu2_1
Setting up style layer 12 : relu3_1
Setting up style layer 21 : relu4_1
Setting up content layer 23 : relu4_2
Setting up style layer 30 : relu5_1
Capturing content targets
nn.Sequential { [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> (31) -> (32) -> (33) -> (34) -> (35) -> (36) -> (37) -> output] (1): nn.TVLoss (2): nn.SpatialConvolution(3 -> 64, 3x3, 1,1, 1,1) (3): nn.ReLU (4): nn.StyleLoss (5): nn.SpatialConvolution(64 -> 64, 3x3, 1,1, 1,1) (6): nn.ReLU (7): nn.SpatialMaxPooling(2x2, 2,2) (8): nn.SpatialConvolution(64 -> 128, 3x3, 1,1, 1,1) (9): nn.ReLU (10): nn.StyleLoss (11): nn.SpatialConvolution(128 -> 128, 3x3, 1,1, 1,1) (12): nn.ReLU (13): nn.SpatialMaxPooling(2x2, 2,2) (14): nn.SpatialConvolution(128 -> 256, 3x3, 1,1, 1,1) (15): nn.ReLU (16): nn.StyleLoss (17): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1) (18): nn.ReLU (19): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1) (20): nn.ReLU (21): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1) (22): nn.ReLU (23): nn.SpatialMaxPooling(2x2, 2,2) (24): nn.SpatialConvolution(256 -> 512, 3x3, 1,1, 1,1) (25): nn.ReLU (26): nn.StyleLoss (27): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1) (28): nn.ReLU (29): nn.ContentLoss (30): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1) (31): nn.ReLU (32): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1) (33): nn.ReLU (34): nn.SpatialMaxPooling(2x2, 2,2) (35): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1) (36): nn.ReLU (37): nn.StyleLoss }
THCudaCheck FAIL file=/home/mike/torch/extra/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
/home/mike/torch/install/bin/luajit: /home/mike/torch/install/share/lua/5.1/nn/Container.lua:67: In 5 module of nn.Sequential:
/home/mike/torch/install/share/lua/5.1/nn/THNN.lua:110: cuda runtime error (2) : out of memory at /home/mike/torch/extra/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
  [C]: in function 'v'
  /home/mike/torch/install/share/lua/5.1/nn/THNN.lua:110: in function 'SpatialConvolutionMM_updateOutput'
  ...ke/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:79: in function <...ke/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:76>
  [C]: in function 'xpcall'
  /home/mike/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
  /home/mike/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
  neural_style.lua:162: in function 'main'
  neural_style.lua:601: in main chunk
  [C]: in function 'dofile'
  ...mike/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
  [C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
  [C]: in function 'error'
  /home/mike/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
  /home/mike/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
  neural_style.lua:162: in function 'main'
  neural_style.lua:601: in main chunk
  [C]: in function 'dofile'
  ...mike/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
  [C]: at 0x00405d50

ERRORS

The main error seems to be that I am running out of memory as noted here:

/home/mike/torch/install/share/lua/5.1/nn/THNN.lua:110: cuda runtime error (2) : out of memory at /home/mike/torch/extra/cutorch/lib/THC/generic/THCStorage.cu:66

However, when I monitor my GPU status via "nvidia-smi", my GTX 1050s don't appear to be maxing out their VRAM.
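For reference, a single manual check of nvidia-smi can easily miss the moment of failure: the allocation that triggers the OOM can spike and be freed in well under a second. Polling continuously makes the spike visible (the query flags below are standard nvidia-smi options):

```shell
# Refresh nvidia-smi once per second while the job runs:
watch -n 1 nvidia-smi

# Or log just the memory columns once per second:
nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 1
```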

And it also appears that there may be other errors per these messages:

[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message. If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192

and

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
  [C]: in function 'error'
  /home/mike/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
  /home/mike/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
  neural_style.lua:162: in function 'main'
  neural_style.lua:601: in main chunk
  [C]: in function 'dofile'
  ...mike/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
  [C]: at 0x00405d50

CONCLUSION

I would be most grateful if anyone could illuminate possible reasons for why I am seeing these errors and warnings.

Thank you!

darkstorm2150 commented 6 years ago

It seems the default settings are causing your error. Try adding -image_size 512 or lower to your parameters:

-image_size 400, or -image_size 300, etc

This error happens to me almost daily because I push the limits of the GPUs. Cheers!
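The reason a smaller -image_size helps is that activation memory grows roughly with the square of the image size, so halving it roughly quarters VRAM use. A back-of-the-envelope sketch (the 64-feature-map figure is an assumption based on VGG-19's first conv layers, not neural-style's actual accounting):

```shell
# Rough estimate of one early VGG-19 conv layer's float32 activations:
# image_size^2 pixels x 64 feature maps x 4 bytes, reported in MiB.
estimate_mib() {
  size=$1
  echo $(( size * size * 64 * 4 / 1024 / 1024 ))
}

for s in 512 400 300 100; do
  echo "-image_size $s -> ~$(estimate_mib $s) MiB per early conv layer"
done
```

Multiply by the dozens of layers the network keeps resident and it is easy to see how a 2 GB card runs out at the default size.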

wungky commented 6 years ago

Dear vic8760!

Thank you for your prompt reply! After changing my parameters to -image_size 100, I got results!

Interestingly, I still get this warning message:

[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message. If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h. [libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192

Do you know what this means?

darkstorm2150 commented 6 years ago

Good to hear!

I receive that warning message all the time too; it doesn't affect the output. I think it's possible to disable it when compiling protobuf, but that's another subject, which I have no clue on. :)

VaKonS commented 6 years ago

@wungky, you can also try smaller models. It will make libprotobuf happy (it won't complain about the model's size), and, though I'm not sure, maybe neural-style will use less memory too.

One such small model is "VGG_ILSVRC_19_layers_conv.caffemodel" from "BETHGE LAB". You will need to:

Also, there is a list of other models to try in neural-style Wiki.
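For completeness, neural-style selects the model via its -model_file and -proto_file command-line flags, so pointing it at a smaller caffemodel would look something like the sketch below. The .prototxt name here is a guess for illustration; check the model's download page for the matching deploy file:

```shell
# Hypothetical invocation with a smaller model; the exact file names
# depend on what the model download actually provides:
th neural_style.lua \
  -model_file models/VGG_ILSVRC_19_layers_conv.caffemodel \
  -proto_file models/VGG_ILSVRC_19_layers_deploy.prototxt \
  -style_image examples/inputs/picasso_selfport1907.jpg \
  -content_image examples/inputs/brad_pitt.jpg
```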