DmitryUlyanov / online-neural-doodle

Feedforward neural doodle

Out of memory error #1

Open ghost opened 8 years ago

ghost commented 8 years ago

When I execute

CUDA_VISIBLE_DEVICES=0 th feedforward_neural_doodle.lua -model_name skip_noise_4 -masks_hdf5 data/starry/gen_doodles.hdf5 -batch_size 4 -num_mask_noise_times 0 -num_noise_channels 0 -learning_rate 1e-1 -half false

I get the following result:

[libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message. If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded data/pretrained/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Setting up style layer 2 : relu1_1
Replacing max pooling at layer 5 with average pooling
Setting up style layer 7 : relu2_1
Replacing max pooling at layer 10 with average pooling
Setting up style layer 12 : relu3_1
Replacing max pooling at layer 19 with average pooling
Setting up style layer 21 : relu4_1
Replacing max pooling at layer 28 with average pooling
Setting up style layer 30 : relu5_1
Optimize
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-7288/cutorch/lib/THC/generic/THCStorage.cu line=41 error=2 : out of memory
/home/andrew/torch/install/bin/luajit: /home/andrew/torch/install/share/lua/5.1/nn/Container.lua:67:
In 3 module of nn.Sequential:
In 1 module of nn.Sequential:
/home/andrew/torch/install/share/lua/5.1/nn/THNN.lua:109: cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-7288/cutorch/lib/THC/generic/THCStorage.cu:41
stack traceback:
  [C]: in function 'v'
  /home/andrew/torch/install/share/lua/5.1/nn/THNN.lua:109: in function 'SpatialReplicationPadding_updateGradInput'
  ...h/install/share/lua/5.1/nn/SpatialReplicationPadding.lua:41: in function 'updateGradInput'
  /home/andrew/torch/install/share/lua/5.1/nn/Module.lua:31: in function </home/andrew/torch/install/share/lua/5.1/nn/Module.lua:29>
  [C]: in function 'xpcall'
  /home/andrew/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
  /home/andrew/torch/install/share/lua/5.1/nn/Sequential.lua:88: in function </home/andrew/torch/install/share/lua/5.1/nn/Sequential.lua:78>
  [C]: in function 'xpcall'
  /home/andrew/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
  /home/andrew/torch/install/share/lua/5.1/nn/Sequential.lua:84: in function 'backward'
  feedforward_neural_doodle.lua:167: in function 'opfunc'
  /home/andrew/torch/install/share/lua/5.1/optim/adam.lua:33: in function 'optim_method'
  feedforward_neural_doodle.lua:199: in main chunk
  [C]: in function 'dofile'
  ...drew/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
  [C]: at 0x00406670

I'm running with multiple GTX 980s, so GPU memory should not be an issue.
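For reference, the free memory on each card can be checked from Torch with cutorch's getMemoryUsage (a quick diagnostic sketch, not code from this project):

require 'cutorch'
-- Print free vs. total memory for every GPU cutorch can see.
for dev = 1, cutorch.getDeviceCount() do
  local freeBytes, totalBytes = cutorch.getMemoryUsage(dev)
  print(string.format('GPU %d: %.2f GB free of %.2f GB',
      dev, freeBytes / 2^30, totalBytes / 2^30))
end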

I have tried running with both -backend cudnn and -backend nn, with no difference in the outcome.

I have been able to run the fast-neural-doodle project without problems on this machine, so prerequisites such as Python, Torch and CUDA appear to be set up correctly.

Any idea what is causing this problem?

DmitryUlyanov commented 8 years ago

Hello, I tested everything using a 12 GB card, so all the parameters are tuned to work in my setup. You can try decreasing batch_size to 1 to see whether it still fails, but training will be much worse with such a small batch size.
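For example, the same command as above with only the batch size changed:

CUDA_VISIBLE_DEVICES=0 th feedforward_neural_doodle.lua -model_name skip_noise_4 -masks_hdf5 data/starry/gen_doodles.hdf5 -batch_size 1 -num_mask_noise_times 0 -num_noise_channels 0 -learning_rate 1e-1 -half false

(The -half flag from the original command may also be worth experimenting with, since half precision generally lowers memory use.)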

You can also reduce the image size to decrease memory consumption. I used 512px images, but you can go to any dimensions that are a multiple of 32; try 384px, for example, with the same batch_size.
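As an illustration of the factor-of-32 constraint, here is a hypothetical helper (not part of this repository) that rounds a requested size down to a valid dimension:

-- Hypothetical helper: round an image dimension down to the nearest
-- multiple of 32, as required by the factor-of-32 constraint above.
local function round_down_to_32(size)
  return math.floor(size / 32) * 32
end

print(round_down_to_32(512)) -- 512 (already valid)
print(round_down_to_32(400)) -- 384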