mitmul / deeppose

DeepPose implementation in Chainer
http://static.googleusercontent.com/media/research.google.com/ja//pubs/archive/42237.pdf
GNU General Public License v2.0
408 stars 129 forks

Add minimum memory requirements to readme #6

Closed kylemcdonald closed 9 years ago

kylemcdonald commented 9 years ago

I tried running this on a basic card with 2GB of GPU RAM, but I get an out-of-memory error as the training script is starting up. What is the minimum requirement to run this on the FLIC data?

kyle:deeppose kyle$ python scripts/train.py --model models/AlexNet_flic.py --gpu 0 --epoch 1000 --batchsize 128 --prefix AlexNet_LCN_AdaGrad_lr-0.0005 --snapshot 10 --datadir data/FLIC-full --channel 3 --flip True --size 220 --crop_pad_inf 1.5 --crop_pad_sup 2.0 --shift 5 --lcn True --joint_num 7
Traceback (most recent call last):
  File "scripts/train.py", line 246, in <module>
    trans, args, input_q, data_q)
  File "scripts/train.py", line 131, in train
    loss, pred = model.forward(input_data, label, train=True)
  File "models/AlexNet_flic.py", line 34, in forward
    h = F.local_response_normalization(h)
  File "/usr/local/lib/python2.7/site-packages/chainer-1.0.1-py2.7.egg/chainer/functions/local_response_normalization.py", line 123, in local_response_normalization
    return LocalResponseNormalization(n, k, alpha, beta)(x)
  File "/usr/local/lib/python2.7/site-packages/chainer-1.0.1-py2.7.egg/chainer/function.py", line 163, in __call__
    outputs = self.forward(in_data)
  File "/usr/local/lib/python2.7/site-packages/chainer-1.0.1-py2.7.egg/chainer/function.py", line 199, in forward
    return self.forward_gpu(inputs)
  File "/usr/local/lib/python2.7/site-packages/chainer-1.0.1-py2.7.egg/chainer/functions/local_response_normalization.py", line 68, in forward_gpu
    self.y = x[0] * x[0]  # temporary
  File "/usr/local/lib/python2.7/site-packages/chainer-1.0.1-py2.7.egg/chainer/cuda.py", line 718, in new_op
    return raw_op(self, other)
  File "/usr/local/lib/python2.7/site-packages/pycuda/gpuarray.py", line 468, in __mul__
    result = self._new_like_me(_get_common_dtype(self, other))
  File "/usr/local/lib/python2.7/site-packages/pycuda/gpuarray.py", line 401, in _new_like_me
    allocator=self.allocator, strides=strides)
  File "/usr/local/lib/python2.7/site-packages/pycuda/gpuarray.py", line 204, in __init__
    self.gpudata = self.allocator(self.size * self.dtype.itemsize)
  File "/usr/local/lib/python2.7/site-packages/chainer-1.0.1-py2.7.egg/chainer/cuda.py", line 352, in mem_alloc
    allocation = pool.allocate(nbytes)
pycuda._driver.MemoryError: memory_pool::allocate failed: out of memory - failed to free memory for allocation
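One generic way to cope with this kind of out-of-memory failure (a sketch, not part of this repository; `train_step` is a hypothetical callable standing in for one forward/backward pass) is to halve the batch size until a step fits in GPU memory:

```python
def find_workable_batchsize(train_step, batchsize=128, min_batchsize=8):
    """Halve the batch size until one training step succeeds.

    `train_step` is a hypothetical callable that takes a batch size and
    raises MemoryError when the GPU cannot hold the batch (as pycuda
    does above via pycuda._driver.MemoryError, a MemoryError subclass).
    """
    while batchsize >= min_batchsize:
        try:
            train_step(batchsize)
            return batchsize
        except MemoryError:
            # Batch did not fit; try half the size.
            batchsize //= 2
    raise MemoryError("even batchsize=%d does not fit" % min_batchsize)
```

In practice you would simply rerun `train.py` with a smaller `--batchsize`, as discussed below, but the loop shows the search strategy.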
mitmul commented 9 years ago

That depends on the batchsize. If you run train.py with the sample parameters written in the README, the batchsize is 128, so the required GPU memory is over 2870MB. If you run train.py with `--batchsize 64`, like:

python scripts/train.py \
--model models/AlexNet_flic.py \
--gpu 0 \
--epoch 1000 \
--batchsize 64 \
--prefix AlexNet_LCN_AdaGrad_lr-0.0005 \
--snapshot 10 \
--datadir data/FLIC-full \
--channel 3 \
--flip True \
--size 220 \
--crop_pad_inf 1.5 \
--crop_pad_sup 2.0 \
--shift 5 \
--lcn True \
--joint_num 7

the required GPU memory is about 1890MB. If it still causes a memory error, please decrease the batchsize further (e.g., `--batchsize 32`).
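Assuming memory usage scales roughly linearly with batch size, the two figures above (≈2870MB at 128, ≈1890MB at 64) imply about 15.3MB per sample plus ≈910MB of fixed overhead. A small sketch of that estimate (the constants are fitted to those two reported numbers; actual usage will vary with the Chainer version and model):

```python
def estimate_gpu_mem_mb(batchsize,
                        fixed_mb=910.0,        # assumed fixed overhead (weights, workspace)
                        per_sample_mb=15.3125  # assumed per-sample activation cost
                        ):
    """Rough linear estimate of GPU memory (MB) needed by train.py.

    Fitted to the figures reported above: batchsize 128 -> ~2870MB,
    batchsize 64 -> ~1890MB.
    """
    return fixed_mb + per_sample_mb * batchsize

for bs in (128, 64, 32):
    print(bs, round(estimate_gpu_mem_mb(bs)))
```

By this estimate, `--batchsize 32` needs roughly 1400MB, which is consistent with it fitting on a 2GB card.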

mitmul commented 9 years ago

I should add the above comments to the README.

mitmul commented 9 years ago

added: https://github.com/mitmul/deeppose/commit/568c8a57d382d8e68f3e3b1c6f2d15a6bf9d2d48

kylemcdonald commented 9 years ago

On my MacBook Pro with an NVIDIA 750M, a batch size of 64 was still too big, but 32 was small enough. Thanks a bunch for the clarification!