CMU-Perceptual-Computing-Lab / caffe_rtpose

Realtime C++ code for multi-person pose estimation

12G Titan X out of memory? #3

Closed yq1011 closed 7 years ago

yq1011 commented 7 years ago

Hi,

Thanks for sharing your code!

After I built it and tried to run the demo, I got an OOM error. I tried several videos and pictures and the results were the same. That's really weird, since the Titan X has 12 GB of CUDA memory. How much memory does the demo need?

Thanks

ZheC commented 7 years ago

Thanks for reporting the issue.

It only requires 2~3 GB GPU memory if you run the code on an HD video with the option --net_resolution 656x368. Could you share your command for running the code?
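One way to rule out other processes already holding GPU memory is to check how much is free before launching the demo. The sketch below is only an illustration: `enough_free_mib` is a hypothetical helper, not part of caffe_rtpose, and the 3000 MiB threshold is an assumption taken from the 2~3 GB figure above.

```shell
# Hypothetical helper (not part of caffe_rtpose): succeeds if the first
# value read from stdin, in MiB, is at least the required amount in $1.
enough_free_mib() {
  required="$1"
  # Read one integer MiB value; fail if stdin is empty.
  read -r free_mib || return 1
  [ "$free_mib" -ge "$required" ]
}
```

Typical use, assuming `nvidia-smi` is on the PATH: `nvidia-smi --query-gpu=memory.free --format=csv,noheader,nounits | enough_free_mib 3000 && echo "enough free GPU memory"`. With several GPUs, only the first line of output is checked.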

yq1011 commented 7 years ago

Yes, thanks for your help!

My command is: ./build/examples/rtpose/rtpose.bin -video ../../dance.mp4

And adding the --net_resolution offers no help:

$ ./build/examples/rtpose/rtpose.bin --video ../../dance.mp4 --net_resolution 656x368
E1230 14:46:26.032272 15582 rtpose.cpp:1507] Finish spawning 1 threads. now waiting.
F1230 14:46:29.070701 15615 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
Check failure stack trace:
    @     0x7f72543dadaa  (unknown)
    @     0x7f72543dace4  (unknown)
    @     0x7f72543da6e6  (unknown)
    @     0x7f72543dd687  (unknown)
    @     0x7f7254bdb442  caffe::SyncedMemory::to_gpu()
    @     0x7f7254bda7b9  caffe::SyncedMemory::mutable_gpu_data()
    @     0x7f7254a25132  caffe::Blob<>::mutable_gpu_data()
    @     0x7f7254a98688  caffe::BaseConvolutionLayer<>::forward_gpu_gemm()
    @     0x7f7254c410f6  caffe::ConvolutionLayer<>::Forward_gpu()
    @     0x7f7254ba39e5  caffe::Net<>::ForwardFromTo()
    @           0x40a04f  warmup()
    @           0x41030b  processFrame()
    @     0x7f7252848184  start_thread
    @     0x7f725257537d  (unknown)
    @              (nil)  (unknown)
Aborted (core dumped)

tsimk commented 7 years ago

Download cuDNN and enable it in your Makefile.config (it will also run faster). I've encountered this when using Caffe's vanilla convolution, although I'm not quite sure why this happens.

yq1011 commented 7 years ago

I see. I didn't enable cuDNN in my own Makefile.config. That should be the reason.
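For anyone landing here with the same error, a minimal sketch of the fix discussed above, assuming your Makefile.config was copied from Caffe's Makefile.config.example and still contains the commented-out line `# USE_CUDNN := 1`. The function name `enable_cudnn` is hypothetical, introduced only for illustration.

```shell
# Hypothetical helper: uncomment the USE_CUDNN switch in a Caffe-style
# Makefile.config given as $1. All other lines pass through unchanged.
enable_cudnn() {
  config="$1"
  # Write to a temporary file and move it back, so the same command
  # works with both GNU sed (Linux) and BSD sed (macOS).
  sed 's/^#[[:space:]]*USE_CUDNN[[:space:]]*:=[[:space:]]*1/USE_CUDNN := 1/' \
    "$config" > "$config.tmp" &&
    mv "$config.tmp" "$config"
}
```

After running `enable_cudnn Makefile.config`, rebuild from scratch with `make clean && make all` so that every layer is recompiled against cuDNN.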

gineshidalgo99 commented 7 years ago

I hope the problem was fixed after enabling cuDNN. I will close this issue thread. If it still runs out of memory on the Titan X after enabling cuDNN, please let us know!

orgicus commented 7 years ago

I'm trying to run the demo on a MacBook with a GeForce 750M GPU (2GB VRAM), with GPU/cuDNN enabled. Even if I turn off all programs except Terminal and set the settings to low, I still can't run the demo:

./build/examples/rtpose/rtpose.bin -caffemodel ./model/coco/pose_iter_440000.caffemodel -caffeproto ./model/coco/pose_deploy_linevec.prototxt -camera_resolution "40x30" -camera 0 -resolution "40x30" -start_scale 0.1 -num_scales=0 -no_display true -net_resolution "16x16"
F0401 11:03:52.527647 528384 cudnn_relu_layer.cpp:13] Check failed: status == CUDNN_STATUS_SUCCESS (4 vs. 0)  CUDNN_STATUS_INTERNAL_ERROR
*** Check failure stack trace: ***
    @        0x10a3af9da  google::LogMessage::Fail()
    @        0x10a3af0d5  google::LogMessage::SendToLog()
    @        0x10a3af63b  google::LogMessage::Flush()
    @        0x10a3b2a17  google::LogMessageFatal::~LogMessageFatal()
    @        0x10a3afcc7  google::LogMessageFatal::~LogMessageFatal()
    @        0x1054992be  caffe::CuDNNReLULayer<>::LayerSetUp()
    @        0x1055162a0  caffe::Net<>::Init()
    @        0x105517bde  caffe::Net<>::Net()
    @        0x1053b0248  warmup()
    @        0x1053ba11d  processFrame()
    @     0x7fff8b11899d  _pthread_body
    @     0x7fff8b11891a  _pthread_start
    @     0x7fff8b116351  thread_start
Abort trap: 6

Any hints on how I could run the demo on this hardware?

gineshidalgo99 commented 7 years ago

Is cuDNN installed on your PC? You need to install it and then recompile all our code (make clean && make all).

orgicus commented 7 years ago

Hi,

I have installed cuDNN and I did enable it when I compiled.

The example config I used is this, the install script is this, and I've made minor tweaks to the Makefile (removing -fopenmp from compilation and linking, and -pthread from linking).

What should I be tweaking?