CMU-Perceptual-Computing-Lab / caffe_rtpose

Realtime C++ code for multi-person pose estimation

sporadic GPU out of memory #11

Closed marketto89 closed 7 years ago

marketto89 commented 7 years ago

Hi everyone,

Thank you for sharing this amazing work!

I am opening this issue because I am experiencing a sporadic GPU out of memory error.

It was not surprising to see it the first time, because I am using a laptop with an NVIDIA Quadro K1100M, which has only 2 GB of RAM.

Nevertheless, the weird thing is that after logging out and back in, the error disappears and I can run the rtpose binary (even at the resolution that should require 3 GB of RAM). Do you have an explanation for this? How can I avoid having to log out and back in every time I see the error?

I am running the rtpose demo on Linux Mint 17.3 (derived from Ubuntu 14.04). Here is the stack trace:

F0123 10:43:52.732602  6844 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0)  out of memory
*** Check failure stack trace: ***
    @     0x7fabd153fdaa  (unknown)
    @     0x7fabd153fce4  (unknown)
    @     0x7fabd153f6e6  (unknown)
    @     0x7fabd1542687  (unknown)
    @     0x7fabd1d1f962  caffe::SyncedMemory::to_gpu()
    @     0x7fabd1d1ecd9  caffe::SyncedMemory::mutable_gpu_data()
    @     0x7fabd1cfb1a2  caffe::Blob<>::mutable_gpu_data()
    @     0x7fabd1db0080  caffe::NmsLayer<>::Forward_gpu()
    @     0x7fabd1d2cc1b  caffe::Net<>::ForwardFromTo()
    @           0x409bcf  warmup()
    @           0x40fc2b  processFrame()
    @     0x7fabcf9ac184  start_thread
    @     0x7fabcf6d937d  (unknown)
    @              (nil)  (unknown)
Aborted

gineshidalgo99 commented 7 years ago

Our software itself requires around 2080 MB of GPU memory for the default configuration on Ubuntu 16. We found that this number varies depending on OpenCV and other third-party software used, so on your PC it might fit in your 2 GB of GPU memory.

However, other background software and the OS itself also use the GPU. On my Ubuntu 16 machine, around 600 MB of GPU memory is continuously allocated to other background software (run nvidia-smi to check this).

So in your case, the program might run out of memory when other applications are taking resources from the GPU, while it might work right after you start your user session because your PC might not have allocated those GPU resources yet.
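To see whether this is what is happening before launching the demo, the nvidia-smi check can be scripted. The following is only a rough sketch, not part of rtpose: the helper names (`parse_free_mib`, `query_free_mib`, `fits`) are made up for illustration, and the 2080 threshold is simply the approximate figure quoted above, which will differ between setups. It assumes nvidia-smi is on the PATH and reports numeric memory values in MiB; some GPUs report "[N/A]" for these fields, which this sketch does not handle.

```python
import subprocess

# Approximate requirement quoted above for the default configuration.
# Treat it as an assumption: the real figure varies between setups.
REQUIRED_MIB = 2080


def parse_free_mib(csv_output):
    """Turn 'used, total' lines (MiB, one line per GPU) into free MiB per GPU."""
    free = []
    for line in csv_output.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        free.append(total - used)
    return free


def query_free_mib():
    """Ask nvidia-smi for used/total memory and return free MiB per GPU."""
    out = subprocess.check_output(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        text=True,
    )
    return parse_free_mib(out)


def fits(free_mib, required_mib=REQUIRED_MIB):
    """Return True when the free memory covers the assumed requirement."""
    return free_mib >= required_mib
```

On a machine with the NVIDIA driver installed, calling `query_free_mib()` and passing each value to `fits()` gives a quick yes/no answer. For instance, a 2048 MiB card with 600 MiB already in use leaves 1448 MiB free, which is below the assumed requirement.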

I am closing this issue, but please post again if you have any further problem!