facebookresearch / DensePose

A real-time approach for mapping all human pixels of 2D RGB images to a 3D surface-based model of the body
http://densepose.org

Frame rate too low #158

Open rafikg opened 5 years ago

rafikg commented 5 years ago

@nvrv @ralpguler I installed the DensePose project correctly and it passes all the tests. I wrote an infer_simple_webcam script to run the code using a webcam. With a 240x320 image (as indicated in the original paper), I get a very low frame rate of ~5-6 fps. However, the original paper reports 20-26 fps. Note: I got these warnings when running the code:

WARNING cnn.py:  25: [====DEPRECATE WARNING====]: you are creating an object from CNNModelHelper class which will be deprecated soon. Please use ModelHelper object with brew module. For more information, please refer to caffe2.ai and python/brew.py, python/brew_test.py for more information.
INFO net.py:  51: Loading weights from: /tmp/detectron-download-cache/DensePose_ResNet101_FPN_s1x-e2e.pkl
I1127 15:42:31.615272 11619 operator.cc:195] Engine CUDNN is not available for operator MaxPool.
I1127 15:42:31.620499 11619 operator.cc:195] Engine CUDNN is not available for operator MaxPool.
I1127 15:42:31.621470 11619 net_dag_utils.cc:102] Operator graph pruning prior to chain compute took: 6.8039e-05 secs
I1127 15:42:31.624418 11619 operator.cc:195] Engine CUDNN is not available for operator MaxPool.

I1127 15:42:31.630476 11619 operator.cc:195] Engine CUDNN is not available for operator MaxPool.
I1127 15:42:31.630641 11619 net_dag_utils.cc:102] Operator graph pruning prior to chain compute took: 5.4918e-05 secs
I1127 15:42:31.632478 11619 net_dag_utils.cc:102] Operator graph pruning prior to chain compute took: 1.3686e-05 secs
I1127 15:42:33.769708 11619 net_async_base.h:201] Using specified CPU pool size: 4; device id: -1
I1127 15:42:33.769724 11619 net_async_base.h:206] Created new CPU pool, size: 4; device id: -1
I1127 15:42:34.504770 11619 net_async_base.h:201] Using specified CPU pool size: 4; device id: -1
I1127 15:42:34.504787 11619 net_async_base.h:206] Created new CPU pool, size: 4; device id: -1

therobotprogrammer commented 5 years ago

I'm having the same issue on a 2080 Ti. GPU utilization is only 20-28% and only about 50% of the memory is used (5487 MB). My inference time is 0.18 to 0.2 s per frame, i.e. roughly 5 to 6 FPS. I think it's using a batch size of 1, and I'm not sure how to increase it. I tried increasing IMS_PER_BATCH from 3 to 16 or 32 in the config YAML file, but that had no effect.
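
For anyone comparing numbers in this thread, here is a minimal sketch of how the per-frame latency could be measured and converted to FPS. It assumes nothing about DensePose itself: `infer_fn` is a hypothetical stand-in for whatever function runs the model on one frame, and `measure_latency` and `latency_to_fps` are illustrative names, not part of the project. The first few frames are skipped because they may include one-time setup cost, which would otherwise skew the average.

```python
import time
from statistics import mean


def latency_to_fps(seconds_per_frame):
    """Convert a per-frame latency in seconds to frames per second."""
    return 1.0 / seconds_per_frame


def measure_latency(infer_fn, frames, warmup=5):
    """Return the mean per-frame latency of infer_fn in seconds.

    infer_fn is a hypothetical callable that processes a single frame;
    it is NOT a DensePose API. The first `warmup` frames are run but
    excluded from the average.
    """
    timings = []
    for i, frame in enumerate(frames):
        start = time.perf_counter()
        infer_fn(frame)
        elapsed = time.perf_counter() - start
        if i >= warmup:
            timings.append(elapsed)
    if not timings:
        raise ValueError("need more frames than warm-up iterations")
    return mean(timings)


# The latencies reported above: 0.2 s and 0.18 s per frame.
print(round(latency_to_fps(0.2), 2))   # 5.0
print(round(latency_to_fps(0.18), 2))  # 5.56
```

Timing only the inference call, separately from webcam capture and visualization, should show whether the model or the surrounding script is the bottleneck.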

rafikg commented 5 years ago

@therobotprogrammer, I think you should forget about this project. It is not real-time at all. More than two other people and I have tried it, and we all got a frame rate that was far too low.