Closed — zyz207 closed this issue 6 years ago
If that is the total run time of the program, it is OK, because it loads the 200 MB model into the process.
If that is the time of a single inference, it is not; check that TensorFlow is really using your GPU. See this example line from my log:
2017-11-17 16:42:05.460780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1120] Creating TensorFlow device (/device:GPU:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:01:00.0, compute capability: 6.1)
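If you don't see such a line, a quick way to check is to ask TensorFlow which devices it can see. This is a minimal sketch assuming a TF 1.x install (`device_lib` is the TF 1.x internal client API); the function name `visible_gpus` is my own, not from this repo:

```python
def visible_gpus():
    """Return names of GPU devices TensorFlow can see, or None if TF is not installed."""
    try:
        # TF 1.x-era API for enumerating local devices
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"]

if __name__ == "__main__":
    gpus = visible_gpus()
    if gpus is None:
        print("TensorFlow is not installed in this environment")
    elif gpus:
        print("GPU(s) visible to TensorFlow:", gpus)
    else:
        print("No GPU visible; TensorFlow will fall back to CPU")
```

An empty list here usually means a CUDA/cuDNN mismatch or a CPU-only TensorFlow build.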
I've also done some testing with a 1080 Ti on a 720x1280 video: it takes about 1 s for the model to feed forward over 4 scales and another 1 s to compute the part affinity fields. By manually reducing the resolution to 368x654, the total processing time comes down to about 0.65 s, but it is still much slower than the Caffe version.
On a second look, almost half of the time in the predict loop is spent on post-processing, i.e. resizing the heatmaps and part affinity fields. The high number of channels (19 for the heatmaps, 38 for the PAFs), plus the use of the slower bicubic interpolation algorithm, could be the cause of the slowdown. One way to improve this could be to use a CUDA version of the resize function instead, but I haven't measured the difference yet.
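As a rough illustration of why the interpolation choice matters, here is a pure-NumPy sketch (not this repo's actual code) that upsamples a stride-8 output tensor of 57 channels (19 heatmap + 38 PAF) with nearest-neighbor vs. bilinear interpolation; the sizes mirror the 368x654 case, and the timings only show the relative cost, not the repo's real numbers:

```python
import time
import numpy as np

def upsample_nearest(x, out_h, out_w):
    """Nearest-neighbor upsampling of an (H, W, C) array via index replication."""
    h, w = x.shape[:2]
    rows = np.arange(out_h) * h // out_h
    cols = np.arange(out_w) * w // out_w
    return x[rows][:, cols]

def upsample_bilinear(x, out_h, out_w):
    """Bilinear upsampling of an (H, W, C) array using separable weights."""
    h, w = x.shape[:2]
    r = np.linspace(0, h - 1, out_h)
    c = np.linspace(0, w - 1, out_w)
    r0 = np.floor(r).astype(int); r1 = np.minimum(r0 + 1, h - 1)
    c0 = np.floor(c).astype(int); c1 = np.minimum(c0 + 1, w - 1)
    fr = (r - r0)[:, None, None]  # fractional row weights
    fc = (c - c0)[None, :, None]  # fractional column weights
    top = x[r0][:, c0] * (1 - fc) + x[r0][:, c1] * fc
    bot = x[r1][:, c0] * (1 - fc) + x[r1][:, c1] * fc
    return top * (1 - fr) + bot * fr

if __name__ == "__main__":
    # 368x654 input at network stride 8 -> roughly 46x82 output maps, 57 channels
    maps = np.random.rand(46, 82, 57).astype(np.float32)
    for name, fn in [("nearest", upsample_nearest), ("bilinear", upsample_bilinear)]:
        t0 = time.perf_counter()
        fn(maps, 368, 654)
        print(f"{name}: {time.perf_counter() - t0:.4f} s")
```

Bicubic needs a 4x4 neighborhood per output pixel instead of bilinear's 2x2, so it is more expensive again; repeated over 57 channels per frame, that difference adds up, which is why a cheaper interpolation or a GPU-side resize can help.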
It takes about 5 s when I run demo_image; my GPU is a 1080 Ti.