mks0601 / 3DMPPE_POSENET_RELEASE

Official PyTorch implementation of "Camera Distance-aware Top-down Approach for 3D Multi-person Pose Estimation from a Single RGB Image", ICCV 2019
MIT License

About running time reported in the paper #39

Closed waghmaregovind closed 4 years ago

waghmaregovind commented 4 years ago

Greetings. In the supplementary material of the paper, Table 8 shows the running time of each component. My question is specifically about the inference time of RootNet and PoseNet. Since RootNet and PoseNet operate on bounding boxes, does "frame" mean a bounding box in this context? I think the reported numbers are per bounding box rather than per image, since a single image can contain multiple people.

Here are two examples to clarify further.

  1. Consider an image with 1 person in it. The total time will be 0.141 s.
  2. Consider an image with 3 people in it. As I understand it, DetectNet will take 0.12 s, RootNet will take 3 x 0.01 s = 0.03 s, and PoseNet will take 3 x 0.011 s = 0.033 s. So the total time for this image is 0.12 + 0.03 + 0.033 = 0.183 s. Is this understanding correct?
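
The arithmetic in the two examples can be written out as a small sketch. It assumes every bounding box is processed one after another with no batching, and it uses only the per-component times quoted in this comment; the constant and function names are made up for illustration.

```python
# Per-component times in seconds, as quoted in this thread (Table 8 of the
# supplementary material). The names below are illustrative, not from the repo.
DETECTNET_PER_IMAGE = 0.12
ROOTNET_PER_BOX = 0.01
POSENET_PER_BOX = 0.011


def sequential_time(num_people: int) -> float:
    """Estimated time per image if each box is processed one at a time."""
    per_box = ROOTNET_PER_BOX + POSENET_PER_BOX
    return DETECTNET_PER_IMAGE + num_people * per_box


print(f"1 person: {sequential_time(1):.3f} s")  # 1 person: 0.141 s
print(f"3 people: {sequential_time(3):.3f} s")  # 3 people: 0.183 s
```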

Thanks for releasing the source.

mks0601 commented 4 years ago

Hi, sorry for the unclear description of the table. As you said, the running times of RootNet and PoseNet are measured per bounding box.

However, their running times are not exactly proportional to the number of boxes in the image because of parallel processing. One can make a mini-batch from all bounding boxes of an image and feed the mini-batch to the models. This also applies to real-world scenarios, because all boxes detected by DetectNet can be fed to PoseNet and RootNet at once, unless there are so many people in the image that the GPU runs out of memory.

I checked that for PoseNet, processing 8 boxes in parallel takes 3 times longer than processing 1 box. For RootNet, processing 8 boxes in parallel takes 2 times longer than processing 1 box.
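
As a rough illustration of what these measurements imply, the sketch below compares a batched estimate with a one-box-at-a-time estimate for an image with 8 people. It assumes the single-box times quoted earlier in the thread (0.12 s for DetectNet, 0.01 s for RootNet, 0.011 s for PoseNet) and reads "3 times" and "2 times" as plain multipliers on the single-box time. Only batch sizes 1 and 8 were reported, so the sketch refuses to guess for other sizes. All names are hypothetical.

```python
# Single-box times in seconds, as quoted earlier in this thread.
DETECTNET_PER_IMAGE = 0.12
ROOTNET_PER_BOX = 0.01
POSENET_PER_BOX = 0.011

# Slowdown of one batched forward pass relative to a single box.
# Only batch sizes 1 and 8 were reported; reading "N times" as a plain
# multiplier is an assumption of this sketch.
ROOTNET_BATCH_FACTOR = {1: 1.0, 8: 2.0}
POSENET_BATCH_FACTOR = {1: 1.0, 8: 3.0}


def sequential_time(num_boxes: int) -> float:
    """Estimate when boxes are processed one at a time."""
    return DETECTNET_PER_IMAGE + num_boxes * (ROOTNET_PER_BOX + POSENET_PER_BOX)


def batched_time(num_boxes: int) -> float:
    """Estimate when all boxes of an image go through each model as one mini-batch."""
    if num_boxes not in ROOTNET_BATCH_FACTOR:
        raise ValueError("no measurement reported for this batch size")
    rootnet = ROOTNET_PER_BOX * ROOTNET_BATCH_FACTOR[num_boxes]
    posenet = POSENET_PER_BOX * POSENET_BATCH_FACTOR[num_boxes]
    return DETECTNET_PER_IMAGE + rootnet + posenet


print(f"sequential, 8 boxes: {sequential_time(8):.3f} s")  # sequential, 8 boxes: 0.288 s
print(f"batched, 8 boxes:    {batched_time(8):.3f} s")     # batched, 8 boxes:    0.173 s
```

Under these assumptions, the batched estimate for 8 people is about 0.173 s, against 0.288 s for the one-at-a-time estimate.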

I hope this answers your question.

waghmaregovind commented 4 years ago

Your response does clarify the issue. As you pointed out, my example 2 does not account for parallel processing, and the running time should be considered per batch. Thanks for the prompt response and the additional run-time evaluations.