[Question] What kind of idea do you base the estimation of depth

ibaiGorordo / ONNX-Mobile-Human-Pose-3D

Python scripts for performing 3D human pose estimation using the Mobile Human Pose model in ONNX.

MIT License


[Question] What kind of idea do you base the estimation of depth #4

Closed iwatake2222 closed 2 years ago

iwatake2222 commented 2 years ago

Thank you for sharing. Could you tell me the idea behind the following line? https://github.com/ibaiGorordo/ONNX-Mobile-Human-Pose-3D/blob/main/mobileHumanPose/mobileHumanPose.py#L161

I understand that you need abs_depth. One possible solution would be to use the average height of a person:

focal_length (fixed value) : abs_depth (unknown value) = height_in_pixel (known value) : average_height (fixed value)

Solving this proportion gives abs_depth = focal_length * average_height / height_in_pixel. (It may not work well, though.)
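As a rough illustration of the proportion above, here is a minimal sketch in Python. The function name `depth_from_height`, the default average height of 1700 mm, and the example focal length are all assumptions made for illustration; none of them come from the repository.

```python
def depth_from_height(focal_length_px, height_in_pixel, average_height_mm=1700.0):
    """Estimate the absolute depth (mm) from the person's height in the image.

    Rearranges focal_length : abs_depth = height_in_pixel : average_height
    into abs_depth = focal_length * average_height / height_in_pixel.
    The 1700 mm default is an assumed average height, not a value from the repo.
    """
    if height_in_pixel <= 0:
        raise ValueError("height_in_pixel must be positive")
    return focal_length_px * average_height_mm / height_in_pixel


# Hypothetical numbers: focal length of 1500 px, person spanning 600 px.
abs_depth = depth_from_height(focal_length_px=1500.0, height_in_pixel=600.0)
print(abs_depth)  # 4250.0 (mm)
```

The estimate is only as good as the assumed height: a crouching or partly cropped person looks shorter in pixels and is therefore placed too far away.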

In your code, it looks like you use the area and add 500. The values may have been determined by experiment, but could you tell me the basic idea behind this?

ibaiGorordo commented 2 years ago

Hi, if I remember correctly, the original paper used the height as a reference to estimate the depth, as you mentioned. I think they assumed the height and width of a person to be 2 x 2 m.

I tried that first, but the results were not good, so with a bit of trial and error I arrived at the value of 500. However, it was mainly for visual representation and does not really represent the actual 3D pose.

So, unfortunately, I have no good justification for the values, but the offset was there to make sure a person was not very close to the camera (I added it during the different tests, so it might not be necessary).

Ibai
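For readers following the thread, the idea of a fixed real-world size plus an offset can be sketched as follows. This is not the code from the repository: it assumes a pinhole camera, takes the 2 m x 2 m box mentioned above as the real-world area, and uses a hypothetical function name (`depth_from_area`), a hypothetical `offset_mm` parameter, and made-up camera values. Under the pinhole model the box area in pixels is fx * fy * real_area / depth^2, so the depth follows by solving for it.

```python
import math


def depth_from_area(fx, fy, bbox_width_px, bbox_height_px,
                    real_area_mm2=2000.0 * 2000.0, offset_mm=0.0):
    """Estimate the absolute depth (mm) from the bounding-box area.

    Pinhole model: area_px = fx * fy * real_area / depth**2,
    hence depth = sqrt(fx * fy * real_area / area_px).
    offset_mm is a hypothetical constant that pushes the person away
    from the camera, similar in spirit to the +500 discussed above.
    """
    area_px = bbox_width_px * bbox_height_px
    if area_px <= 0:
        raise ValueError("bounding box area must be positive")
    return math.sqrt(fx * fy * real_area_mm2 / area_px) + offset_mm


# Hypothetical numbers: fx = fy = 1500 px, bounding box of 1000 x 1000 px.
print(depth_from_area(1500.0, 1500.0, 1000.0, 1000.0))                   # 3000.0
print(depth_from_area(1500.0, 1500.0, 1000.0, 1000.0, offset_mm=500.0))  # 3500.0
```

With an offset, the result is no longer a metric depth; as noted above, it is mostly useful for keeping the visualized skeleton from sitting too close to the camera.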

iwatake2222 commented 2 years ago

Thank you for answering the question.

> the offset was there to make sure a person was not very close to the camera

Very understood :grin: