akashsengupta1997 / STRAPS-3DHumanShapePose

Code repository for the paper: Synthetic Training for Accurate 3D Human Pose and Shape Estimation in the Wild (BMVC 2020)

Different results when images of person taken at different distances #6

Closed · asimniazi63 closed this issue 3 years ago

asimniazi63 commented 3 years ago

I have tried reconstructing the same person photographed at various distances, and the results vary wildly. I would appreciate a brief explanation of why this happens and any suggestions for handling different camera distances with your method. Examples:

[Example renders: rend_front_avg_distance, rend_front_max_distance, rend_front_min_distance, rend_front_very_max_distance, rend_side_min_distance]

akashsengupta1997 commented 3 years ago

Hi,

Yes, the system as released was quite susceptible to changes in relative person size, because the training inputs were cropped to a bounding box around the synthetic silhouettes/joints (with a small random bbox scaling factor for augmentation).

The standard way to deal with this is to also crop any test inputs around the detected silhouette/joints before 3D prediction to mimic the training data preprocessing, then un-crop after prediction, but it seems like I forgot to implement that in the code released here 😄. I'll get around to it when I've got some time.
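A minimal sketch of that test-time preprocessing, assuming you already have a binary silhouette mask and 2D joint detections for the input image (the function and variable names here are illustrative, not part of the released code):

```python
import numpy as np
import cv2


def crop_and_resize(image, silhouette, joints2D, out_size=256, scale=1.2):
    """Crop the image around the detected silhouette/joints, then resize to
    the network input size. Returns the crop and the box needed to un-crop."""
    # Bounding box around all foreground pixels and 2D joint locations.
    ys, xs = np.where(silhouette > 0)
    xs = np.concatenate([xs, joints2D[:, 0]])
    ys = np.concatenate([ys, joints2D[:, 1]])
    x0, x1 = xs.min(), xs.max()
    y0, y1 = ys.min(), ys.max()

    # Square box with a small margin, mimicking the training-time crops.
    centre = np.array([(x0 + x1) / 2.0, (y0 + y1) / 2.0])
    half = 0.5 * scale * max(x1 - x0, y1 - y0)

    x0, x1 = int(centre[0] - half), int(centre[0] + half)
    y0, y1 = int(centre[1] - half), int(centre[1] + half)
    x0, y0 = max(x0, 0), max(y0, 0)
    x1, y1 = min(x1, image.shape[1]), min(y1, image.shape[0])

    crop = image[y0:y1, x0:x1]
    crop = cv2.resize(crop, (out_size, out_size),
                      interpolation=cv2.INTER_LINEAR)
    return crop, (x0, y0, x1, y1)
```

The returned box can then be used to map the prediction (e.g. the rendered mesh overlay) back onto the original full-resolution frame after inference.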

You could also try increasing the random bbox scaling range in data augmentation to train the network to be more robust, but the test-time solution makes more sense and will probably work better.
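For the augmentation route, the idea would simply be to widen the random scale range applied to the training bounding boxes, roughly as in the sketch below (the BBOX_SCALE_RANGE name and the uniform sampling scheme are assumptions for illustration, not the repository's actual config):

```python
import numpy as np

# Hypothetical augmentation parameter: a wider range exposes the network to
# more variation in relative person size during training.
BBOX_SCALE_RANGE = (0.8, 1.5)  # e.g. widened from a narrower range around 1.0


def random_bbox_scale(bbox_half_size, rng=np.random):
    """Randomly rescale the half-size of a (square) training crop box."""
    scale = rng.uniform(*BBOX_SCALE_RANGE)
    return bbox_half_size * scale
```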