Arthur151 / ROMP

Monocular, One-stage, Regression of Multiple 3D People and their 3D positions & trajectories in camera & global coordinates. ROMP[ICCV21], BEV[CVPR22], TRACE[CVPR2023]
https://www.yusun.work/
Apache License 2.0

Still learning 3D keypoints even if the person is truncated in the image? #191

Open ZhengdiYu opened 2 years ago

ZhengdiYu commented 2 years ago

In image_dataset.py, if we activate shuffle_mode, the image will be cropped. If a person is then cropped out of the region, most of their 2D keypoints fall outside the image, and these outlier 2D keypoints are updated to (-2, -2).

However, if even one 2D keypoint still remains inside the image, for example a head point:

Then the 2D keypoints of this person on the right will be updated to [[-2, -2], [-2, -2], ..., [x_head, y_head], ..., [-2, -2]], so the 2D loss will only be calculated for the head joint.

However, in the current code the 3D keypoints are not synchronized with the 2D joints: all of the 3D joints (the whole body) will still be supervised. Is there any reason behind this? Do you mean to make the network infer the 'invisible' part of the body?
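For reference, the behavior described above, plus one way the 3D loss could be synchronized with 2D visibility, might be sketched as follows. The function names and the exact sentinel handling are illustrative assumptions, not the actual image_dataset.py code; only the (-2, -2) convention comes from the discussion above.

```python
import numpy as np

INVALID = -2.0  # sentinel for out-of-crop keypoints, as described above


def crop_keypoints_2d(kp2d, crop_size):
    """Mark 2D keypoints that fall outside the crop as invalid (-2, -2)."""
    kp2d = kp2d.copy()
    h, w = crop_size
    outside = (
        (kp2d[:, 0] < 0) | (kp2d[:, 0] >= w) |
        (kp2d[:, 1] < 0) | (kp2d[:, 1] >= h)
    )
    kp2d[outside] = INVALID
    return kp2d


def masked_3d_loss(pred_kp3d, gt_kp3d, kp2d):
    """Hypothetical fix: only penalize 3D joints whose 2D counterpart
    survived the crop, instead of supervising the whole body."""
    visible = ~np.all(kp2d == INVALID, axis=1)  # (J,) boolean mask
    if not visible.any():
        return 0.0
    diff = pred_kp3d[visible] - gt_kp3d[visible]
    return float(np.mean(np.linalg.norm(diff, axis=1)))
```

With such a mask, a fully truncated person contributes no 3D loss, and the partially visible person in the example would only be supervised on the head joint, in both 2D and 3D.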

Arthur151 commented 2 years ago

Hi, Diyu, thanks for pointing this out! Honestly, I didn't mean to do this, but the mistake might be helpful for inferring the 'invisible' part of the body. I am not sure whether it affects learning/training in a good way or a bad way; I haven't run any experiments on this. Anyway, thanks!