Closed: Xuehao-Gao closed this issue 1 year ago
Hi,
The 2D-to-3D motion prediction task was introduced by [1]. The authors formulated the problem so that future 3D poses are predicted in the camera coordinate frame of the current (last observed) frame.
Thus, the visualizations you see in Figures 4 & 5 are in the camera frame, while the visualization in Figure 1 is in the world frame (and that one is more of an illustration than a result). The camera parameters can be found in the original dataset files for both GTA-IM and PROX in case you need to do the 3D visualization in the world frame. Otherwise, the visualization script in our repo lets you visualize directly in the camera frame, producing the same figures as Figures 4 & 5.
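To make the camera-frame/world-frame distinction concrete, here is a minimal sketch of how predicted joints could be mapped into the world frame once the dataset's camera extrinsics are loaded. This is not code from the repo: the function name and the extrinsics convention (world-to-camera, i.e. `p_cam = R @ p_world + t`, which is common but should be checked against the GTA-IM/PROX files) are assumptions.

```python
import numpy as np

def camera_to_world(joints_cam, R, t):
    """Map 3D joints from camera coordinates to world coordinates.

    Assumes the extrinsics (R, t) map world -> camera,
    i.e. p_cam = R @ p_world + t, so the inverse is
    p_world = R.T @ (p_cam - t).

    joints_cam: (J, 3) array of joints in the camera frame.
    R: (3, 3) rotation matrix; t: (3,) translation vector.
    """
    # Row-vector form: (p - t) @ R is equivalent to R.T @ (p - t)
    return (joints_cam - t) @ R

# Toy example: identity rotation, camera 2 m "behind" the origin.
R = np.eye(3)
t = np.array([0.0, 0.0, 2.0])
cam = np.array([[0.0, 0.0, 2.0]])   # a point 2 m in front of the camera
world = camera_to_world(cam, R, t)  # -> [[0., 0., 0.]]
```

If the dataset instead stores camera-to-world extrinsics, the forward transform `joints_cam @ R.T + t` applies and no inversion is needed.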
[1] Cao, Zhe, Hang Gao, Karttikeya Mangalam, Qi-Zhi Cai, Minh Vo, and Jitendra Malik. "Long-term human motion prediction with scene context." In European Conference on Computer Vision, pp. 387-404. Springer, Cham, 2020.
Hi,
I saw that you predict the future 3D human poses in the current frame's camera coordinates. My question is: the predicted future 3D motion results shown in your paper appear to be in world coordinates, so an additional transformation (the future camera positions) is needed to convert between the two coordinate systems. In practice, this transformation matrix / camera position information is not available to your method. How, then, can your method predict future human motion in the 3D scene?