3d_coords to pixel_coords transformation

rohitgirdhar / CATER

CATER: A diagnostic dataset for Compositional Actions and TEmporal Reasoning

Apache License 2.0

Thanks for your interest. Unfortunately, to my understanding, this is not trivial. We were able to do it for the first frame in the static-camera setup when initializing the tracker baseline: we manually computed the homography between the ground plane and the camera plane, and used it to transform the ground/world 3D coordinates to 2D coordinates in the camera plane. However, this only applies when the object is on the ground plane, not floating in the air. Moreover, a moving camera would further complicate things.
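For readers who want to try the ground-plane approach, here is a minimal sketch of how such a homography could be fitted and applied. It is not the code used for the tracker baseline: the function names `fit_homography` and `ground_to_pixel` are hypothetical, and it assumes you have at least four hand-labelled correspondences between ground-plane (x, y) coordinates and pixel (u, v) coordinates from a static-camera frame.

```python
import numpy as np


def fit_homography(ground_pts, pixel_pts):
    """Estimate a 3x3 homography from >= 4 ground (x, y) -> pixel (u, v) pairs.

    Uses the standard direct linear transform (DLT); no outlier handling.
    """
    ground_pts = np.asarray(ground_pts, dtype=float)
    pixel_pts = np.asarray(pixel_pts, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(ground_pts, pixel_pts):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The homography is the null vector of the stacked constraint matrix.
    _, _, vt = np.linalg.svd(np.array(rows))
    H = vt[-1].reshape(3, 3)
    # Normalize the scale; assumes H[2, 2] is non-zero (true for typical setups).
    return H / H[2, 2]


def ground_to_pixel(H, xy):
    """Map (N, 2) ground-plane points to (N, 2) pixel coordinates."""
    xy = np.asarray(xy, dtype=float).reshape(-1, 2)
    homog = np.hstack([xy, np.ones((xy.shape[0], 1))])  # homogeneous coords
    proj = homog @ H.T
    return proj[:, :2] / proj[:, 2:3]  # divide out the projective scale
```

As noted above, this mapping is only valid for points that lie on the ground plane (z = 0 in world coordinates) and for a fixed camera; objects in the air or frames from the moving-camera setup would need a full camera projection model instead.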

I think the most robust way to solve the problem would be to re-render the data and store some sort of segmentation map for the objects, which should help with accurate localization of each object in the image plane.
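If such maps were available, recovering pixel locations from them would be straightforward. The sketch below assumes a hypothetical format that the released data does not include: a 2D integer array per frame in which each pixel stores an object id and 0 marks the background. The function name `locate_objects` is also made up for illustration.

```python
import numpy as np


def locate_objects(seg_map, background_id=0):
    """Return the pixel centroid and bounding box of each object id in seg_map."""
    seg_map = np.asarray(seg_map)
    locations = {}
    for obj_id in np.unique(seg_map):
        if obj_id == background_id:
            continue
        # Row indices are image y, column indices are image x.
        rows, cols = np.nonzero(seg_map == obj_id)
        locations[int(obj_id)] = {
            "centroid_xy": (float(cols.mean()), float(rows.mean())),
            "bbox_xyxy": (int(cols.min()), int(rows.min()),
                          int(cols.max()), int(rows.max())),
        }
    return locations
```

Because the localization comes directly from rendered pixels, this would work regardless of whether the object is on the ground or the camera is moving, although fully occluded objects would simply be missing from the result.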

3d_coords to pixel_coords transformation #9