Open ngoductuanlhp opened 3 months ago
Hi @y-zheng18, thank you for the great work!

When extracting the tracking annotations, as far as I understand, you subsample a set of vertices on the scene and object meshes and project them into each frame to obtain a set of 2D/3D tracking annotations.

Would it be possible to generate dense tracking annotations, i.e. one for each pixel of the first frame of the video? A naive approach would be to find the corresponding mesh vertex for each pixel of the first frame, but I'm not sure whether there is a better way to do that.

Thank you.

If you want to generate dense tracking annotations, one naive method is to take the depth map from frame i, unproject it to get a 3D point cloud, and reproject those 3D points into another frame j, finding correspondences by comparing the projected depth values against the depth map of frame j. However, this only works for static objects. So I think the cleaner way to do this is to render an ID map in which each pixel value is a mesh face ID; pixels in different frames that share the same ID are then in correspondence.
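For concreteness, here is a minimal sketch of the depth-based reprojection check described above. It is not this repo's code: it assumes pinhole intrinsics `K` (3x3), world-to-camera extrinsics with the convention `X_cam = R @ X_world + t`, metric z-depth maps with valid (positive) depth everywhere, and all function and parameter names are placeholders.

```python
import numpy as np

def unproject(depth_i, K):
    """Lift every pixel of frame i to a 3D point in camera-i coordinates."""
    h, w = depth_i.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(float)
    rays = pix @ np.linalg.inv(K).T           # K^-1 [u, v, 1]^T per pixel
    return rays * depth_i.reshape(-1, 1)      # scale each ray by its depth

def project(points_world, R, t, K):
    """Project world-space points into a camera with X_cam = R @ X_world + t."""
    points_cam = points_world @ R.T + t
    z = points_cam[:, 2]
    uv = (points_cam @ K.T)[:, :2] / z[:, None]
    return uv, z

def dense_correspondences(depth_i, depth_j, R_i, t_i, R_j, t_j, K, tol=0.01):
    """For each pixel of frame i, return its matching pixel in frame j, or (-1, -1)."""
    h, w = depth_i.shape
    pts_cam_i = unproject(depth_i, K)
    pts_world = (pts_cam_i - t_i) @ R_i       # invert X_cam = R @ X_world + t
    uv_j, z_j = project(pts_world, R_j, t_j, K)
    u = np.round(uv_j[:, 0]).astype(int)
    v = np.round(uv_j[:, 1]).astype(int)
    valid = (u >= 0) & (u < w) & (v >= 0) & (v < h) & (z_j > 0)
    # Keep a match only if the reprojected depth agrees with frame j's depth map;
    # this rejects points occluded in frame j, and assumes the scene is static.
    agree = np.zeros_like(valid)
    agree[valid] = np.abs(depth_j[v[valid], u[valid]] - z_j[valid]) < tol
    match = np.full((h * w, 2), -1, dtype=int)
    match[agree, 0] = u[agree]
    match[agree, 1] = v[agree]
    return match.reshape(h, w, 2)
```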
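And a sketch of the face-ID idea, again with placeholder names and not necessarily how this repo would render it: cast one ray per pixel (here via trimesh's ray casting) against the mesh posed for that frame, and record the index of the first face hit as that pixel's ID.

```python
import numpy as np
import trimesh

def face_id_map(mesh, K, R, t, h, w):
    """Render an (h, w) map of mesh face indices for a camera with X_cam = R @ X_world + t."""
    # One ray per pixel, expressed in world coordinates.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(float)
    dirs_world = (pix @ np.linalg.inv(K).T) @ R   # R^T (K^-1 [u, v, 1]^T)
    center = -t @ R                               # camera center -R^T t
    origins = np.tile(center, (len(dirs_world), 1))

    # Index of the first face each ray hits (-1 where the ray misses the mesh).
    face_ids = mesh.ray.intersects_first(ray_origins=origins,
                                         ray_directions=dirs_world)
    return face_ids.reshape(h, w)
```

Pixels in frames i and j then correspond when their (non-negative) face IDs match, and because the mesh is posed per frame this also covers moving objects. Casting one ray per pixel is slow, so in practice one would probably rasterize the face index into an integer buffer with a renderer instead, and store barycentric coordinates of the hit point if sub-face accuracy is needed.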