y-zheng18 / point_odyssey

Official code for PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking (ICCV 2023)

About extracting dense tracking #15

Open ngoductuanlhp opened 1 month ago

ngoductuanlhp commented 1 month ago

Hi @y-zheng18, thank you for the great work!

When extracting the tracking annotations, as far as I understand, you subsample a set of vertices from the scene and object meshes, then project them into each frame to obtain the set of 2D/3D tracking annotations.
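(For concreteness, a minimal sketch of that projection step, assuming pinhole intrinsics `K` and a per-frame world-to-camera matrix `w2c`; these are illustrative names, not this repo's actual export code:)

```python
import numpy as np

def project_vertices(verts_world, K, w2c):
    # verts_world: (N, 3) vertices subsampled from the scene/object meshes
    # K:           (3, 3) pinhole intrinsics (assumed)
    # w2c:         (4, 4) world-to-camera extrinsics for one frame (assumed)
    verts_h = np.concatenate([verts_world, np.ones((len(verts_world), 1))], axis=1)
    verts_cam = (w2c @ verts_h.T).T[:, :3]   # world -> camera space
    proj = (K @ verts_cam.T).T               # camera -> image plane
    depth = proj[:, 2:3]
    uv = proj[:, :2] / depth                 # (N, 2) pixel coordinates
    return uv, depth.squeeze(1)              # 2D tracks + per-point depth
```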

Would it be possible to generate dense tracking annotations for every pixel of the first frame of the video? A naive approach would be to find the corresponding mesh vertex for each pixel of the first frame, but I'm not sure whether there is a better way to do it.

Thank you.

y-zheng18 commented 1 month ago

If you want to generate dense tracking annotations, a naive method is to take the depth map from frame i, unproject it into a 3D point cloud, and reproject those 3D points into another frame j, finding correspondences by comparing the reprojected depth values against the depth map of frame j. But this only works for static objects. So I think the decent way to do this is to render an ID map, where each pixel value is a mesh face ID. That way, you can find the correspondence between pixels in different frames whenever they share the same face ID.
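A rough sketch of that unproject/reproject check, assuming the depth maps store camera-space z and using pinhole intrinsics `K` and 4x4 poses `c2w_i` / `w2c_j` (all illustrative names, not code from this repo):

```python
import numpy as np

def reproject_static(depth_i, depth_j, K, c2w_i, w2c_j, tol=0.05):
    # depth_i, depth_j: (H, W) metric depth maps for frames i and j
    # K: (3, 3) intrinsics; c2w_i, w2c_j: (4, 4) camera poses (assumed names)
    H, W = depth_i.shape
    v, u = np.meshgrid(np.arange(H), np.arange(W), indexing='ij')
    pix = np.stack([u, v, np.ones_like(u)]).reshape(3, -1).astype(np.float64)
    # unproject frame i pixels to 3D, then lift to world coordinates
    pts_cam_i = np.linalg.inv(K) @ pix * depth_i.reshape(1, -1)
    pts_world = c2w_i[:3, :3] @ pts_cam_i + c2w_i[:3, 3:4]
    # reproject the world points into frame j
    pts_cam_j = w2c_j[:3, :3] @ pts_world + w2c_j[:3, 3:4]
    proj = K @ pts_cam_j
    z_j = proj[2].reshape(H, W)
    uv_j = (proj[:2] / proj[2]).reshape(2, H, W)
    # visibility: the reprojected depth must agree with frame j's depth map
    uj = np.clip(np.round(uv_j[0]).astype(int), 0, W - 1)
    vj = np.clip(np.round(uv_j[1]).astype(int), 0, H - 1)
    in_bounds = (uv_j[0] >= 0) & (uv_j[0] < W) & (uv_j[1] >= 0) & (uv_j[1] < H)
    visible = in_bounds & (z_j > 0) & (np.abs(depth_j[vj, uj] - z_j) < tol)
    return uv_j.transpose(1, 2, 0), visible  # (H, W, 2) coords + (H, W) mask
```

And a minimal sketch of the face-ID matching, assuming the per-pixel ID maps `id_i` / `id_j` have already been rendered (e.g. via a renderer's face/object index pass); it simply pairs pixels that see the same mesh face:

```python
import numpy as np

def match_by_face_id(id_i, id_j, background=0):
    # id_i, id_j: (H, W) integer images where each pixel stores the mesh
    # face ID it sees (assumed pre-rendered); `background` marks empty pixels
    shared = np.intersect1d(id_i[id_i != background], id_j[id_j != background])
    matches = []
    for fid in shared:
        # take one representative pixel per face in each frame
        yi, xi = np.argwhere(id_i == fid)[0]
        yj, xj = np.argwhere(id_j == fid)[0]
        matches.append(((xi, yi), (xj, yj)))
    return matches  # list of ((u_i, v_i), (u_j, v_j)) correspondences
```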