NVlabs / FoundationPose

[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
https://nvlabs.github.io/FoundationPose/

Question regarding coordinate frame using colmap and error [WARNING] batch has 0 intersections!! #194

Open dhruvmsheth opened 1 month ago

dhruvmsheth commented 1 month ago

Hello,

Thanks for the work done here! I appreciate your contributions. I have a question about the coordinate frame expected when training a custom NeRF model with run_nerf.py. I used the script provided by Instant-NGP (https://github.com/NVlabs/instant-ngp/blob/master/scripts/colmap2nerf.py) to obtain camera pose estimates for my captured data, with the --keep_colmap_coords flag so that the poses stay in the COLMAP frame, which follows the OpenCV convention. From what I saw in other issues, FoundationPose takes poses in the OpenCV convention as input and later converts them to the OpenGL convention, so poses in the COLMAP frame should be usable for training the NeRF. However, during training I get the error [WARNING] batch has 0 intersections!!, and the resulting naive_fusion.ply contains completely erroneous geometry (attached below). Would you have any idea what the appropriate transformation from the COLMAP frame to the convention accepted by FoundationPose is? I have attached the naive_fusion.ply file below for inspection. Thank you! Let me know if any other information is required.

[screenshot of the erroneous naive_fusion.ply] naive_fusion_colmap.zip
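For reference, the OpenCV-to-OpenGL conversion discussed above only flips the camera's y and z axes of a camera-to-world pose. Below is a minimal NumPy sketch of that flip; the function name is illustrative and not part of the repo, so treat it as an assumption about what the conversion in run_nerf.py amounts to rather than its exact code:

```python
import numpy as np

# OpenCV/COLMAP cameras: +x right, +y down, +z forward.
# OpenGL cameras:        +x right, +y up,   -z forward.
# Flipping y and z maps one camera frame onto the other.
glcam_in_cvcam = np.diag([1.0, -1.0, -1.0, 1.0])

def cvcam_to_glcam(cvcam_in_world: np.ndarray) -> np.ndarray:
    """Convert a 4x4 camera-to-world pose from the OpenCV/COLMAP
    convention to the OpenGL convention."""
    return cvcam_in_world @ glcam_in_cvcam
```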

dhruvmsheth commented 1 month ago

@wenbowen123 I would greatly appreciate your insight on this. I have tried various transformations, to no avail.

dhruvmsheth commented 1 month ago

I realized that my output was already in the OpenGL format, so I changed the line that transforms the OpenCV input poses to OpenGL poses (https://github.com/NVlabs/FoundationPose/blob/ae8dd0c0108113cac2c6ea0490381de495f39b77/bundlesdf/run_nerf.py#L23) to glcam_in_obs = cam_in_obs. However, the pose estimates from Instant-NGP's colmap2nerf.py script have an arbitrary scale and do not match the object's depth in the depth maps, which causes this issue. I know a few crude ways to fix the scale, but they may not be as accurate as I would like. Do you know of a way to incorporate the depth scaling into the COLMAP pose estimates?
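One common way to recover a metric scale for COLMAP poses, not something FoundationPose itself provides, is to compare the depths implied by the sparse reconstruction against the sensor depth maps and rescale the camera translations accordingly. A minimal sketch, assuming you can already extract matched depth samples (e.g. sparse 3D points projected into frames with valid sensor depth); the function names are hypothetical:

```python
import numpy as np

def estimate_colmap_to_metric_scale(colmap_depths, sensor_depths):
    """Estimate one global scale factor mapping COLMAP units to metres,
    given per-point depths from the sparse reconstruction and the
    corresponding values read from the sensor depth maps."""
    colmap_depths = np.asarray(colmap_depths, dtype=np.float64)
    sensor_depths = np.asarray(sensor_depths, dtype=np.float64)
    valid = (colmap_depths > 0) & (sensor_depths > 0)
    # Median ratio is robust to outliers in both the sparse points
    # and the depth map readings.
    return np.median(sensor_depths[valid] / colmap_depths[valid])

def rescale_pose(cam_in_world, scale):
    """Apply the metric scale to the translation of a 4x4 camera-to-world pose."""
    cam_in_world = cam_in_world.copy()
    cam_in_world[:3, 3] *= scale
    return cam_in_world
```

Applying the estimated scale to every camera-to-world translation (and to any reconstructed points you reuse) should bring the poses into agreement with the metric depth maps.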

wenbowen123 commented 3 weeks ago

Hi, sorry for the late reply. Are your reference images RGB-only? Did you use a phone to capture them?