NVlabs / FoundationPose

[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
https://nvlabs.github.io/FoundationPose/

A question about model training #134

byran-wang closed this issue 6 months ago

byran-wang commented 6 months ago

Thank you so much for your excellent work! I have run this code successfully, and the demo is amazing.

I have some questions about model training. FoundationPose has two models: one for pose refinement and another for pose selection. In the pose estimation stage, the pose refinement model starts from 252 sampled camera viewpoints. Are these 252 sampled viewpoints also used during training? My understanding is that they are, because the refinement stage needs to supply pose hypotheses to the pose selection stage.

Looking forward to your reply very much. Thank you again for your work.

wenbowen123 commented 6 months ago

The sampled poses at test time can be arbitrary. During training, we randomly sample the poses, so we didn't keep them consistent between training and testing. That said, keeping them consistent between training and testing might help a bit, I guess.
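To make the distinction concrete, here is a minimal sketch contrasting a fixed grid of pose hypotheses with independently random ones. It is not the repository's code. The function names, the Fibonacci-lattice viewpoints, and the split of 252 into 42 viewpoints times 6 in-plane angles are all assumptions for illustration; the thread only states the total of 252 and that training poses are sampled randomly.

```python
import math
import random


def sphere_viewpoints(n):
    """Return n roughly evenly spread unit vectors (Fibonacci lattice).

    Hypothetical stand-in for whatever viewpoint sampling the real code uses.
    """
    golden = math.pi * (3.0 - math.sqrt(5.0))
    pts = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(1.0 - z * z)
        pts.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return pts


def grid_hypotheses(n_views=42, n_inplane=6):
    """Fixed grid: every viewpoint paired with every in-plane angle.

    The 42 x 6 split is an assumption chosen to reproduce the 252 total.
    """
    step = 2.0 * math.pi / n_inplane
    return [(v, k * step)
            for v in sphere_viewpoints(n_views)
            for k in range(n_inplane)]


def random_hypotheses(n, rng):
    """Random sampling: independent random viewpoint and in-plane angle."""
    out = []
    for _ in range(n):
        # A normalized Gaussian vector is uniformly distributed on the sphere.
        g = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in g))
        view = tuple(c / norm for c in g)
        out.append((view, rng.uniform(0.0, 2.0 * math.pi)))
    return out


grid = grid_hypotheses()
rand = random_hypotheses(len(grid), random.Random(0))
print(len(grid), len(rand))  # 252 252
```

Each hypothesis here is only a (view direction, in-plane angle) pair. Both samplers return the same number of hypotheses, but only the grid is identical from run to run, which is the kind of train/test consistency the reply suggests might help slightly.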

byran-wang commented 6 months ago

OK, I get it. Thanks.