NVlabs / FoundationPose

[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
https://nvlabs.github.io/FoundationPose/

Training Data - How to use? #99

Closed lashout314 closed 21 hours ago

lashout314 commented 2 weeks ago

@wenbowen123 Thanks for your great work! I'm trying to fine-tune the Refinement and Selection models to use RGB-only input, using the training data you provided.

It appears the training dataset consists of pairs of renders of the same scene, each with a bbox, intrinsics K, distance to the image plane, an instance segmentation, and an RGB render. Do you use one image of the pair as the rendering and the other as the observed input for pose refinement training?

It would be great if you could describe how the training dataset you provide is used.
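For reference, this is roughly how I am reading one scene at the moment. The file names and metadata keys below are my guesses from browsing the dataset, so they may not match the released layout exactly:

```python
import json
import numpy as np
import cv2

def load_scene(scene_dir, frame_id="0"):
    """Load one rendered frame: RGB, instance mask, and camera metadata.

    File names and keys are assumptions about the dataset layout,
    not guaranteed to match the released training data exactly.
    """
    rgb = cv2.imread(f"{scene_dir}/rgb/{frame_id}.png")[..., ::-1]        # BGR -> RGB
    mask = cv2.imread(f"{scene_dir}/mask/{frame_id}.png", cv2.IMREAD_GRAYSCALE)
    with open(f"{scene_dir}/meta/{frame_id}.json") as f:
        meta = json.load(f)
    K = np.array(meta["K"]).reshape(3, 3)    # camera intrinsics
    bbox = np.array(meta["bbox"])            # 2D bounding box of the object
    distance = float(meta["distance"])       # distance to the image plane
    return rgb, mask, K, bbox, distance
```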

wenbowen123 commented 1 week ago

Hi, each scene is independent. To make the training pairs, you need to implement this yourself by following the paper. We do not support the training part in the repo at the moment.
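For anyone else reading this, a minimal sketch of what making such a pair for the refiner could look like, based on the paper's description (perturb the ground-truth pose, render the object at the perturbed pose, and pair that render with the provided observation). The `render_rgb` callable and the perturbation ranges below are placeholders, not the repo's actual training pipeline:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def make_refiner_pair(rgb_obs, K, pose_gt, mesh, render_rgb,
                      rot_deg=15.0, trans_frac=0.1):
    """Build one (rendered, observed, target-delta) training sample.

    pose_gt    : 4x4 ground-truth object-to-camera pose from the scene metadata.
    render_rgb : any renderer callable (mesh, K, pose) -> HxWx3 image; this is a
                 placeholder, not an API that ships with the repo.
    The perturbation ranges are illustrative, not the values used in the paper.
    """
    # Sample a random rotation/translation perturbation around the ground truth.
    axis = np.random.randn(3)
    axis /= np.linalg.norm(axis)
    angle = np.deg2rad(np.random.uniform(-rot_deg, rot_deg))
    dR = R.from_rotvec(axis * angle).as_matrix()
    dt = np.random.uniform(-trans_frac, trans_frac, 3) * pose_gt[2, 3]  # scale by depth

    pose_in = pose_gt.copy()
    pose_in[:3, :3] = dR @ pose_gt[:3, :3]
    pose_in[:3, 3] += dt

    # Render the object at the perturbed pose; the provided scene render is the observation.
    rgb_ren = render_rgb(mesh, K, pose_in)

    # The refiner is trained to predict the update that maps pose_in back to pose_gt.
    delta = pose_gt @ np.linalg.inv(pose_in)
    return rgb_ren, rgb_obs, pose_in, delta
```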