NVlabs / FoundationPose

[CVPR 2024 Highlight] FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects
https://nvlabs.github.io/FoundationPose/

Clarification Needed on Required Data Files for Training Similar to YCB Dataset #73

Closed Sar-thak-3 closed 4 weeks ago

Sar-thak-3 commented 4 weeks ago

Description:

I'm currently working on training a model using a dataset similar to the YCB dataset for object recognition and pose estimation tasks in computer vision. The dataset includes RGB images, depth images, CAD models of objects, and camera parameters. However, I've encountered additional h5 and pkl files that seem to be necessary for training, but I'm unsure about their contents and necessity. These files are required in PairH5Dataset.

https://github.com/NVlabs/FoundationPose/blob/8395bd84adb0bfc3209cb9409e475cd1b51fc17b/learning/datasets/h5_dataset.py#L36
https://github.com/NVlabs/FoundationPose/blob/8395bd84adb0bfc3209cb9409e475cd1b51fc17b/learning/datasets/h5_dataset.py#L53

Problem:

  1. Unclear Contents: I'm uncertain about the contents of the h5 and pkl files required for training. Specifically, what kind of data do these files contain, and how are they utilized during the training process?
  2. Necessity: I'm also unsure about the necessity of these files. Are they essential for training the model effectively, or are they optional? Understanding their role in the training pipeline would greatly help me in organizing and preparing my dataset.

wenbowen123 commented 4 weeks ago

Hi, the training part is not supported. This foundation model can be applied directly to novel objects.
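
The thread does not say what the h5 and pkl files contain. For anyone who has such files on hand, the following is a minimal sketch of how to list what they hold. It assumes the third-party h5py package is installed for the HDF5 part and that the pickle files come from a trusted source. The helper names (describe, inspect_pkl, inspect_h5, inspect_file) are hypothetical and are not part of FoundationPose; the sketch makes no claim about the layout PairH5Dataset expects.

```python
import pickle
from pathlib import Path


def describe(value):
    """One-line summary: type name plus shape or length when available."""
    shape = getattr(value, "shape", None)
    if shape is not None:
        return f"{type(value).__name__} shape={tuple(shape)}"
    if isinstance(value, (str, bytes, list, tuple, dict, set)):
        return f"{type(value).__name__} len={len(value)}"
    return type(value).__name__


def inspect_pkl(path):
    """Load a pickle file and summarize its top-level structure."""
    with open(path, "rb") as f:
        obj = pickle.load(f)  # only unpickle files from a trusted source
    if isinstance(obj, dict):
        return {str(k): describe(v) for k, v in obj.items()}
    return {"<root>": describe(obj)}


def inspect_h5(path):
    """List every group and dataset in an HDF5 file."""
    import h5py  # third-party (assumed installed); imported lazily

    summary = {}

    def visit(name, node):
        # visititems calls this once per group or dataset in the file
        if isinstance(node, h5py.Dataset):
            summary[name] = f"Dataset shape={node.shape} dtype={node.dtype}"
        else:
            summary[name] = "Group"

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return summary


def inspect_file(path):
    """Dispatch on file extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix in {".h5", ".hdf5"}:
        return inspect_h5(path)
    if suffix in {".pkl", ".pickle"}:
        return inspect_pkl(path)
    raise ValueError(f"unsupported file type: {suffix}")
```

Calling inspect_file on each file returns a dictionary mapping keys (or HDF5 paths) to short descriptions, which is usually enough to see whether a file stores image arrays, poses, or index metadata.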