jcorsetti / oryon

Official implementation of CVPR24 Highlight paper "Open-vocabulary object 6D pose estimation"

Inquiry about experiment details #4

Closed: plusgrey closed this issue 3 months ago

plusgrey commented 4 months ago

Hi Jaime,

Thanks for your great work. Could you please share how many GPUs you used for training, and what type of GPU/TPU they were?

I have one more small question: since Oryon can estimate the position of an object together with its pose, is it possible to formulate it as a tracking model? For example, the first frame of a video could serve as the anchor and the remaining frames as queries.

plusgrey commented 4 months ago

Moreover, when I run the training code, it shows a missing-file error: [image]

I also found that the dataset name on this line (https://github.com/jcorsetti/oryon/blob/81f676725ad90207c2ad2007dd843459df220395/prepare_sn6d.sh#L8) of prepare_sn6d.sh should be shapenet6d.

jcorsetti commented 4 months ago

Thanks for your interest, and sorry for the late reply. For training we used four Nvidia V100 GPUs, as detailed in Section 3.3 of the supplementary material; that section also reports the execution times. About the tracking question: yes, Oryon could definitely be applied in this context without changing the formulation, but I think that retraining on a dataset with same-scene pairs would be necessary. Thanks for pointing out the two errors; they should be fixed in the latest commits.
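
To make the anchor/query idea from the question concrete, here is a minimal sketch, not code from the repository. It assumes a pose estimator that takes an anchor frame, a query frame, and a text prompt and returns a relative pose. The names `track_with_anchor` and `estimate_relative_pose` are hypothetical, and the callable is only a stand-in for however the trained model is actually invoked.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")  # e.g. an RGB-D image; the concrete type is left open
Pose = TypeVar("Pose")    # e.g. a 4x4 transform; the concrete type is left open


def track_with_anchor(
    frames: Sequence[Frame],
    prompt: str,
    estimate_relative_pose: Callable[[Frame, Frame, str], Pose],
) -> List[Pose]:
    """Return one pose per query frame, each relative to the first (anchor) frame.

    `estimate_relative_pose` is a hypothetical stand-in for the model call;
    it is assumed to accept (anchor, query, prompt).
    """
    if len(frames) < 2:
        # No query frames, so there is nothing to track.
        return []
    anchor = frames[0]
    # Every query is matched against the same anchor, so errors do not
    # accumulate from frame to frame as they would with chained estimates.
    return [estimate_relative_pose(anchor, query, prompt) for query in frames[1:]]
```

Each pose is estimated independently against the anchor, so no temporal smoothing is applied. As noted above, this would probably only work well after retraining on same-scene pairs.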

jcorsetti commented 3 months ago

Feel free to reopen the issue if you have more questions about this.