liuyuan-pal / Gen6D

[ECCV2022] Gen6D: Generalizable Model-Free 6-DoF Object Pose Estimation from RGB Images
GNU General Public License v3.0

Something about improvement #86

Open 1gjjuser1 opened 1 year ago

1gjjuser1 commented 1 year ago

Hello author, I have two questions:

  1. The paper reports that it takes about 0.64 s to process each image, which is too slow for real-time use. If I want to speed it up, where should I focus? Among knowledge distillation, quantization, pruning, TensorRT, and similar techniques, which are feasible?
  2. Apart from speed, where should I focus to improve detection accuracy?

Looking forward to your reply. Thank you.

EvdoTheo commented 1 year ago

I'm also curious about the questions you raised and about any suggestions from the author. I am trying to deploy Gen6D for real-time scenarios on custom objects, but the results so far are inadequate.

AmokraneIlhem commented 1 year ago

I'm also asking myself the same questions!

liuyuan-pal commented 1 year ago

Hi. Since detection and viewpoint selection are only used for initialization, I think the refiner matters much more for both accuracy and efficiency. I don't currently have any good ideas for further improving the refiner's efficiency or accuracy, but I think it could be a promising direction for a new paper.
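
For anyone who wants to check where the time goes before choosing between distillation, quantization, pruning, or TensorRT, here is a minimal profiling sketch. It is not Gen6D code: `detect`, `select_viewpoint`, and `refine` are hypothetical stand-ins for the three stages, and the loop assumes a tracking setup where the previous frame's pose seeds the next frame, so the detector and selector run only once. Replace the stand-ins with your own calls to the real modules.

```python
import time
from collections import defaultdict


class StageTimer:
    """Accumulates wall-clock time and call counts per named stage."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.calls = defaultdict(int)

    def run(self, name, fn, *args):
        # Time a single call to fn and record it under the stage name.
        start = time.perf_counter()
        out = fn(*args)
        self.totals[name] += time.perf_counter() - start
        self.calls[name] += 1
        return out


# Hypothetical stand-ins for the three stages (NOT the Gen6D API).
def detect(frame):
    return {"box": (0, 0, 1, 1)}


def select_viewpoint(frame, detection):
    return {"pose": "initial"}


def refine(frame, pose):
    return {"pose": "refined"}


def track(frames, timer, refine_iters=1):
    """Initialize once with detector + selector, then only refine."""
    pose = None
    for frame in frames:
        if pose is None:
            # Initialization: runs on the first frame only.
            detection = timer.run("detector", detect, frame)
            pose = timer.run("selector", select_viewpoint, frame, detection)
        # Refinement: runs on every frame, refine_iters times.
        for _ in range(refine_iters):
            pose = timer.run("refiner", refine, frame, pose)
    return pose


if __name__ == "__main__":
    timer = StageTimer()
    track(range(100), timer, refine_iters=2)
    for name in ("detector", "selector", "refiner"):
        print(name, timer.calls[name], f"{timer.totals[name]:.6f}s")
```

With real modules plugged in, the call counts make the point above concrete: over a sequence, the refiner is invoked on every frame while the other two stages run once, so the refiner is the stage where optimization effort is most likely to pay off. If the stages run on a GPU, the device would need to be synchronized before reading the clock for the timings to be meaningful.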