NVlabs / Deep_Object_Pose

Deep Object Pose Estimation (DOPE) – ROS inference (CoRL 2018)

Regarding generating single inference for multiple objects #285

Open ArghyaChatterjee opened 1 year ago

ArghyaChatterjee commented 1 year ago

Hello,

I am trying to produce a single inference (weight) file that covers multiple objects, but I couldn't find anything about this in the repo's documentation. Is that possible? Or do I have to train a separate weight file for every object and then load all of them at once to detect the objects in real time?

(N.B.: I tried to run 8 models to detect 8 objects at the same time, but the network crashed with a "core dumped" error. That's why I am asking whether I can create a single weight file for all the objects I want to detect in the scene.)

TontonTremblay commented 1 year ago

Yeah, you can do that. The way to do it is to uncomment the weights you want in the config file.
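As a rough illustration of what uncommenting entries does, here is a minimal sketch. It assumes a config layout with one weight file per object under a `weights:` key; the excerpt, the paths, and the `enabled_weights` helper are hypothetical and not taken from the repo. It only shows that every uncommented entry is a separate weight file, so enabling more objects means loading more networks.

```python
# Hypothetical config excerpt (an assumption, not copied from the repo):
# one weight file per object; commented-out lines are disabled.
CONFIG_TEXT = """
weights:
  # cracker: weights/cracker.pth
  soup: weights/soup.pth
  mustard: weights/mustard.pth
"""


def enabled_weights(config_text):
    """Return {object_name: weight_path} for uncommented entries under 'weights:'."""
    enabled = {}
    in_weights = False
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and disabled (commented-out) entries
        if line == "weights:":
            in_weights = True
            continue
        if in_weights and raw.startswith(" ") and ":" in line:
            name, path = line.split(":", 1)
            enabled[name.strip()] = path.strip()
        else:
            in_weights = False  # left the indented 'weights:' block
    return enabled


models = enabled_weights(CONFIG_TEXT)
print(f"{len(models)} separate weight files would be loaded: {sorted(models)}")
```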

ArghyaChatterjee commented 1 year ago

Hi @TontonTremblay, thanks for the reply. I think you misunderstood my question; I have already done that. What I want is a single inference (weight) file that detects and estimates the pose of every object of interest in the scene. I don't want to train separate models for separate objects and then run them all at once. Say I have 30 objects to detect in the scene for a manipulation task. I want a single weight file after training which, at inference time, gives me the labels of all 30 objects together with their estimated poses (just like YOLO).

N.B.: I tried to run 8 weight files to detect 8 objects with pose estimates, and after some time the network crashed with an "aborted (core dumped)" error.

ArghyaChatterjee commented 1 year ago

@TontonTremblay, is there a way to do that?

TontonTremblay commented 1 year ago

I will refer you to our other work, CenterPose: https://github.com/NVlabs/CenterPose. Yeah, DOPE is not designed to do multiple instances, so you probably want to look into CenterPose.

ArghyaChatterjee commented 1 year ago

Hi @TontonTremblay,

Thanks for the reply. I have three questions, and then we can probably wrap up this discussion.

  1. Have you done any quantitative analysis of how many models DOPE can run simultaneously? For example, on a PC with a 2080 Ti or 3080 Ti GPU, a Core i7 CPU, and 32 or 64 GB of RAM, how many different types of objects can DOPE detect at the same time without crashing? Even a rough idea would help.

  2. When I looked into CenterPose, I saw that it also handles one category at a time (that is, 10 weight files for 10 categories). Isn't this the same approach as DOPE? Is it efficient, given that detecting 30 labels requires 30 weight files?

  3. Also, CenterPose does not output individual category labels, so how would you know which category it has detected when running 30 weight files in real time?

Thanks in advance.

ArghyaChatterjee commented 1 year ago

@TontonTremblay can you respond please?

TontonTremblay commented 1 year ago

Sorry, I had an ICCV submission plus some RSS reviews and lost track of my emails.

  1. I did spam and mustard once, and that was years ago. I did not check the numbers, but the results came out in separate heatmaps, e.g. 9x2 heatmaps and 16x2 vector-field channels.
  2. I never tested anything like what you are describing. I think PoseCNN is more along those lines (I would not suggest testing it, though; I would be surprised if it worked).
  3. I think we could probably add a generic category for CenterPose.
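To make the channel counts in point 1 concrete, here is a minimal sketch. It assumes each object contributes 9 heatmaps and 16 vector-field channels, as the 9x2 and 16x2 figures for two objects suggest; the `output_channels` function and the constant names are hypothetical.

```python
# Assumption: per-object output sizes inferred from the 9x2 and 16x2 figures
# quoted for two objects; these are not measured values.
HEATMAPS_PER_OBJECT = 9
VECTOR_CHANNELS_PER_OBJECT = 16


def output_channels(num_objects):
    """Return (heatmap_channels, vector_field_channels) for a shared output head."""
    if num_objects < 1:
        raise ValueError("num_objects must be at least 1")
    return (
        HEATMAPS_PER_OBJECT * num_objects,
        VECTOR_CHANNELS_PER_OBJECT * num_objects,
    )


for n in (1, 2, 30):
    heatmaps, vectors = output_channels(n)
    print(f"{n} object(s): {heatmaps} heatmaps, {vectors} vector-field channels")
```

Under this assumption, the output grows linearly with the number of objects, which hints at why a shared set of weights may become harder to train as more objects are added.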

Do you have some examples of objects you would like to have processed together? I am not sure what you are describing, but I think 30 different objects in a single set of weights might be quite hard. Have you looked into MegaPose? It is a newer work of ours, and if you have 3D models it might help.