I am wondering whether the results of the 2D detector are used when training the pose estimation model. It is a little confusing: from the paper, I understood that the 2D detector's results are used to select each object candidate. But from reading the code, it seems this happens only at inference (test) time, and training does not need a 2D object detector. Is my understanding correct?
Thank you for your help!