NVlabs / Deep_Object_Pose

Deep Object Pose Estimation (DOPE) – ROS inference (CoRL 2018)

Underfitting on Training Data #379

Closed · langelo1 closed this 3 months ago

langelo1 commented 3 months ago

I've generated ~1300 views evenly sampled around my object (symmetric about one axis) and trained the model for 140 epochs with a batch size of 16 and a learning rate of 6e-5 (reduced by 10x at epochs 90 and 120). Overall performance is lackluster, although certain aspects of the object do well. Given the relatively small amount of training data, I would have expected the model to overfit. But the opposite appears to be the case, since performance on the training data itself is also poor.
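For reference, a minimal sketch of that schedule in plain PyTorch (the model here is a toy stand-in, not DOPE's network; `MultiStepLR` is just a standard way to express the two 10x drops):

```python
import torch

model = torch.nn.Linear(8, 2)  # stand-in for the actual network
optimizer = torch.optim.Adam(model.parameters(), lr=6e-5)

# Drop the learning rate by 10x at epochs 90 and 120.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[90, 120], gamma=0.1)

for epoch in range(140):
    # ... one pass over the training set goes here ...
    scheduler.step()
```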

For debugging purposes, I'm wondering whether it should be possible to overfit the model to a small amount of training data, such that it performs flawlessly on that data. For example, with 162 evenly sampled views around my object, should the model be able to "memorize" and thus overfit to this data? If so, then the fact that I can't get the model to overfit would indicate a problem with my training process (either with the annotations or with the hyperparameters).
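(A sketch of the kind of sanity check I mean, on a toy dataset and model rather than the real ones: train on a fixed handful of samples and confirm the loss drives toward zero.)

```python
import torch
from torch.utils.data import TensorDataset, Subset, DataLoader

# Toy stand-in for the real dataset and network; only the check itself matters.
dataset = TensorDataset(torch.randn(162, 8), torch.randn(162, 2))
small = Subset(dataset, list(range(16)))   # a fixed handful of samples
loader = DataLoader(small, batch_size=16, shuffle=True)

model = torch.nn.Sequential(
    torch.nn.Linear(8, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(2000):                  # deliberately long: we want memorization
    for x, y in loader:
        loss = torch.nn.functional.mse_loss(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

print(loss.item())  # a healthy pipeline should drive this toward ~0
```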

I'm also wondering about performance relative to CenterPose: for pose estimation of a single known object, will DOPE perform better or worse? The category-level flexibility of CenterPose is nice, but I only care about a single object.

TontonTremblay commented 3 months ago

Can you share some examples of your data, and can you show us some belief maps during training? In my testing, DOPE performs better than CenterPose for a single model.
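(A minimal sketch of how one could dump belief maps for inspection; the output structure assumed here is a placeholder, not the repo's actual API, beyond DOPE's convention of 9 belief maps per image: 8 cuboid corners plus the centroid.)

```python
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt
import torch

def save_belief_maps(belief, path_prefix):
    """Save each belief-map channel of one image as a heatmap PNG.

    belief: tensor of shape (num_maps, H, W).
    """
    belief = belief.detach().cpu()
    for k in range(belief.shape[0]):
        plt.imsave(f"{path_prefix}_map{k:02d}.png",
                   belief[k].numpy(), cmap="hot")

# Demo with random maps (9 channels: 8 corners + centroid):
save_belief_maps(torch.rand(9, 50, 50), "demo")

# Hypothetical use inside a training loop:
# beliefs, affinities = model(images)   # placeholder output structure
# save_belief_maps(beliefs[0], f"debug/epoch{epoch:03d}")
```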

langelo1 commented 3 months ago

Thanks for the reply. I don't have the data online at the moment, but it's ray-traced imagery of a single object rendered in Blender with a few different HDRI environment maps. Procedural noise is added to the textures, and the metallic/roughness values are varied over a wide range. I was mainly curious whether there is an expected level of fit on a small training set. I wouldn't expect good performance on unseen images in that case, but if I can achieve the expected performance on a small training set, then I'd be confident that scaling up is the next step, rather than further debugging the current pipeline.
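(A rough sketch of that randomization in Blender's Python API; this is not my actual generation script, and the HDRI path, object name, and node names are placeholders for whatever the scene actually uses:)

```python
import random
import bpy

# Point the world at an HDRI environment map.
world = bpy.context.scene.world
world.use_nodes = True
env = world.node_tree.nodes.new("ShaderNodeTexEnvironment")
env.image = bpy.data.images.load("/path/to/hdris/studio_01.exr")
background = world.node_tree.nodes["Background"]  # default world output node
world.node_tree.links.new(env.outputs["Color"], background.inputs["Color"])

# Randomize material properties over a wide range.
mat = bpy.data.objects["my_object"].active_material
bsdf = mat.node_tree.nodes["Principled BSDF"]     # default node name
bsdf.inputs["Metallic"].default_value = random.uniform(0.0, 1.0)
bsdf.inputs["Roughness"].default_value = random.uniform(0.05, 0.95)
```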

TontonTremblay commented 3 months ago

Might just need more training! It takes a while to train tbh.

langelo1 commented 3 months ago

Appreciate the support, thanks!

TontonTremblay commented 3 months ago

You could test on a single image to see the results; even a single image takes something like an hour on a 3090.

langelo1 commented 3 months ago

More training helped; I underestimated how much was required. Thanks!