tjiiv-cprg / EPro-PnP

[CVPR 2022 Oral, Best Student Paper] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
https://www.youtube.com/watch?v=TonBodQ6EUU
Apache License 2.0

Training on own dataset not fully successful #15

Open krullgit opened 2 years ago

krullgit commented 2 years ago

Hi, thanks for the repo and paper, seriously good work! I trained the network on my own data. Judging from the results below, do you have an idea where to look to improve performance? ADD and ARP in particular are very low. I suspect I did something wrong in data preparation, since training on the LM data works great.

[0708 15:57:00 @eval.py:75] ------------ x -----------
[0708 15:57:00 @eval.py:76] [rot_thresh, trans_thresh: RotAcc, TraAcc, SpcAcc
[0708 15:57:00 @eval.py:77] average_accuracy[-1, -1.00]: 74.14, 80.30, 69.60
[0708 15:57:00 @eval.py:84] average_accuracy[ 2, 0.02]: 48.48, 70.71, 43.43
[0708 15:57:00 @eval.py:84] average_accuracy[ 5, 0.05]: 82.83, 83.84, 78.79
[0708 15:57:00 @eval.py:84] average_accuracy[10, 0.10]: 90.91, 91.92, 88.89

[0708 15:57:00 @eval.py:92] ---------- performance over 1 classes -----------
[0708 15:57:00 @eval.py:93] [rot_thresh, trans_thresh: RotAcc, TraAcc, SpcAcc
[0708 15:57:00 @eval.py:95] average_accuracy[-1, -1.00]: 74.14, 80.30, 69.60
[0708 15:57:00 @eval.py:104] average_accuracy[ 2, 0.02]: 48.48, 70.71, 43.43
[0708 15:57:00 @eval.py:104] average_accuracy[ 5, 0.05]: 82.83, 83.84, 78.79
[0708 15:57:00 @eval.py:104] average_accuracy[10, 0.10]: 90.91, 91.92, 88.89

[0708 15:57:00 @eval.py:120]

Lakonik commented 2 years ago

Hi! Thank you for your interest in our work.

I think it is possibly related to data preparation, OR, perhaps your data is vastly different from LineMOD so the hyperparameters or evaluation metrics might not work well. For example, if your objects have a much larger scale or are very distant, then the current evaluation metrics might be too strict.
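To make the point about strictness concrete, here is a minimal sketch of how pose errors of this kind are commonly computed. It is not the repository's eval.py; the function names are hypothetical, and it assumes the thresholds in the log are degrees for rotation and meters for translation. With fixed absolute thresholds, the same relative error is penalized more for a large or distant object, whereas ADD is usually compared against a fraction of the object diameter.

```python
import numpy as np

def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle (degrees) between two 3x3 rotation matrices."""
    cos = (np.trace(R_pred @ R_gt.T) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def translation_error(t_pred, t_gt):
    """Euclidean distance between translations (same unit as the inputs)."""
    return float(np.linalg.norm(np.asarray(t_pred) - np.asarray(t_gt)))

def within_thresholds(R_pred, t_pred, R_gt, t_gt, rot_thresh_deg, trans_thresh):
    """True if both rotation and translation errors are under the thresholds."""
    return bool(
        rotation_error_deg(R_pred, R_gt) <= rot_thresh_deg
        and translation_error(t_pred, t_gt) <= trans_thresh
    )

def add_error(model_points, R_pred, t_pred, R_gt, t_gt):
    """ADD: mean distance between model points under the two poses.

    model_points is an (N, 3) array of points in the object frame.
    """
    pred = model_points @ R_pred.T + t_pred
    gt = model_points @ R_gt.T + t_gt
    return float(np.linalg.norm(pred - gt, axis=1).mean())

def add_is_correct(add, diameter, frac=0.1):
    """Common convention: a pose counts as correct if ADD < frac * diameter."""
    return add < frac * diameter
```

For a custom dataset it may be worth checking that the model points, translations, and diameter are all in the same unit, since a millimeter/meter mismatch alone can push these metrics to near zero.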

krullgit commented 2 years ago

Thanks! I found a problem in my coord files and ADD looks good now. However, ARP_2d is still essentially zero (threshold=20, correct poses: 1.0, all poses: 99.0, accuracy: 1.01). Could it have something to do with the norm_factor, which doesn't go up? Here is how it looks right now (orange is LM training and grey is mine): [Screenshot from 2022-07-09 15-03-49]

Lakonik commented 2 years ago

I think this norm_factor is OK. It's a rough measure of how concentrated the predicted pose distributions are, so a lower norm_factor indicates higher uncertainty, possibly because your dataset is more challenging.

petrock99 commented 1 year ago

@krullgit, how did you generate your own dataset? Specifically, how did you generate the coor.pkl files that the algorithm uses as the ground-truth for each object? Or did you distill it down somehow to what the algorithm actually needs? I'm having trouble generating them for my own dataset. Thanks.

krullgit commented 1 year ago

The coor pkl files contain the point cloud of the object itself :) So assuming you have an image with the corresponding per-pixel depth and a segmentation map, you first mask the depth map with the segmentation map. Now you have the depth only where the object is. Next, you normalize all the remaining 3D points (x, y, z) by the center of the object. In other words, you subtract the center point from all other 3D points. Now every point of the object is given relative to the center of that object. That's the pkl file. Hope that helps.
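The steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the repository's data-preparation code: the function names and the pickle layout are hypothetical, the camera is assumed to be a pinhole model with a known 3x3 intrinsic matrix K, and the mean of the masked points is used as the object center when none is given. Whether the training code additionally expects the points rotated into the object's own frame is something to verify against the LineMOD files that ship with the repo.

```python
import pickle
import numpy as np

def object_coords_from_depth(depth, mask, K, center=None):
    """Back-project masked depth pixels to 3D, relative to the object center.

    depth:  (H, W) array, depth along the camera z-axis
    mask:   (H, W) array, nonzero where the object is visible
    K:      (3, 3) pinhole intrinsics (assumed known)
    center: (3,) object center in the camera frame; if None, the mean
            of the masked 3D points is used (an assumption of this sketch)
    Returns an (H, W, 3) coordinate map and the (H, W) validity mask.
    """
    valid = mask.astype(bool) & (depth > 0)
    v, u = np.nonzero(valid)              # pixel rows and columns
    z = depth[v, u]
    x = (u - K[0, 2]) * z / K[0, 0]       # back-projection with fx, cx
    y = (v - K[1, 2]) * z / K[1, 1]       # back-projection with fy, cy
    points = np.stack([x, y, z], axis=1)
    if center is None:
        center = points.mean(axis=0)
    coords = np.zeros(depth.shape + (3,), dtype=np.float32)
    coords[v, u] = points - center        # subtract the center point
    return coords, valid

def save_coords(path, coords):
    """Write the coordinate map to a pickle file (hypothetical layout)."""
    with open(path, "wb") as f:
        pickle.dump(coords, f)
```

Comparing the shape, dtype, and value range of the result with one of the existing LM coor files is a quick sanity check before training.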

guan2000910 commented 1 year ago

@krullgit, hello. I would like to run this project's code on an industrial dataset. Can you share any thoughts on getting a custom dataset to run successfully? And how exactly did you implement the generation of the pkl file? Thank you!