microsoft / singleshotpose

This research project implements a real-time object detection and pose estimation method as described in the paper, Tekin et al. "Real-Time Seamless Single Shot 6D Object Pose Prediction", CVPR 2018. (https://arxiv.org/abs/1711.08848).
MIT License

Problems encountered when running train_multi.py on custom datasets #174

Open xiaoxiongmaoxuexi opened 2 years ago

xiaoxiongmaoxuexi commented 2 years ago

Hello, I have a problem. When I train multi-object detection on my own dataset, the accuracy reported during training is 0%, even though the object itself is recognized normally. Do you know why? Is the same label format used for single-object and multi-object detection? The output below appears during training. Training on the official dataset is normal. Thank you for your answer.

2022-03-15 15:02:37 Testing cup...
2022-03-15 15:02:37 Number of test samples: 895
2022-03-15 15:03:06 Acc using 5 px 2D Projection = 0.00%
2022-03-15 15:03:06 Acc using 10 px 2D Projection = 0.00%
2022-03-15 15:03:06 Acc using 15 px 2D Projection = 0.00%
2022-03-15 15:03:06 Acc using 20 px 2D Projection = 0.00%
2022-03-15 15:03:06 Acc using 25 px 2D Projection = 0.00%
2022-03-15 15:03:06 Acc using 30 px 2D Projection = 0.00%
2022-03-15 15:03:06 Acc using 35 px 2D Projection = 0.00%
2022-03-15 15:03:06 Acc using 40 px 2D Projection = 0.00%
2022-03-15 15:03:06 Acc using 45 px 2D Projection = 0.67%
2022-03-15 15:03:06 Acc using 50 px 2D Projection = 1.12%
2022-03-15 15:03:06 -----------------------------------
2022-03-15 15:03:06 tensor to cuda : 0.000142
2022-03-15 15:03:06 predict : 0.003501
2022-03-15 15:03:06 get_region_boxes : 0.026212
2022-03-15 15:03:06 eval : 0.001130
2022-03-15 15:03:06 total : 0.030986
2022-03-15 15:03:06 -----------------------------------
2022-03-15 15:03:06 Testing sugar...
2022-03-15 15:03:06 Number of test samples: 480
2022-03-15 15:03:18 Acc using 5 px 2D Projection = 0.00%
2022-03-15 15:03:18 Acc using 10 px 2D Projection = 0.00%
2022-03-15 15:03:18 Acc using 15 px 2D Projection = 0.00%
2022-03-15 15:03:18 Acc using 20 px 2D Projection = 0.00%
2022-03-15 15:03:18 Acc using 25 px 2D Projection = 0.00%
2022-03-15 15:03:18 Acc using 30 px 2D Projection = 0.00%
2022-03-15 15:03:18 Acc using 35 px 2D Projection = 0.42%
2022-03-15 15:03:18 Acc using 40 px 2D Projection = 2.92%
2022-03-15 15:03:18 Acc using 45 px 2D Projection = 19.58%
2022-03-15 15:03:18 Acc using 50 px 2D Projection = 32.50%
2022-03-15 15:03:18 -----------------------------------
2022-03-15 15:03:18 tensor to cuda : 0.000140
2022-03-15 15:03:18 predict : 0.003520
2022-03-15 15:03:18 get_region_boxes : 0.019432
2022-03-15 15:03:18 eval : 0.001209
2022-03-15 15:03:18 total : 0.024300
2022-03-15 15:03:18 -----------------------------------
2022-03-15 15:03:18 Testing driller...
2022-03-15 15:03:19 Number of test samples: 1009
2022-03-15 15:03:51 Acc using 5 px 2D Projection = 0.00%
2022-03-15 15:03:51 Acc using 10 px 2D Projection = 0.00%
2022-03-15 15:03:51 Acc using 15 px 2D Projection = 0.00%
2022-03-15 15:03:51 Acc using 20 px 2D Projection = 0.10%
2022-03-15 15:03:51 Acc using 25 px 2D Projection = 0.20%
2022-03-15 15:03:51 Acc using 30 px 2D Projection = 1.09%
2022-03-15 15:03:51 Acc using 35 px 2D Projection = 3.57%
2022-03-15 15:03:51 Acc using 40 px 2D Projection = 10.70%
2022-03-15 15:03:51 Acc using 45 px 2D Projection = 21.21%
2022-03-15 15:03:51 Acc using 50 px 2D Projection = 36.17%
2022-03-15 15:03:51 -----------------------------------
2022-03-15 15:03:51 tensor to cuda : 0.000172
2022-03-15 15:03:51 predict : 0.005028
2022-03-15 15:03:51 get_region_boxes : 0.020104
2022-03-15 15:03:51 eval : 0.003032
2022-03-15 15:03:51 total : 0.028337
2022-03-15 15:03:51 -----------------------------------
2022-03-15 15:03:51 Testing duck...
2022-03-15 15:03:52 Number of test samples: 1065
2022-03-15 15:04:35 Acc using 5 px 2D Projection = 0.00%
2022-03-15 15:04:35 Acc using 10 px 2D Projection = 0.00%
2022-03-15 15:04:35 Acc using 15 px 2D Projection = 0.19%
2022-03-15 15:04:35 Acc using 20 px 2D Projection = 1.22%
2022-03-15 15:04:35 Acc using 25 px 2D Projection = 6.48%
2022-03-15 15:04:35 Acc using 30 px 2D Projection = 12.39%
2022-03-15 15:04:35 Acc using 35 px 2D Projection = 14.84%
2022-03-15 15:04:35 Acc using 40 px 2D Projection = 15.87%
2022-03-15 15:04:35 Acc using 45 px 2D Projection = 15.96%
2022-03-15 15:04:35 Acc using 50 px 2D Projection = 16.15%
2022-03-15 15:04:35 -----------------------------------
2022-03-15 15:04:35 tensor to cuda : 0.000231
2022-03-15 15:04:35 predict : 0.005087
2022-03-15 15:04:35 get_region_boxes : 0.032968
2022-03-15 15:04:35 eval : 0.002247
2022-03-15 15:04:35 total : 0.040533
2022-03-15 15:04:35 -----------------------------------
2022-03-15 15:04:35 Testing glue...
2022-03-15 15:04:35 Number of test samples: 1036
2022-03-15 15:05:07 Acc using 5 px 2D Projection = 0.00%
2022-03-15 15:05:07 Acc using 10 px 2D Projection = 0.00%
2022-03-15 15:05:07 Acc using 15 px 2D Projection = 0.10%
2022-03-15 15:05:07 Acc using 20 px 2D Projection = 1.25%
2022-03-15 15:05:07 Acc using 25 px 2D Projection = 4.83%
2022-03-15 15:05:07 Acc using 30 px 2D Projection = 9.27%
2022-03-15 15:05:07 Acc using 35 px 2D Projection = 14.38%
2022-03-15 15:05:07 Acc using 40 px 2D Projection = 19.98%
2022-03-15 15:05:07 Acc using 45 px 2D Projection = 25.00%
2022-03-15 15:05:07 Acc using 50 px 2D Projection = 28.67%
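
For readers unfamiliar with the "Acc using N px 2D Projection" lines in the log, here is a minimal sketch of how a metric of that kind is usually computed. It is an illustration, not the repository's evaluation code. It assumes each sample's error is the mean pixel distance between predicted and ground-truth projected points, and that a sample counts as correct when that error is at or below the threshold. The function name `projection_accuracy` and the `(N, K, 2)` array layout are hypothetical.

```python
import numpy as np

def projection_accuracy(pred_px, gt_px, thresholds=range(5, 55, 5)):
    """Percentage of samples whose mean 2D projection error is within each threshold.

    pred_px, gt_px: arrays of shape (N, K, 2) holding N samples with
    K projected points each, in pixel coordinates (assumed layout).
    """
    pred_px = np.asarray(pred_px, dtype=float)
    gt_px = np.asarray(gt_px, dtype=float)
    # Euclidean distance per point, then averaged over the K points of a sample.
    errors = np.linalg.norm(pred_px - gt_px, axis=2).mean(axis=1)
    # Assumption: a sample is correct when its error is <= the threshold.
    return {t: 100.0 * float(np.mean(errors <= t)) for t in thresholds}

# Four synthetic samples whose points are shifted by 3, 12, 47 and 60 px in x.
gt = np.zeros((4, 9, 2))
pred = gt.copy()
pred[:, :, 0] += np.array([3.0, 12.0, 47.0, 60.0])[:, None]
acc = projection_accuracy(pred, gt)
```

Under this definition, accuracy that is near zero at small thresholds but rises at 45 to 50 px means the predicted points land in roughly the right region of the image but are consistently off by tens of pixels.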

xiaoxiongmaoxuexi commented 2 years ago

The cup is the dataset I made myself. Thank you for your answer.
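
Since the cup labels were made by hand, one thing worth ruling out is a malformed label file. The sketch below is a hypothetical sanity check, not part of the repository. It assumes each object is described by 21 values: a class id, nine (x, y) pairs normalized by image width and height, and an x range and y range. It also assumes a multi-object label is simply several such groups in sequence. The function name `check_label_values` and the "larger than 2 means pixel units" heuristic are assumptions for illustration.

```python
def check_label_values(values, num_classes, num_points=9):
    """Return a list of human-readable problems found in a flat list of label values."""
    per_obj = 1 + 2 * num_points + 2  # class id + keypoints + x/y range (assumed layout)
    if len(values) == 0 or len(values) % per_obj != 0:
        return [f"expected a non-zero multiple of {per_obj} values, got {len(values)}"]

    problems = []
    for start in range(0, len(values), per_obj):
        idx = start // per_obj
        obj = values[start:start + per_obj]
        cls = obj[0]
        if cls != int(cls) or not 0 <= int(cls) < num_classes:
            problems.append(f"object {idx}: class id {cls} is outside [0, {num_classes})")
        coords = obj[1:1 + 2 * num_points]
        # Heuristic: normalized coordinates stay near [0, 1]; large values suggest pixels.
        if any(abs(c) > 2.0 for c in coords):
            problems.append(f"object {idx}: coordinates look like pixels, not normalized values")
        if obj[-2] <= 0 or obj[-1] <= 0:
            problems.append(f"object {idx}: x/y range must be positive")
    return problems

good = [0] + [0.5] * 18 + [0.2, 0.3]
bad = [7] + [320.0] * 18 + [0.2, 0.3]
print(check_label_values(good, num_classes=5))
print(check_label_values(bad, num_classes=5))
```

A class id that does not match the index used in the multi-object configuration, or coordinates left in pixel units, would both be consistent with an object that is detected visually but scores 0% on the projection metric.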

xiaoxiongmaoxuexi commented 2 years ago

Another problem: when I visualize the trained model, the object is recognized in the same background that was used to create the dataset. In other backgrounds, recognition is very poor or fails entirely.
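
A model that only works against the background it was trained on is a common sign of overfitting to the scene. One widely used mitigation is to paste the masked object onto varied background images during training. Whether that applies here is not confirmed in this thread. The sketch below shows the basic compositing step with NumPy. The function name `replace_background` is hypothetical, and it assumes a binary object mask is available for each training image.

```python
import numpy as np

def replace_background(image, mask, background):
    """Keep object pixels from `image` and take all other pixels from `background`.

    image, background: uint8 arrays of shape (H, W, 3).
    mask: array of shape (H, W), non-zero where the object is (assumed to exist).
    """
    if image.shape != background.shape or image.shape[:2] != mask.shape:
        raise ValueError("image, mask and background must share the same height and width")
    object_pixels = (mask > 0)[..., None]  # shape (H, W, 1), broadcast over channels
    return np.where(object_pixels, image, background)

# Tiny synthetic example: a 4x4 image with a single object pixel at (0, 0).
image = np.full((4, 4, 3), 200, dtype=np.uint8)
background = np.full((4, 4, 3), 10, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
mask[0, 0] = 1
composite = replace_background(image, mask, background)
```

In practice the background would be drawn at random from a pool of unrelated images and resized to match the training image before compositing.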

zczczdc commented 2 years ago

Hey, did you succeed in drawing the bounding boxes for multiple objects in a single image? When I run valid.py from the source, the visualization works, but valid_multi.py does not display any visualization. Have you encountered this problem? Thanks for your answer.

asd646942825 commented 2 years ago

Why does glue perform so well in your multi-object recognition? When I trained, I could not reproduce the original author's results at all. Is your dataset loading slow? Thank you for your reply.