This research project implements a real-time object detection and pose estimation method as described in the paper, Tekin et al. "Real-Time Seamless Single Shot 6D Object Pose Prediction", CVPR 2018. (https://arxiv.org/abs/1711.08848).
Following the README, I created my own dataset with ObjectDatasetsTools, modified the attributes in obj.data (camera parameters, diameter, etc.), and got the following results from train.py:
-----------------------------------
tensor to cuda : 0.000346
forward pass : 0.012606
get_region_boxes : 0.003043
prediction time : 0.015995
eval : 0.112702
-----------------------------------
2023-02-15 11:57:06 Results of deli
2023-02-15 11:57:06 Acc using 5 px 2D Projection = 91.13%
2023-02-15 11:57:06 Acc using 10% threshold - 0.007801272195673763 vx 3D Transformation = 71.29%
2023-02-15 11:57:06 Acc using 5 cm 5 degree metric = 95.09%
2023-02-15 11:57:06 Mean 2D pixel error is 2.552845, Mean vertex error is 0.006240, mean corner error is 4.039984
2023-02-15 11:57:06 Translation error: 0.006158 m, angle error: 2.358766 degree, pixel error: 2.552845 pix
But when I run prediction on images from the same camera in real time, the results are always bad. I'm not sure where the problem is. Could you provide some help?
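One thing I can check on my side is whether the camera intrinsics used at prediction time match the resolution of the live frames. The snippet below is only a minimal sketch of that check, not code from the repository: `scale_intrinsics`, `project`, and all numeric values are hypothetical, and it assumes a plain pinhole model where the live frame is a pure resize (no crop) of the training resolution.

```python
import numpy as np

def scale_intrinsics(K, train_size, live_size):
    """Rescale a 3x3 pinhole intrinsic matrix from the training image size
    to the live frame size. Sizes are (width, height).
    Assumes the live frame is a pure resize of the training image (no crop)."""
    sx = live_size[0] / train_size[0]
    sy = live_size[1] / train_size[1]
    K_scaled = np.array(K, dtype=np.float64)  # np.array copies by default
    K_scaled[0, 0] *= sx  # focal length in x
    K_scaled[0, 2] *= sx  # principal point x
    K_scaled[1, 1] *= sy  # focal length in y
    K_scaled[1, 2] *= sy  # principal point y
    return K_scaled

def project(points_3d, K, R, t):
    """Project Nx3 object points to Nx2 pixel coordinates for pose (R, t)."""
    cam = points_3d @ R.T + t.reshape(1, 3)  # object frame -> camera frame
    uv = cam @ K.T                           # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]            # perspective divide

# Hypothetical numbers, for illustration only.
K_train = np.array([[600.0, 0.0, 320.0],
                    [0.0, 600.0, 240.0],
                    [0.0, 0.0, 1.0]])
K_live = scale_intrinsics(K_train, (640, 480), (1280, 720))

# A point on the optical axis, 1 m in front of the camera, should land on
# the principal point of the live frame.
center = project(np.zeros((1, 3)), K_live, np.eye(3), np.array([0.0, 0.0, 1.0]))
print(K_live)
print(center)
```

If the projected box corners only line up with the object after rescaling like this, the mismatch is in the intrinsics rather than in the trained weights.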