Closed pdd-vn closed 2 years ago
The poor results are expected. I trained the model on the KITTI dataset, whose images are all captured from a camera mounted above the car's dashboard, so the model assumes that objects have zero pitch and roll relative to the camera. Also, do the images you used come with a calibration file? I suspect not, and without one the model cannot place the bounding box accurately.
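To illustrate why the calibration file and the zero pitch/roll assumption matter, here is a minimal sketch of how a KITTI-style 3D box is usually projected into the image. It is not taken from this repository: the function names and the example projection matrix `P` are hypothetical, and the corner layout follows the common KITTI convention (camera frame with x right, y down, z forward, box origin at the bottom center, rotation about the y axis only).

```python
import math

def box3d_corners(dims, location, ry):
    """Return the 8 corners of a box in camera coordinates.

    dims = (h, w, l) in meters, location = bottom-center (x, y, z),
    ry = yaw around the camera y axis. Pitch and roll are assumed to be
    zero, which is the assumption discussed above.
    """
    h, w, l = dims
    x, y, z = location
    xs = [l / 2, l / 2, -l / 2, -l / 2, l / 2, l / 2, -l / 2, -l / 2]
    ys = [0.0, 0.0, 0.0, 0.0, -h, -h, -h, -h]
    zs = [w / 2, -w / 2, -w / 2, w / 2, w / 2, -w / 2, -w / 2, w / 2]
    c, s = math.cos(ry), math.sin(ry)
    # Rotate each corner about the y axis, then translate to the location.
    return [(c * cx + s * cz + x, cy + y, -s * cx + c * cz + z)
            for cx, cy, cz in zip(xs, ys, zs)]

def project(P, point):
    """Project a 3D camera-frame point to pixels with a 3x4 matrix P."""
    X, Y, Z = point
    u, v, d = (row[0] * X + row[1] * Y + row[2] * Z + row[3] for row in P)
    return (u / d, v / d)

# Hypothetical intrinsics, for illustration only. Real values must come
# from the calibration file of the camera that took the image.
P = [[700.0, 0.0, 600.0, 0.0],
     [0.0, 700.0, 180.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]

corners = box3d_corners((1.5, 1.6, 4.0), (0.0, 1.0, 20.0), 0.0)
pixels = [project(P, corner) for corner in corners]
```

If `P` does not match the camera that produced the image, every projected corner shifts, so the drawn box will not line up with the object even when the predicted dimensions and yaw are reasonable.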
I sincerely apologize for the late reply. I agree that the 3D bounding box does not properly enclose the object. My KITTI training data has dimensions of 1382 x 512 pixels, so it would help to use the same dimensions at inference time. Also, deriving a 3D bounding box from a 2D bounding box is not very accurate. You could try other methods, such as keypoint estimation, where the keypoints are then regressed into a 3D bounding box. Please check Banconxuan/RTM3D.
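One possible way to match the 1382 x 512 training size without stretching the image is to scale it to fit and pad the rest. The sketch below only computes the numbers involved. The helper names are hypothetical and the letterbox approach is an assumption, not something this repository is known to do. It also adjusts pinhole intrinsics, since resizing and padding change the focal length and principal point in pixel units.

```python
def fit_to_training_size(width, height, target_w=1382, target_h=512):
    """Compute scale, resized size, and padding to fit an image into
    the target size while preserving its aspect ratio."""
    scale = min(target_w / width, target_h / height)
    new_w = round(width * scale)
    new_h = round(height * scale)
    # Split the leftover space evenly; any odd pixel goes to the far side.
    pad_x = (target_w - new_w) // 2
    pad_y = (target_h - new_h) // 2
    return scale, (new_w, new_h), (pad_x, pad_y)

def adjust_intrinsics(fx, fy, cx, cy, scale, pad):
    """Update pinhole intrinsics after a uniform resize plus padding."""
    pad_x, pad_y = pad
    return fx * scale, fy * scale, cx * scale + pad_x, cy * scale + pad_y

# Example: a 1920 x 1080 frame, such as a picture found online.
scale, new_size, pad = fit_to_training_size(1920, 1080)
```

The actual resize can then be done with whatever image library is already in use, using `new_size` and `pad` from above.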
I have tried the model (both your trained model and mine) on highway scenes (one from the testing set and the other from Google Images), and the inference results are pretty bad. Do you have any ideas on how I can improve this?