ruhyadi / YOLO3D

YOLO 3D Object Detection for Autonomous Driving Vehicle
https://ruhyadi.github.io/project/computer-vision/yolo3d

Inference on highway scene #9

Closed pdd-vn closed 2 years ago

pdd-vn commented 2 years ago

I have tried the model (both your pretrained model and my own trained model) on highway scenes (one from the testing set, the other from Google Images), and the inference results are rather poor. Do you have any ideas on how I can improve them?

ruhyadi commented 2 years ago

Of course the results are not very good. This is because the training dataset I use is KITTI, where every image is captured from a camera mounted above the car dashboard. The model therefore assumes the object has zero pitch and roll relative to the camera. Also, does the image you use have a calibration file? I don't think so, so the model cannot place the bounding box accurately.
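For context on why the calibration file matters: the 3D box is drawn by projecting its eight corners onto the image with the camera projection matrix (P2 in KITTI). Below is a rough sketch of that projection using KITTI's camera-frame conventions; the function names are illustrative, not this repo's API.

```python
import numpy as np

def box3d_corners(h, w, l, x, y, z, ry):
    """8 corners of a 3D box in camera coordinates (KITTI convention:
    y points down, the box origin is at the bottom-center of the object)."""
    x_c = [ l/2,  l/2, -l/2, -l/2,  l/2,  l/2, -l/2, -l/2]
    y_c = [ 0.0,  0.0,  0.0,  0.0,  -h,   -h,   -h,   -h ]
    z_c = [ w/2, -w/2, -w/2,  w/2,  w/2, -w/2, -w/2,  w/2]
    R = np.array([[ np.cos(ry), 0, np.sin(ry)],
                  [ 0,          1, 0         ],
                  [-np.sin(ry), 0, np.cos(ry)]])
    corners = R @ np.array([x_c, y_c, z_c])          # rotate around y (yaw)
    return (corners + np.array([[x], [y], [z]])).T   # translate -> 8x3

def project_to_image(pts_3d, P):
    """Project Nx3 camera-frame points to pixels with a 3x4 matrix P (e.g. KITTI P2)."""
    pts_hom = np.hstack([pts_3d, np.ones((pts_3d.shape[0], 1))])  # Nx4 homogeneous
    pts_2d = pts_hom @ P.T                                        # Nx3
    return pts_2d[:, :2] / pts_2d[:, 2:3]                         # divide by depth
```

Without a calibration file there is no valid P, so the projected box is only a guess.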

pdd-vn commented 2 years ago

@ruhyadi Let's put the calibration problem aside; what I am talking about is the bounding-box shape. Here, the 3D box shape seems correct, but here it is unacceptable, even though the viewpoint is fairly similar to the KITTI samples.

ruhyadi commented 2 years ago

I sincerely apologize for the late answer. I agree that the 3D bounding box does not properly enclose the object. I use KITTI training data, which has dimensions of 1382 x 512 pixels; it would be great if you used the same dimensions when running inference. Also, regressing a 3D bounding box from a 2D bounding box is not very accurate. You could use other methods, such as keypoint estimation that is then regressed into a 3D bounding box. Please check Banconxuan/RTM3D
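As a concrete illustration of the resizing suggestion, here is a minimal OpenCV sketch that resizes an input frame to the training resolution before inference. The file name and the exact preprocessing are assumptions, not necessarily how this repo's pipeline works.

```python
import cv2

# Resize the inference image to the training resolution mentioned above (1382 x 512).
img = cv2.imread("highway.jpg")          # hypothetical input frame
h0, w0 = img.shape[:2]
target_w, target_h = 1382, 512
resized = cv2.resize(img, (target_w, target_h), interpolation=cv2.INTER_LINEAR)

# Note: resizing changes the effective camera intrinsics, so if a calibration
# matrix is used for the 3D projection, fx/cx should be scaled by sx and fy/cy by sy.
sx, sy = target_w / w0, target_h / h0
```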