Hello, author!
I have been working on 2D object detection (e.g. YOLOv8), so I know little about monocular 3D object detection models.
As a result, the relationship between the dataset and the model in the 3D detection domain seems complicated to me.
I'd like to run inference on images from my own dataset (captured with a different monocular camera) or on a real-time monocular camera stream from a vehicle, instead of on the KITTI dataset.
Must I annotate my own dataset in the KITTI format? And if the fixed mounting position of the camera is changed, will the model still work?