Open WSTao opened 3 years ago
Hi, the problem you mentioned is indeed very difficult to solve, because a 3D-based detection method will not transfer across image data from different cameras, especially for depth estimation. One solution is to recover depth based on the focal length, or to use a 2D-based detection method that does not predict the 3D location directly, such as YOLO3D.
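To illustrate the focal-length idea, here is a minimal sketch under a plain pinhole-camera assumption. It is not code from this repo. The function name `rescale_depth` and the focal lengths are made up for illustration, and it assumes the image is fed to the network at the same pixel scale as in training. An object's pixel size is proportional to f / Z, so a network trained with focal length `f_train` will misjudge depth by the factor `f_train / f_new` on a camera with focal length `f_new`; multiplying by `f_new / f_train` undoes that.

```python
def rescale_depth(z_pred: float, f_train: float, f_new: float) -> float:
    """Correct a predicted depth for a change of focal length (pinhole model).

    z_pred  : depth in meters predicted by a model trained with f_train
    f_train : focal length in pixels of the training camera (assumed known)
    f_new   : focal length in pixels of the deployment camera (assumed known)
    """
    if f_train <= 0 or f_new <= 0:
        raise ValueError("focal lengths must be positive")
    # Pixel size ~ f / Z, so the true depth scales linearly with the focal length.
    return z_pred * f_new / f_train


# Hypothetical numbers: the new camera has twice the focal length,
# so objects look twice as large and the raw prediction is half the true depth.
print(rescale_depth(20.0, f_train=700.0, f_new=1400.0))
```

This only compensates for the focal length. Differences in mounting height, pitch, or lens distortion are not handled and would still need retraining or further calibration.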
Thanks. What are the advantages of the latest end-to-end model (SMOKE) over the previous version, YOLO3D?
Calculating the center point and distance from 2D information requires some priors and assumptions, and in fact such relationships can be learned from data (given enough data). So an end-to-end model can achieve higher performance on a specific dataset.
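As a rough example of what "priors and assumptions" means here, the sketch below estimates distance from a 2D box using an assumed real-world object height and then back-projects the box center. This is a generic pinhole-geometry illustration, not the actual YOLO3D or SMOKE code. The function name, the height prior of 1.5 m, and the intrinsics are all hypothetical.

```python
def locate_from_2d_box(u, v, box_height_px, fx, fy, cx, cy, prior_height_m=1.5):
    """Estimate a 3D center (X, Y, Z) in camera coordinates from a 2D box.

    u, v           : pixel coordinates of the 2D box center
    box_height_px  : height of the 2D box in pixels
    fx, fy, cx, cy : pinhole intrinsics in pixels (assumed known)
    prior_height_m : assumed real height of the object class (the prior)
    """
    if box_height_px <= 0:
        raise ValueError("box height must be positive")
    # Similar triangles: box_height_px / fy = prior_height_m / Z
    z = fy * prior_height_m / box_height_px
    # Back-project the pixel center to camera coordinates at depth Z.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return x, y, z


# Hypothetical camera and detection: a 50 px tall box, 100 px right of center.
print(locate_from_2d_box(740.0, 360.0, 50.0, fx=1000.0, fy=1000.0, cx=640.0, cy=360.0))
```

The result is only as good as the height prior and the assumption that the box spans the full object. An end-to-end model learns this mapping from data instead, which is why it can do better on the dataset it was trained on.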
Training this model requires the camera's intrinsic and extrinsic parameters to be the same across the data. If they differ, do I need to train separately or recreate the dataset? Does this mean I cannot mix the KITTI dataset and the nuScenes dataset for training?
I am not familiar with the nuScenes image data, but the resolution of the KITTI data differs a lot from that of our camera. Maybe you can train the model on nuScenes only, or try the Waymo image data...
OK, thank you.
Monocular 3D detection depends on the camera parameters. If you switch to a different camera or mounting setup, a model trained on the original dataset will not work. So how can this difference be handled?