facebookresearch / OrienterNet

Source Code for Paper "OrienterNet Visual Localization in 2D Public Maps with Neural Matching"

Generalize to Nuscenes is not good #26

Open SiYLin opened 9 months ago

SiYLin commented 9 months ago

Hi, thanks for the great work. I recently tried applying your work to various autonomous driving datasets and found that the performance is much lower than what the paper reports for datasets such as KITTI, under the same maximum initial error. For example, the XY position recall at 5m is only around 40% under a 32m initial error.

sarlinpe commented 9 months ago
  1. Do you properly gravity-rectify and undistort the images? Do the gravity estimates look sufficiently accurate?
  2. Are you sure that the ground truth poses of these datasets are sufficiently accurate for such evaluation?
SiYLin commented 9 months ago

Thank you for your response. I appreciate your diligence in this matter. On reflection, my initial description may have been mistaken.

I conducted tests on the KITTI dataset and the results aligned well with the numbers reported in your paper. However, when I applied the same tests to the Nuscenes dataset, I observed a significant drop in the XY position recall at 5m, which fell to 48% under a maximum initial error of 32m.
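For readers comparing numbers, the position recall discussed here can be sketched as the fraction of samples whose XY error falls within a threshold. This is a minimal illustration, not the repository's evaluation code: the function name `xy_recall`, the threshold set, and the toy coordinates are all assumptions.

```python
import math

def xy_recall(pred_xy, gt_xy, thresholds=(1.0, 3.0, 5.0)):
    """Fraction of samples whose XY position error (meters) is within each threshold.

    pred_xy, gt_xy: equal-length lists of (x, y) tuples in the same metric frame.
    The thresholds are illustrative; pick the ones your evaluation reports.
    """
    errors = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(pred_xy, gt_xy)
    ]
    return {t: sum(e <= t for e in errors) / len(errors) for t in thresholds}

# Toy data (hypothetical): four predictions against a ground truth at the origin.
pred = [(0.0, 0.0), (3.0, 4.0), (10.0, 0.0), (1.0, 1.0)]
gt = [(0.0, 0.0)] * 4
recall = xy_recall(pred, gt)
```

Computing the metric the same way on both datasets helps rule out an evaluation mismatch before blaming the model.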

I have verified the roll and pitch that we use to rectify the camera images (similar to the process you used for the KITTI dataset: pixel to camera, camera to vehicle, and vehicle to world). For training, I only use the front camera data.

sarlinpe commented 9 months ago
  1. So you actually do retrain the model on the Nuscenes dataset? Do you initialize it with the model trained on Mapillary? How large is the dataset? Are you sure that it is not simply overfitting?
  2. I am not familiar with the Nuscenes dataset, but it may simply be more difficult than KITTI if it has sparser distinctive semantic elements (more data from highways, less from city & residential areas) or if OSM is of lower quality in these areas (incorrect or fewer labels).
SiYLin commented 9 months ago
  1. Yes, I retrained on Nuscenes.
  2. No, I trained the model from scratch. The Nuscenes dataset contains around 30k training samples and 6k validation samples.
sarlinpe commented 9 months ago

This is likely too small. If the training and validation data are in disjoint areas, you should see clear overfitting.
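One way to make sure the training and validation areas really are disjoint is to assign whole map tiles to one split or the other. The sketch below is only an illustration under assumptions: the helper `split_by_tile`, the 500 m tile size, and the tile-to-split rule are hypothetical and not part of OrienterNet or the Nuscenes devkit.

```python
import math

def split_by_tile(samples, tile_size_m=500.0, val_every=5):
    """Split samples into train/val so that no map tile contributes to both.

    samples: list of (sample_id, x, y) with x, y in a metric map frame.
    tile_size_m and val_every are illustrative values, not recommendations.
    """
    train, val = [], []
    for sample_id, x, y in samples:
        tile = (math.floor(x / tile_size_m), math.floor(y / tile_size_m))
        # Deterministic assignment: roughly one tile in val_every goes to validation.
        is_val = (tile[0] + 3 * tile[1]) % val_every == 0
        (val if is_val else train).append(sample_id)
    return train, val

# Toy positions (hypothetical): two samples in tile (0, 0), two in tile (1, 0).
samples = [("a", 10.0, 10.0), ("b", 20.0, 490.0), ("c", 510.0, 10.0), ("d", 990.0, 499.0)]
train_ids, val_ids = split_by_tile(samples)
```

Samples near a tile border can still see the same street from both splits, so a margin between tiles may be needed for a strict separation.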

SiYLin commented 9 months ago

Thanks for your reply! It does overfit the training dataset: the recall at half a meter is 100% on the training set. May I know how much data is generally needed to train this model?

sarlinpe commented 9 months ago

The more the better. Try initializing your model with the pre-trained Mapillary model that we provide.

jike5 commented 4 months ago

Hello SiYLin, I've also been trying to port OrienterNet to NuScenes recently. I wanted to ask how you obtain the GPS information. As far as I know, NuScenes only provides ground truth poses and doesn't include raw, noisy GPS measurements. Thank you very much!

SiYLin commented 4 months ago

Hi: You can try adding some random noise to the ground truth pose to simulate the scenario in which you have noisy GPS information.
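A minimal sketch of that idea, assuming the prior is drawn uniformly within the maximum initial error on each axis. The function `perturb_pose`, its parameters, and the uniform noise model are assumptions for illustration; real GPS noise is neither uniform nor independent between frames.

```python
import random

def perturb_pose(x, y, yaw_deg, max_xy_error_m=32.0, max_yaw_error_deg=0.0, rng=random):
    """Return a noisy (x, y, yaw) prior derived from a ground truth pose.

    The offset is sampled uniformly in [-max_xy_error_m, max_xy_error_m] on each
    axis (an assumed noise model). The 32 m default mirrors the maximum initial
    error mentioned in this thread.
    """
    dx = rng.uniform(-max_xy_error_m, max_xy_error_m)
    dy = rng.uniform(-max_xy_error_m, max_xy_error_m)
    dyaw = rng.uniform(-max_yaw_error_deg, max_yaw_error_deg)
    return x + dx, y + dy, (yaw_deg + dyaw) % 360.0

# Hypothetical ground truth pose in a metric map frame; seeded for reproducibility.
rng = random.Random(0)
noisy_x, noisy_y, noisy_yaw = perturb_pose(100.0, 200.0, 90.0, rng=rng)
```

Seeding the generator per sample keeps the simulated priors identical across evaluation runs, so recall numbers stay comparable.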

jike5 commented 4 months ago

Got it, that's exactly what I'm doing now🤣, thanks.