nutonomy / nuscenes-devkit

The devkit of the nuScenes dataset.
https://www.nuScenes.org

Filter Ground Truth Annotations by Front Camera / Run Evaluation on Front Camera #724

Closed abhi1kumar closed 2 years ago

abhi1kumar commented 2 years ago

Hi nuScenes Team, nuScenes uses data from all six cameras for evaluation. However, I want to run the nuScenes evaluation on the front camera only.

I use the following command to run the evaluation with the oracle labels on the val split:

python /home/abhinav/project/nuscenes-devkit/python-sdk/nuscenes/eval/detection/evaluate.py \
--version v1.0-trainval \
--eval_set val \
--plot_examples 0 \
--render_curves 0 \
--result_path ~/project/output/run_1100/oracle_label/submission.json

Here are the results:

mAP: 0.1700
mATE: 0.0039
mASE: 0.0159
mAOE: 0.0025
mAVE: 1.5243
mAAE: 1.0000
NDS: 0.3828
Eval time: 17.3s
Per-class results:
Object Class    AP  ATE ASE AOE AVE AAE
         car    0.178   0.004   0.016   0.003   2.775   1.000
       truck    0.200   0.004   0.009   0.003   1.827   1.000
         bus    0.322   0.004   0.009   0.003   2.953   1.000
     trailer    0.211   0.004   0.007   0.003   0.531   1.000
construction    0.122   0.004   0.011   0.002   0.105   1.000
  pedestrian    0.111   0.004   0.020   0.003   0.860   1.000
  motorcycle    0.233   0.004   0.017   0.003   2.283   1.000
     bicycle    0.089   0.004   0.026   0.003   0.861   1.000
traffic_cone    0.089   0.004   0.026   nan nan nan
     barrier    0.144   0.004   0.016   0.003   nan nan

Thus, the above command reports an mAP of roughly one-sixth, while the ideal mAP with the oracle labels should be 1. Therefore, I need to filter out the ground-truth boxes which do not show up in the front camera, and keep only the boxes which do.

I dug into the devkit and found that the evaluate function uses annotations from all cameras:

sample = nusc.get('sample', sample_token)
sample_annotation_tokens = sample['anns']

I did not know how to filter sample_annotation_tokens by the front camera. So I looked into the tutorial as well, but the following code from the tutorial does not give me annotations:

sensor = 'CAM_FRONT'
cam_front_data = nusc.get('sample_data', my_sample['data'][sensor])
cam_front_data
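I suspect something along the lines of get_sample_data is needed instead. A sketch of what I have in mind (as I understand it, the returned boxes are the sample annotations visible in this camera, transformed into the sensor frame; the box_vis_level argument is my assumption):

from nuscenes.utils.geometry_utils import BoxVisibility

# get_sample_data returns (data_path, boxes, camera_intrinsic) for the given sample_data token.
_, boxes, _ = nusc.get_sample_data(my_sample['data']['CAM_FRONT'],
                                   box_vis_level=BoxVisibility.ANY)
# Annotation tokens of the boxes visible in CAM_FRONT.
front_tokens = {box.token for box in boxes}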

Therefore, it would be awesome if you could help me filter/get the ground truths for the front camera and thereby run the evaluation only on the front-camera images in nuScenes.

abhi1kumar commented 2 years ago

I followed up on @holger-motional's answer and added the following between L114 and L115 of the load_gt function in loaders.py:

_, boxes, _ = nusc.get_sample_data(sample['data']["CAM_FRONT"], selected_anntokens=[sample_annotation_token])
# No boxes found with this annotation token, which means the box is not visible in this camera.
if len(boxes) == 0:
    continue
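For context, here is how the filter sits inside the annotation loop of load_gt (a sketch; the variable names follow the devkit, and the exact line numbers may differ across versions):

for sample_token in sample_tokens:
    sample = nusc.get('sample', sample_token)
    sample_annotation_tokens = sample['anns']
    for sample_annotation_token in sample_annotation_tokens:
        # Skip annotations that are not visible in the front camera.
        _, boxes, _ = nusc.get_sample_data(sample['data']["CAM_FRONT"],
                                           selected_anntokens=[sample_annotation_token])
        if len(boxes) == 0:
            continue
        # ... the original box-building code continues here ...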

After this change, running the evaluation with the oracle ground truths reports an mAP close to 1, but not exactly 1.

mAP: 0.9605
mATE: 0.0039
mASE: 0.0160
mAOE: 0.0026
mAVE: 1.5362
mAAE: 1.0000
NDS: 0.7780
Eval time: 24.8s

Per-class results:
Object Class    AP  ATE ASE AOE AVE AAE
         car    0.964   0.004   0.017   0.003   2.711   1.000
       truck    0.918   0.004   0.009   0.003   1.829   1.000
         bus    0.870   0.004   0.010   0.003   3.023   1.000
     trailer    0.981   0.004   0.007   0.003   0.540   1.000
construction    0.964   0.004   0.011   0.002   0.085   1.000
  pedestrian    0.976   0.004   0.020   0.003   0.854   1.000
  motorcycle    0.989   0.004   0.017   0.003   2.285   1.000
     bicycle    0.989   0.004   0.026   0.002   0.961   1.000
traffic_cone    0.978   0.004   0.027   nan nan nan
     barrier    0.975   0.004   0.016   0.003   nan nan

Would you mind telling me why the APs are not exactly one?

whyekit-motional commented 2 years ago

@abhi1kumar when you use the ground truths as the "predictions", you would also need to change this line: https://github.com/nutonomy/nuscenes-devkit/blob/f94dcd313feb8adc91e7d01312fb7d27cc77098e/python-sdk/nuscenes/eval/detection/evaluate.py#L80

The line should read:

self.pred_boxes, self.meta = load_gt(self.nusc, self.eval_set, DetectionBox, verbose=verbose), dict()

If you simply load the full ground truth (i.e. across all cameras) as the "predictions", the sample annotations from the other cameras might count as false positives, hence making the AP < 1.

Alternatively, if you do not wish to modify the code, you can write a method that filters ~/project/output/run_1100/oracle_label/submission.json for entries that correspond only to sample annotations that are visible in CAM_FRONT.
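A minimal sketch of such a filter, assuming the standard detection submission format ({"meta": ..., "results": {sample_token: [box, ...]}}) and that the oracle boxes coincide exactly with their sample annotations; the file names, the dataroot and the center-matching tolerance are illustrative assumptions:

import json
import numpy as np
from nuscenes.nuscenes import NuScenes

nusc = NuScenes(version='v1.0-trainval', dataroot='/data/sets/nuscenes', verbose=False)

with open('submission.json') as f:
    submission = json.load(f)

filtered_results = {}
for sample_token, boxes in submission['results'].items():
    sample = nusc.get('sample', sample_token)
    # Annotation tokens visible in the front camera for this sample.
    _, gt_boxes, _ = nusc.get_sample_data(sample['data']['CAM_FRONT'])
    # Global-frame centers of the visible annotations.
    centers = np.array([nusc.get('sample_annotation', b.token)['translation']
                        for b in gt_boxes]).reshape(-1, 3)
    kept = []
    for box in boxes:
        # Oracle boxes coincide with their annotations, so centers match
        # (up to floating point); real detections would need a looser criterion.
        dists = np.linalg.norm(centers - np.array(box['translation']), axis=1)
        if dists.size and dists.min() < 1e-6:
            kept.append(box)
    filtered_results[sample_token] = kept

submission['results'] = filtered_results
with open('submission_front.json', 'w') as f:
    json.dump(submission, f)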