Unable to replicate results

triple-tam commented 3 years ago

Hello Tianwei, I am unable to replicate the detection results (and hence the tracking results) on the nuScenes v1.0-mini dataset. I ran this command:

/tools/dist_test.py ./configs/nusc/voxelnet/nusc_centerpoint_voxelnet_0075voxel_dcn_flip.py --work_dir /path/ --checkpoint /path/ --speed_test

and expected approximately mAP 59.5 and NDS 67.4. However, I get mAP 42.2 and NDS 50.5. I understand that the train/val split is different, so some deviation is to be expected, but this is a rather large difference. Am I making an error, or could you share additional insight? Thank you!

Full output log included below:

2021-02-16 10:07:11.373270: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
No Tensorflow
2021-02-16 10:07:43,115 - INFO - Distributed testing: False
2021-02-16 10:07:43,115 - INFO - torch.backends.cudnn.benchmark: False
2021-02-16 10:07:43,721 - INFO - Finish RPN Initialization
2021-02-16 10:07:43,721 - INFO - num_classes: [1, 2, 2, 1, 2, 2]
Use HM Bias:  -2.19
Use Deformable Convolution in the CenterHead!
2021-02-16 10:07:43,850 - INFO - Finish CenterHead Initialization
Use Val Set
10
2021-02-16 10:09:10,169 - INFO - work dir: work_dirs/nusc_0075_dcn_flip_tam
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 81/81, 0.4 task/s, elapsed: 191s, ETA:     0s
 Total time per frame:  2.007547237254955
======
Loading NuScenes tables for version v1.0-mini...
23 category,
8 attribute,
4 visibility,
911 instance,
12 sensor,
120 calibrated_sensor,
31206 ego_pose,
8 log,
10 scene,
404 sample,
31206 sample_data,
18538 sample_annotation,
4 map,
Done loading in 3.6 seconds.
======
Reverse indexing ...
Done reverse indexing in 0.1 seconds.
======
Finish generate predictions for testset, save to work_dirs/nusc_0075_dcn_flip_tam/infos_val_10sweeps_withvelo_filter_True.json
Initializing nuScenes detection evaluation
Loaded results from work_dirs/nusc_0075_dcn_flip_tam/infos_val_10sweeps_withvelo_filter_True.json. Found detections for 81 samples.
Loading annotations for mini_val split from nuScenes version: v1.0-mini
100%|██████████████████████████████████████████| 81/81 [00:00<00:00, 375.76it/s]
Loaded ground truth annotations for 81 samples.
Filtering predictions
=> Original number of boxes: 13497
=> After distance based filtering: 11015
=> After LIDAR points based filtering: 11015
=> After bike rack filtering: 10951
Filtering ground truth annotations
=> Original number of boxes: 4441
=> After distance based filtering: 3785
=> After LIDAR points based filtering: 3393
=> After bike rack filtering: 3393
Rendering sample token b6c420c3a5bd4a219b1cb82ee5ea0aa7
Rendering sample token b22fa0b3c34f47b6a360b60f35d5d567
Rendering sample token d8251bbc2105497ab8ec80827d4429aa
Rendering sample token 372725a4b00e49c78d6d0b1c4a38b6e0
Rendering sample token ce94ef7a0522468e81c0e2b3a2f1e12d
Rendering sample token 0d0700a2284e477db876c3ee1d864668
Rendering sample token 61a7bd24f88a46c2963280d8b13ac675
Rendering sample token fa65a298c01f44e7a182bbf9e5fe3697
Rendering sample token 8573a885a7cb41d185c05029eeb9a54e
Rendering sample token 38a28a3aaf2647f2a8c0e90e31267bf8
Accumulating metric data...
Calculating metrics...
Rendering PR and TP curves
Saving metrics to: work_dirs/nusc_0075_dcn_flip_tam
mAP: 0.4219
mATE: 0.4324
mASE: 0.4403
mAOE: 0.4813
mAVE: 0.3997
mAAE: 0.3070
NDS: 0.5049
Eval time: 4.0s

Per-class results:
Object Class    AP  ATE ASE AOE AVE AAE
car 0.627   0.193   0.159   0.140   0.126   0.083
truck   0.590   0.145   0.147   0.105   0.073   0.004
bus 0.985   0.198   0.130   0.022   0.435   0.246
trailer 0.000   1.000   1.000   1.000   1.000   1.000
construction_vehicle    0.000   1.000   1.000   1.000   1.000   1.000
pedestrian  0.712   0.179   0.246   0.305   0.208   0.123
motorcycle  0.615   0.289   0.260   0.395   0.056   0.000
bicycle 0.364   0.222   0.182   0.364   0.299   0.000
traffic_cone    0.326   0.097   0.279   nan nan nan
barrier 0.000   1.000   1.000   1.000   nan nan
Evaluation nusc: Nusc v1.0-mini Evaluation
car Nusc dist AP@0.5, 1.0, 2.0, 4.0
54.39, 61.88, 63.84, 70.70 mean AP: 0.6270105648056109
truck Nusc dist AP@0.5, 1.0, 2.0, 4.0
57.14, 59.25, 59.51, 59.95 mean AP: 0.5896504511543077
construction_vehicle Nusc dist AP@0.5, 1.0, 2.0, 4.0
0.00, 0.00, 0.00, 0.00 mean AP: 0.0
bus Nusc dist AP@0.5, 1.0, 2.0, 4.0
98.54, 98.54, 98.54, 98.54 mean AP: 0.9854423344235611
trailer Nusc dist AP@0.5, 1.0, 2.0, 4.0
0.00, 0.00, 0.00, 0.00 mean AP: 0.0
barrier Nusc dist AP@0.5, 1.0, 2.0, 4.0
0.00, 0.00, 0.00, 0.00 mean AP: 0.0
motorcycle Nusc dist AP@0.5, 1.0, 2.0, 4.0
50.50, 58.02, 63.45, 74.16 mean AP: 0.6153361293850497
bicycle Nusc dist AP@0.5, 1.0, 2.0, 4.0
33.72, 37.27, 37.27, 37.44 mean AP: 0.36423471250997785
pedestrian Nusc dist AP@0.5, 1.0, 2.0, 4.0
61.62, 67.54, 74.61, 81.04 mean AP: 0.7120199979990003
traffic_cone Nusc dist AP@0.5, 1.0, 2.0, 4.0
31.62, 32.70, 32.70, 33.31 mean AP: 0.3258054576587581

tianweiy commented 3 years ago

Everything, including the logs and prediction files, is available in the links. Please try the full validation set and let me know if you still can't reproduce the results. I don't have the mini subset available to test at the moment.

tianweiy commented 3 years ago

trailer 0.000 1.000 1.000 1.000 1.000 1.000
construction_vehicle 0.000 1.000 1.000 1.000 1.000 1.000
barrier 0.000 1.000 1.000 1.000 nan nan

It seems that there are no trailer, construction_vehicle, or barrier objects in this mini subset, so their AP is zero, which pulls down the overall mAP and NDS. I think the performance is quite reasonable.
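
A quick check of the numbers in the posted log supports this. The sketch below copies the per-class AP values from that log, averages them over all ten classes (which reproduces the reported mAP), and then averages over only the seven classes with non-zero AP. It also recomputes NDS from the reported mean errors using the nuScenes definition, NDS = (5 * mAP + sum of (1 - min(1, error)) over the five mean TP errors) / 10. The seven-class average is an illustration only and is not a metric the devkit reports; treating "non-zero AP" as "class present in the subset" is an assumption.

```python
# Per-class AP values copied from the evaluation log posted in this thread.
per_class_ap = {
    "car": 0.627,
    "truck": 0.590,
    "bus": 0.985,
    "trailer": 0.000,
    "construction_vehicle": 0.000,
    "pedestrian": 0.712,
    "motorcycle": 0.615,
    "bicycle": 0.364,
    "traffic_cone": 0.326,
    "barrier": 0.000,
}

# mAP as reported: a plain average over all ten detection classes.
map_all = sum(per_class_ap.values()) / len(per_class_ap)

# Illustration only: average over the classes that have a non-zero AP.
nonzero_aps = [ap for ap in per_class_ap.values() if ap > 0]
map_nonzero = sum(nonzero_aps) / len(nonzero_aps)

# Mean true-positive errors and mAP copied from the same log.
tp_errors = {"mATE": 0.4324, "mASE": 0.4403, "mAOE": 0.4813,
             "mAVE": 0.3997, "mAAE": 0.3070}
reported_map = 0.4219

# nuScenes detection score: mAP weighted 5x plus five clipped TP scores.
nds = (5 * reported_map
       + sum(1 - min(1.0, err) for err in tp_errors.values())) / 10

print(f"mAP over all 10 classes:      {map_all:.4f}")
print(f"mAP over {len(nonzero_aps)} non-zero classes:  {map_nonzero:.4f}")
print(f"NDS recomputed from the log:  {nds:.4f}")
```

Averaged over the seven classes with detections, the AP comes out at roughly 0.60, whereas the three all-zero classes pull the ten-class average down to 0.42. The seven-class figure is not directly comparable to the full validation set result, since the mini subset contains only 81 validation samples.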

triple-tam commented 3 years ago

It is not possible for me to download the whole dataset (300 GB), but I appreciate your insight! Thank you

siddharthKatageri commented 2 years ago

Hi @triple-tam & @tianweiy, I am not able to get the evaluation to run on the nuScenes v1.0-mini dataset. I ran the same command as @triple-tam, i.e.

python ./tools/dist_test.py ./configs/nusc/voxelnet/nusc_centerpoint_voxelnet_0075voxel_dcn_flip.py --work_dir /scratch/sidd/output/nusc_centerpoint_voxelnet_0075voxel_dcn_flip/ --checkpoint /scratch/sidd/checkpoint/voxel_dcn_flip.pth --speed_test

The error says there is no val split for the v1.0-mini version. But in @triple-tam's output, I see that the ground truth is loaded from the mini_val split. Did you change anything in the code to load from the mini_val split? If so, could you please tell me where I should make changes to run the evaluation on the mini_val split?

Here is the output and the error I get:

no apex
No Tensorflow
2022-04-19 16:47:36,388 - INFO - Distributed testing: False
2022-04-19 16:47:36,388 - INFO - torch.backends.cudnn.benchmark: False
2022-04-19 16:47:36,466 - INFO - Finish RPN Initialization
2022-04-19 16:47:36,467 - INFO - num_classes: [1, 2, 2, 1, 2, 2]
Use HM Bias:  -2.19
Use Deformable Convolution in the CenterHead!
2022-04-19 16:47:36,518 - INFO - Finish CenterHead Initialization
Use Val Set
10
2022-04-19 16:47:38,699 - INFO - work dir: /scratch/sidd/output/nusc_centerpoint_voxelnet_0075voxel_dcn_flip/
[>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 81/81, 2.1 task/s, elapsed: 39s, ETA:     0s
 Total time per frame:  0.3941149005183467
======
Loading NuScenes tables for version v1.0-mini...
23 category,
8 attribute,
4 visibility,
911 instance,
12 sensor,
120 calibrated_sensor,
31206 ego_pose,
8 log,
10 scene,
404 sample,
31206 sample_data,
18538 sample_annotation,
4 map,
Done loading in 0.9 seconds.
======
Reverse indexing ...
Done reverse indexing in 0.1 seconds.
======
Finish generate predictions for testset, save to /scratch/sidd/output/nusc_centerpoint_voxelnet_0075voxel_dcn_flip/infos_val_10sweeps_withvelo_filter_True.json
Initializing nuScenes detection evaluation
Loaded results from /scratch/sidd/output/nusc_centerpoint_voxelnet_0075voxel_dcn_flip/infos_val_10sweeps_withvelo_filter_True.json. Found detections for 81 samples.
Loading annotations for val split from nuScenes version: v1.0-mini
Traceback (most recent call last):
  File "./tools/dist_test.py", line 211, in <module>
    main()
  File "./tools/dist_test.py", line 201, in main
    result_dict, _ = dataset.evaluation(copy.deepcopy(predictions), output_dir=args.work_dir, testset=args.testset)
  File "/home2/siddharth/fresh/CenterPoint/det3d/datasets/nuscenes/nuscenes.py", line 296, in evaluation
    output_dir,
  File "/home2/siddharth/fresh/CenterPoint/det3d/datasets/nuscenes/nusc_common.py", line 620, in eval_main
    verbose=True,
  File "/home2/siddharth/CenterPoint/nuscenes-devkit/python-sdk/nuscenes/eval/detection/evaluate.py", line 82, in __init__
    self.gt_boxes = load_gt(self.nusc, self.eval_set, DetectionBox, verbose=verbose)
  File "/home2/siddharth/CenterPoint/nuscenes-devkit/python-sdk/nuscenes/eval/common/loaders.py", line 80, in load_gt
    'Error: Requested split {} which is not compatible with NuScenes version {}'.format(eval_split, version)
AssertionError: Error: Requested split val which is not compatible with NuScenes version v1.0-mini
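
The assertion is raised in the devkit's `load_gt`, which checks that the requested split is compatible with the loaded dataset version. The run above requests `val`, while @triple-tam's run requested `mini_val`. A minimal sketch of the version-to-split mapping that this check implies is shown below. The helper name `eval_split_for_version` and the dictionary are hypothetical and are not part of CenterPoint or the devkit. Where CenterPoint actually chooses the split name is not shown in this thread, so treat this as an illustration rather than a confirmed fix.

```python
# Hypothetical helper (not from CenterPoint or nuscenes-devkit): pick the
# validation split name that matches a given nuScenes dataset version.
VAL_SPLIT_BY_VERSION = {
    "v1.0-trainval": "val",    # full dataset: validation split is "val"
    "v1.0-mini": "mini_val",   # mini dataset: validation split is "mini_val"
}


def eval_split_for_version(version: str) -> str:
    """Return the validation split name for a nuScenes version string."""
    try:
        return VAL_SPLIT_BY_VERSION[version]
    except KeyError:
        raise ValueError(
            f"No validation split known for nuScenes version {version!r}"
        ) from None


print(eval_split_for_version("v1.0-mini"))
```

Passing the result of such a lookup as the evaluation split, instead of a fixed `val`, would request `mini_val` for a v1.0-mini dataset, which is the combination shown in @triple-tam's log.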

tianweiy / CenterPoint

Unable to replicate results #87