question about specifications of evaluation

tusen-ai / SST

Code for a series of work in LiDAR perception, including SST (CVPR 22), FSD (NeurIPS 22), FSD++ (TPAMI 23), FSDv2, and CTRL (ICCV 23, oral).

Apache License 2.0


question about specifications of evaluation #43

Closed seonhoon1002 closed 2 years ago

seonhoon1002 commented 2 years ago

I have enjoyed working through your code over the last few weeks, and I finally got the results below.

But I have a question about the Waymo evaluation code.

How can I check the configuration of these metrics, for example whether they use BEV or 3D, and whether mAP uses @11 or @40?

In KITTI there is an option for choosing 'bev' or '3D', but I can't find one for Waymo, apart from "compute_detection_metrics_main", which is built as a binary file.

Summary: How can I check the configuration of the metrics used in the Waymo evaluation?

Abyssaledge commented 2 years ago

Thanks for using our code. compute_detection_metrics_main computes the metrics with 3D IoU thresholds of 0.7/0.5/0.5 for Vehicle/Pedestrian/Cyclist. There is no concept like R11 or R40 in Waymo. Unlike KITTI or COCO, Waymo uses score cutoffs to sample data points on the PR curve. Specifically, Waymo samples a data point every 0.01 in score (0.99, 0.98, ..., 0.01, 0), so it is more like an "R100" strategy. BTW, which config did you use to get the results above? They seem a little strange.
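To make the score-cutoff sampling concrete, here is a minimal Python sketch. It is not the code inside compute_detection_metrics_main: the function names (`pr_points_by_score_cutoff`, `average_precision`), the toy inputs, and the way the area under the curve is integrated are all assumptions made for illustration, and the matching of detections to ground truth at the IoU threshold is taken as already done.

```python
from typing import List, Tuple


def pr_points_by_score_cutoff(
    scores: List[float],
    is_true_positive: List[bool],
    num_gt: int,
    num_cutoffs: int = 100,
) -> List[Tuple[float, float]]:
    """Sample (recall, precision) at score cutoffs 0.99, 0.98, ..., 0.01, 0.

    Hypothetical helper: each detection is assumed to be already labeled
    as a true or false positive.
    """
    points = []
    for i in range(num_cutoffs - 1, -1, -1):
        cutoff = i / num_cutoffs
        kept = [tp for s, tp in zip(scores, is_true_positive) if s >= cutoff]
        if not kept:
            # No detection survives this cutoff, so precision is undefined.
            continue
        num_tp = sum(kept)
        points.append((num_tp / num_gt, num_tp / len(kept)))
    return points


def average_precision(points: List[Tuple[float, float]]) -> float:
    """Area under the sampled PR curve (assumed integration scheme).

    Precision at each recall is replaced by the best precision reached at
    that recall or higher, then summed over the recall increments.
    """
    interpolated = []
    best = 0.0
    for recall, precision in sorted(points, reverse=True):
        best = max(best, precision)
        interpolated.append((recall, best))

    ap = 0.0
    prev_recall = 0.0
    for recall, precision in reversed(interpolated):
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


if __name__ == "__main__":
    # Toy example: four detections, four ground-truth boxes.
    scores = [0.955, 0.855, 0.755, 0.655]
    is_tp = [True, False, True, True]
    pts = pr_points_by_score_cutoff(scores, is_tp, num_gt=4)
    print(len(pts), average_precision(pts))
```

In this sketch the number of sampled points is fixed by the score grid rather than by a set of recall positions, which is the contrast with R11/R40-style evaluation described above.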

seonhoon1002 commented 2 years ago

"There is no concept like R11 or R40 in Waymo. Unlike KITTI or COCO, Waymo uses score cutoffs to sample data points on the PR curve. Specifically, Waymo samples a data point every 0.01 in score (0.99, 0.98, ..., 0.01, 0), so it is more like an "R100" strategy." --> Thank you for the answer.

"BTW, which config did you use to get the results above? The results seem a little strange." --> I tested "configs>sst>sst_waymoD1_2x_3class_8heads_1f", which is a slightly modified version of your "configs>sst>sst_waymoD1_2x_3class_8heads_3f". Could you tell me what is wrong with the results above? If it is the format that looks odd, I copied them from OneNote, where I tidy up my results, so they may look a little different from the console output.

Abyssaledge commented 2 years ago

In your results, the performance on vehicle and pedestrian is a little lower (~1 AP), while cyclist is higher (~1 AP). I believe your results are basically correct, but I recommend using the center version in sst_refactor, which is simpler and better.

seonhoon1002 commented 2 years ago

I wanted to check your code with the same settings as in your paper, so I used the old version. But I agree with your advice, and next time I will use the latest version. Thanks.

Abyssaledge commented 2 years ago

In fact, the code we used in our paper is not totally consistent with the code here, so the performance is slightly different. For example, if you follow this repo, you may get 1 AP lower on Vehicle and 2 AP higher on Pedestrian. @seonhoon1002

seonhoon1002 commented 2 years ago

So if you don't mind, could you tell me in more detail about the differences between this code and the code used in your paper? @Abyssaledge

Abyssaledge commented 2 years ago

I am going to update the old config in a couple of days.