Difference in pre-processing the front-view image

wudongming97 / TopoMLP

[ICLR2024] TopoMLP: A Simple yet Strong Pipeline for Driving Topology Reasoning

Apache License 2.0


Difference in pre-processing the front-view image #16

Closed Wolfybox closed 3 months ago

Wolfybox commented 4 months ago

Hi, thanks for open-sourcing the code!

I notice that TopoMLP is using a different pre-processing strategy compared to TopoNet.

TopoNet crops the front-view image to 1550 (h) x 1550 (w) and then scales all images by half, resulting in 775 x 775 for the front view and 775 x 1024 for the others. TopoMLP directly resizes the front view from 2048 x 1550 to 1550 x 2048 to match the others.
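To make the size arithmetic concrete, here is a minimal sketch that only tracks image shapes, using the numbers quoted above. The helper names (`crop_then_scale`, `kept_fraction`) are hypothetical and are not taken from either codebase; sizes are assumed to be written as (height, width).

```python
# All sizes are (height, width), using the numbers quoted in this issue.
FRONT_HW = (2048, 1550)  # front-view camera, portrait
SIDE_HW = (1550, 2048)   # other cameras, landscape


def crop_then_scale(size_hw, crop_hw, scale):
    """Return the output size after cropping to crop_hw and scaling both sides by `scale`."""
    if crop_hw[0] > size_hw[0] or crop_hw[1] > size_hw[1]:
        raise ValueError("crop window is larger than the image")
    return (round(crop_hw[0] * scale), round(crop_hw[1] * scale))


def kept_fraction(size_hw, crop_hw):
    """Fraction of the original pixels that survive the crop."""
    return (crop_hw[0] * crop_hw[1]) / (size_hw[0] * size_hw[1])


# Crop-and-scale strategy, as described for TopoNet.
toponet_front = crop_then_scale(FRONT_HW, (1550, 1550), 0.5)
toponet_side = crop_then_scale(SIDE_HW, SIDE_HW, 0.5)

# Direct-resize strategy, as described for TopoMLP:
# the front view simply takes the size of the other views.
topomlp_front = SIDE_HW

print("TopoNet front / side:", toponet_front, toponet_side)
print("TopoMLP front:", topomlp_front)
print(f"Front-view pixels kept by the crop: {kept_fraction(FRONT_HW, (1550, 1550)):.1%}")
```

The crop keeps the aspect ratio but discards part of the front view, while the direct resize keeps every pixel but changes the aspect ratio.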

Is there any theoretical or experimental support for this change in pre-processing?

Intuitively, the latter approach distorts the front-view image, so I would expect it to cause a larger performance loss.

For reference, see https://github.com/OpenDriveLab/TopoNet/issues/1 or the description in A.2 Training Details/Input of the TopoNet paper.

wudongming97 commented 3 months ago

Hi @Wolfybox,

Yes. Our code has a different pre-processing pipeline compared to TopoNet. I would like to explain why we use it.

Our code is derived from the OpenLane-V2 baseline https://github.com/OpenDriveLab/OpenLane-V2/blob/centerline/plugin/mmdet3d/configs/baseline.py#L112, which directly resizes the front image.
We noticed that this choice may distort object sizes and could degrade performance, so we designed a pre-processing pipeline that crops and resizes the front image as you described. Surprisingly, its score was quite low, so we abandoned it.
As TopoNet is concurrent work, we did not try its pre-processing pipeline. You are welcome to try it yourself.
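As a rough illustration of the object-size concern, the sketch below compares how a square region of the front image would be rescaled under each strategy. The helper names and the 100 x 100 box are hypothetical; the image sizes are the ones quoted in this thread, written as (height, width).

```python
def resize_scales(src_hw, dst_hw):
    """Per-axis scale factors (scale_h, scale_w) for a resize from src_hw to dst_hw."""
    return (dst_hw[0] / src_hw[0], dst_hw[1] / src_hw[1])


def resize_box(box_hw, scales):
    """Size of a box after applying the per-axis scale factors."""
    return (box_hw[0] * scales[0], box_hw[1] * scales[1])


# Direct resize of the front view to the size of the other views.
direct = resize_scales((2048, 1550), (1550, 2048))
# Crop to 1550 x 1550 first, then halve both sides (cropping itself does not rescale).
cropped = resize_scales((1550, 1550), (775, 775))

box = (100, 100)  # hypothetical square object in the original front image
box_direct = resize_box(box, direct)
box_cropped = resize_box(box, cropped)

# Ratio of horizontal to vertical scaling; 1.0 means the aspect ratio is preserved.
stretch = direct[1] / direct[0]

print("direct resize:", box_direct)
print("crop then scale:", box_cropped)
print(f"horizontal stretch under direct resize: {stretch:.2f}x")
```

Any later uniform scaling would change the absolute box sizes but not the stretch factor, which depends only on the two aspect ratios.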

Wolfybox commented 3 months ago

Thanks for your explanation, though it is kinda counterintuitive that the crop-and-resize strategy would lead to a lower score. BTW, I found that different models use more than three distinct ways of pre-processing the front-view image.