chaytonmin / Occupancy-MAE

Official implementation of our TIV'23 paper: Occupancy-MAE: Self-supervised Pre-training Large-scale LiDAR Point Clouds with Masked Occupancy Autoencoders
Apache License 2.0

The reproduced results on the Waymo dataset #20

Open · WWW2323 opened 1 year ago

WWW2323 commented 1 year ago

Hi, dear author, there is a gap of about 0.5~1.0 between the results I reproduced and the results the paper reports on the Waymo dataset. Is this just variance, or is there something wrong with my reproduction? Do you pre-train Voxel-MAE with 100% of the Waymo training data? I pre-trained Voxel-MAE with this config for 30 epochs and used the 30th-epoch checkpoint to initialize CenterPoint. After fine-tuning for 30 epochs, I get the following results:

[image: reproduced results]

And the results reported in the paper are:

[image: results from the paper]
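For reference, the pre-train-then-fine-tune workflow described above looks roughly like this with OpenPCDet-style tooling. This is a minimal sketch: the config file names (voxel_mae_waymo.yaml, centerpoint.yaml) and the output checkpoint path are illustrative assumptions, not the repo's exact names.

```bash
# Stage 1: self-supervised pre-training of Voxel-MAE on Waymo
# (config path is an assumed placeholder)
cd tools
python train.py \
    --cfg_file cfgs/waymo_models/voxel_mae_waymo.yaml \
    --epochs 30

# Stage 2: fine-tune CenterPoint, initializing it from the epoch-30
# pre-trained checkpoint via OpenPCDet's --pretrained_model flag
python train.py \
    --cfg_file cfgs/waymo_models/centerpoint.yaml \
    --epochs 30 \
    --pretrained_model ../output/waymo_models/voxel_mae_waymo/default/ckpt/checkpoint_epoch_30.pth
```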
chaytonmin commented 1 year ago

We pre-train Voxel-MAE with ~20% of the training samples, the same subset that OpenPCDet uses.
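For context, the ~20% figure comes from OpenPCDet's Waymo dataset config, which subsamples the training frames with a sampling interval of 5 (every 5th frame). A quick way to check, assuming a standard OpenPCDet checkout layout:

```bash
# Show the frame-sampling setting in the Waymo dataset config;
# 'train': 5 keeps every 5th frame (~20% of training samples)
grep -n -A 2 "SAMPLED_INTERVAL" tools/cfgs/dataset_configs/waymo_dataset.yaml
```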

WWW2323 commented 1 year ago

@chaytonmin Hi, dear author, thanks for your quick reply. Is my result normal? And how many epochs did you pre-train Voxel-MAE for: 3, 20, or 30? The pre-training epoch count in the code is 30, while the paper mentions 3 epochs. Is that a mistake?

[image: config screenshot]

chaytonmin commented 1 year ago

The result is normal; environmental differences can cause minor variations in results. Voxel-MAE converges within 3 epochs on the KITTI dataset, so we pre-trained for only 3 epochs. Given limited time, we didn't run many experiments; more epochs might be more suitable.
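To reproduce the paper's 3-epoch setting without editing the config, the epoch count can presumably be overridden on the command line. A sketch, assuming the same OpenPCDet --epochs flag and the illustrative paths used above:

```bash
# Pre-train for 3 epochs (paper setting) instead of the config's 30,
# then fine-tune CenterPoint from the epoch-3 checkpoint
cd tools
python train.py \
    --cfg_file cfgs/waymo_models/voxel_mae_waymo.yaml \
    --epochs 3

python train.py \
    --cfg_file cfgs/waymo_models/centerpoint.yaml \
    --pretrained_model ../output/waymo_models/voxel_mae_waymo/default/ckpt/checkpoint_epoch_3.pth
```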