Yzichen / FlashOCC

Apache License 2.0

Unable to reproduce experimental results #16

Closed harrylin-hyl closed 8 months ago

harrylin-hyl commented 8 months ago

We ran the flashocc-r50.py and flashocc-stbase-4d-stereo-512x1408.py configs but cannot reproduce the reported results:

  1. flashocc-r50.py reported:

    ===> per class IoU of 6019 samples:

    ===> others - IoU = 6.74

    ===> barrier - IoU = 37.65

    ===> bicycle - IoU = 10.26

    ===> bus - IoU = 39.55

    ===> car - IoU = 44.36

    ===> construction_vehicle - IoU = 14.88

    ===> motorcycle - IoU = 13.4

    ===> pedestrian - IoU = 15.79

    ===> traffic_cone - IoU = 15.38

    ===> trailer - IoU = 27.44

    ===> truck - IoU = 31.73

    ===> driveable_surface - IoU = 78.82

    ===> other_flat - IoU = 37.98

    ===> sidewalk - IoU = 48.7

    ===> terrain - IoU = 52.5

    ===> manmade - IoU = 37.89

    ===> vegetation - IoU = 32.24

    ===> mIoU of 6019 samples: 32.08

    Re-implemented:

    ===> per class IoU of 6019 samples:

    ===> others - IoU = 4.76

    ===> barrier - IoU = 32.72

    ===> bicycle - IoU = 10.02

    ===> bus - IoU = 32.77

    ===> car - IoU = 41.14

    ===> construction_vehicle - IoU = 14.91

    ===> motorcycle - IoU = 13.73

    ===> pedestrian - IoU = 15.38

    ===> traffic_cone - IoU = 15.02

    ===> trailer - IoU = 26.15

    ===> truck - IoU = 28.94

    ===> driveable_surface - IoU = 76.61

    ===> other_flat - IoU = 34.25

    ===> sidewalk - IoU = 43.99

    ===> terrain - IoU = 48.82

    ===> manmade - IoU = 33.38

    ===> vegetation - IoU = 30.17

    ===> mIoU of 6019 samples: 29.57

  2. flashocc-stbase-4d-stereo-512x1408.py reported:

    ===> per class IoU of 6019 samples:

    ===> others - IoU = 13.42

    ===> barrier - IoU = 51.07

    ===> bicycle - IoU = 27.68

    ===> bus - IoU = 51.57

    ===> car - IoU = 56.22

    ===> construction_vehicle - IoU = 27.27

    ===> motorcycle - IoU = 29.98

    ===> pedestrian - IoU = 29.93

    ===> traffic_cone - IoU = 29.8

    ===> trailer - IoU = 37.77

    ===> truck - IoU = 43.52

    ===> driveable_surface - IoU = 83.81

    ===> other_flat - IoU = 46.55

    ===> sidewalk - IoU = 56.15

    ===> terrain - IoU = 59.56

    ===> manmade - IoU = 50.84

    ===> vegetation - IoU = 44.67

    ===> mIoU of 6019 samples: 43.52

Re-implemented:

===> per class IoU of 6019 samples:

===> others - IoU = 11.87

===> barrier - IoU = 48.89

===> bicycle - IoU = 28.64

===> bus - IoU = 50.12

===> car - IoU = 54.11

===> construction_vehicle - IoU = 24.95

===> motorcycle - IoU = 29.44

===> pedestrian - IoU = 28.22

===> traffic_cone - IoU = 27.04

===> trailer - IoU = 34.54

===> truck - IoU = 41.29

===> driveable_surface - IoU = 82.67

===> other_flat - IoU = 43.11

===> sidewalk - IoU = 54.65

===> terrain - IoU = 58.16

===> manmade - IoU = 49.85

===> vegetation - IoU = 43.37

===> mIoU of 6019 samples: 41.82

Is there an error in the parameter settings? We have not made any modifications.
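To see which classes account for the gap, the two logs can be compared line by line. The following is a minimal sketch, assuming the evaluation output keeps the `===> <class> - IoU = <value>` format shown above; the function names are hypothetical and are not part of the FlashOCC code base. It also assumes mIoU is the plain mean of the per-class IoUs, which matches the numbers in this post (the 17 reported flashocc-r50 values average to 32.08).

```python
import re

# Matches lines such as "===> car - IoU = 44.36".
# The header and the final "mIoU of ... samples" line do not match this pattern.
LINE_RE = re.compile(r"===>\s*(\w+)\s*-\s*IoU\s*=\s*([\d.]+)")


def parse_per_class_iou(log_text):
    """Return {class_name: IoU} parsed from evaluation log text."""
    return {name: float(value) for name, value in LINE_RE.findall(log_text)}


def mean_iou(per_class):
    """Plain average over classes (assumed definition of mIoU)."""
    return sum(per_class.values()) / len(per_class)


def per_class_gap(reference, reproduced):
    """Reference minus reproduced IoU per class, largest gap first."""
    gaps = {
        name: round(reference[name] - reproduced[name], 2)
        for name in reference
        if name in reproduced
    }
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Two classes copied from the flashocc-r50 logs above, for illustration only.
reported = parse_per_class_iou("===> bus - IoU = 39.55\n===> car - IoU = 44.36")
reproduced = parse_per_class_iou("===> bus - IoU = 32.77\n===> car - IoU = 41.14")
for name, gap in per_class_gap(reported, reproduced):
    print(f"{name}: {gap:+.2f}")
```

Sorting by gap makes it easier to tell whether the drop is spread evenly or concentrated in a few classes.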

Yzichen commented 8 months ago

Can I see your training log?

harrylin-hyl commented 8 months ago

20240220_175227.log 20240220_215704.log

Yzichen commented 8 months ago

We use 4 GPUs. If you use 8 GPUs, you need to adjust the learning rate.
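One common heuristic for this adjustment is the linear scaling rule: scale the learning rate in proportion to the total batch size. The sketch below assumes that rule and a base setup of 4 GPUs with 4 samples per GPU at lr 1e-4, as suggested by the `_4x4_1e-4` config name and the lr mentioned later in this thread. The thread does not say which rule the maintainers actually use, and `scale_lr` is a hypothetical helper.

```python
def scale_lr(base_lr, base_gpus, base_samples_per_gpu, gpus, samples_per_gpu):
    """Linear scaling rule (an assumption, not confirmed by the maintainers):
    the learning rate grows in proportion to the total batch size."""
    base_batch = base_gpus * base_samples_per_gpu
    batch = gpus * samples_per_gpu
    return base_lr * batch / base_batch


# Assumed base: 4 GPUs x 4 samples per GPU at lr 1e-4.
# Moving to 8 GPUs with the same samples per GPU doubles the total batch size.
new_lr = scale_lr(1e-4, base_gpus=4, base_samples_per_gpu=4, gpus=8, samples_per_gpu=4)
print(f"{new_lr:.1e}")
```

The rule is only a starting point; a short sweep around the scaled value may still be needed.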

harrylin-hyl commented 8 months ago

Could you please share the learning rate and batch size settings you used?

harrylin-hyl commented 8 months ago

With a batch size of 8 * 8 and lr 1e-4, I get the following results. There is still a 1.2 mIoU gap.

===> per class IoU of 6019 samples:

===> others - IoU = 5.12

===> barrier - IoU = 37.84

===> bicycle - IoU = 9.57

===> bus - IoU = 34.89

===> car - IoU = 42.85

===> construction_vehicle - IoU = 17.0

===> motorcycle - IoU = 15.75

===> pedestrian - IoU = 16.6

===> traffic_cone - IoU = 15.27

===> trailer - IoU = 27.82

===> truck - IoU = 29.56

===> driveable_surface - IoU = 76.21

===> other_flat - IoU = 35.84

===> sidewalk - IoU = 44.37

===> terrain - IoU = 48.79

===> manmade - IoU = 35.06

===> vegetation - IoU = 29.58

===> mIoU of 6019 samples: 30.71

synsin0 commented 7 months ago

I trained FlashOcc-R50 on 4 A6000 GPUs with samples_per_gpu=4, which matches the provided log. But my best result is mIoU=30.93, which is 1.1 lower than the given checkpoint. How can the 32.08 be reproduced? Thanks!

drilistbox commented 7 months ago

We rechecked the code and provide the detailed training commands at https://github.com/Yzichen/FlashOCC/blob/master/doc/cmd.md:

    bash tool/dist_train.sh projects/configs/flashocc/flashocc-r50-M0.py 4 # 31.95
    bash tool/dist_train.sh projects/configs/flashocc/flashocc-r50.py 4 # 32.08
    bash tool/dist_train.sh projects/configs/flashocc/flashocc-r50-4d-stereo.py 4 # 37.84
    bash tool/dist_train.sh projects/configs/flashocc/flashocc-stbase-4d-stereo-512x1408_4x4_1e-4.py 4 # 41.80
    bash tool/dist_train.sh projects/configs/flashocc/flashocc-stbase-4d-stereo-512x1408_4x4_2e-4.py 4 # 43.52

and lr = 1e-4.

Besides, did you evaluate your model with epoch_24_ema.pth?

synsin0 commented 7 months ago

I evaluated with epoch_24_ema and got mIoU=32.24. Thanks for your solution!

AlphaPlusTT commented 3 months ago

@drilistbox So the results provided here or in the config files (e.g. 31.95 in projects/configs/flashocc/flashocc-r50-M0.py) come from ema_pth, right?

drilistbox commented 3 months ago

All provided results come from ema_pth.
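For readers unfamiliar with why the EMA checkpoint can score differently from the regular one: an exponential moving average keeps a smoothed copy of the weights that is updated after each training step. The sketch below shows the generic update on plain Python lists. The decay value and the exact update form are assumptions for illustration and are not taken from FlashOCC's EMA hook.

```python
def ema_update(ema_weights, model_weights, decay=0.999):
    """Generic EMA step: ema <- decay * ema + (1 - decay) * current weights.
    The decay value here is hypothetical, not FlashOCC's setting."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, model_weights)]


# Toy example: a single weight that oscillates between training steps.
ema = [1.0]
for step_weight in ([1.2], [0.8], [1.2], [0.8]):
    ema = ema_update(ema, step_weight, decay=0.9)
# The EMA copy stays close to 1.0 while the raw weight swings by +/- 0.2.
print(ema)
```

Because the averaged copy smooths out step-to-step noise, evaluating epoch_24.pth and epoch_24_ema.pth from the same run can give different mIoU, as seen above (30.93 vs. 32.24).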