NVIDIA / semantic-segmentation

Nvidia Semantic Segmentation monorepo
BSD 3-Clause "New" or "Revised" License

Training SOTA details on Mapillary #165

Open huangqiuyu opened 2 years ago

huangqiuyu commented 2 years ago

Hello, thanks for your great work! I tried to reproduce the reported mIoU of 61.05 on the Mapillary val set, but I got 60.08, so there is still a ~1.0 gap with your model's results. My train_mapillary.yml is:

CMD: "python -m torch.distributed.launch --nproc_per_node=8 train.py"

HPARAMS: [
  {
   dataset: mapillary,
   cv: 0,
   result_dir: LOGDIR,

   pre_size: 2177,
   crop_size: "1024,1856",
   syncbn: true,
   apex: true,
   fp16: true,
   gblur: true,

   bs_trn: 1,

   lr_schedule: poly,
   poly_exp: 1.0,
   optimizer: sgd,
   lr: 2e-2,
   max_epoch: 200,
   rmi_loss: true,

   arch: ocrnet.HRNet_Mscale,
   n_scales: '0.25,0.5,1.0,2.0',
  }
]
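For context, the yml above is a runx-style config: the `HPARAMS` entries get expanded into command-line flags appended to `CMD`. A minimal sketch of that expansion is below; the helper `expand` is hypothetical (not code from this repo or from runx), and only a subset of the hyperparameters is mirrored:

```python
# Hypothetical sketch of how a runx-style config (CMD + HPARAMS) expands
# into the flat command line that train.py receives. Names mirror the yml
# above, but this helper is illustrative, not code from the repo.
CMD = "python -m torch.distributed.launch --nproc_per_node=8 train.py"
HPARAMS = {
    "dataset": "mapillary",
    "cv": 0,
    "crop_size": "1024,1856",
    "syncbn": True,
    "fp16": True,
    "bs_trn": 1,
    "lr": "2e-2",
    "arch": "ocrnet.HRNet_Mscale",
    "n_scales": "0.25,0.5,1.0,2.0",
}

def expand(cmd, hparams):
    flags = []
    for key, val in hparams.items():
        if isinstance(val, bool):  # true booleans become bare flags
            if val:
                flags.append(f"--{key}")
        else:
            flags.append(f"--{key} {val}")
    return cmd + " " + " ".join(flags)

print(expand(CMD, HPARAMS))
```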

And I trained this model across 4 nodes with 8 GPUs per node. All these settings match what is mentioned in your paper.

Could you provide some other training details? Thanks!

ajtao commented 2 years ago

Hello @huangqiuyu ,

It sounds like you're very close. I'm sharing the per-class evaluation for our best model below. Note that we achieve this only with multi-scale eval at scales of 0.25, 0.5, 1.0, and 2.0. It would probably make sense to compare your per-class IoUs against ours to see where the difference lies. There may be a few unstable classes that differ, which could account for the overall gap.
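As an aside, the multi-scale eval mentioned above can be illustrated with plain logit averaging. This is only a sketch: `multi_scale_logits` and `resize_nearest` are hypothetical, nearest-neighbor resizing stands in for bilinear interpolation, and the repo's actual `HRNet_Mscale` fuses scales with learned attention rather than uniform averaging:

```python
import numpy as np

def resize_nearest(img, h, w):
    """Nearest-neighbor resize of a (C, H, W) array (stand-in for bilinear)."""
    c, H, W = img.shape
    rows = np.arange(h) * H // h
    cols = np.arange(w) * W // w
    return img[:, rows[:, None], cols[None, :]]

def multi_scale_logits(model, image, scales=(0.25, 0.5, 1.0, 2.0)):
    """Run `model` at several scales and average logits at full resolution.

    Hypothetical sketch: the repo fuses scales with learned attention,
    not a uniform average as done here.
    """
    c, H, W = image.shape
    acc = np.zeros_like(model(image), dtype=np.float64)
    for s in scales:
        scaled = resize_nearest(image, max(1, int(H * s)), max(1, int(W * s)))
        logits = model(scaled)               # (num_classes, h, w)
        acc += resize_nearest(logits, H, W)  # upsample back to full size
    return acc / len(scales)
```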

(screenshot: fast-rattlesnake-eval, per-class results)

huangqiuyu commented 2 years ago

These are my results below. They look quite different from the ones you provided; there are gaps of 6.0+ in some classes. I can't find the cause of these differences. Could you share the yml file you used to train your best model?
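To localize where gaps like these come from, one option is to compute per-class IoU from each run's confusion matrix and rank the differences. The sketch below uses the standard IoU definition (TP / (TP + FP + FN)); the helper names are hypothetical, not code from this repo:

```python
import numpy as np

def per_class_iou(conf):
    """Per-class IoU from a confusion matrix (rows = GT, cols = prediction)."""
    conf = np.asarray(conf, dtype=np.float64)
    tp = np.diag(conf)
    # union = predicted-positive + actual-positive - true-positive
    denom = conf.sum(axis=0) + conf.sum(axis=1) - tp
    return tp / np.maximum(denom, 1)  # guard against absent classes

def largest_gaps(iou_a, iou_b, k=3):
    """Return the k class indices with the largest IoU gap, largest first."""
    gaps = np.abs(np.asarray(iou_a) - np.asarray(iou_b))
    order = np.argsort(-gaps)[:k]
    return [(int(c), float(gaps[c])) for c in order]

# Toy 3-class confusion matrix for illustration.
conf = [[50, 2, 3],
        [4, 40, 6],
        [1, 0, 30]]
print(per_class_iou(conf))
```

Ranking the gaps between two runs then points straight at the unstable classes worth investigating first.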

(screenshot: per-class results)

ajtao commented 2 years ago

(image attachment)