HRNet / HRNet-Semantic-Segmentation

The OCR approach is rephrased as Segmentation Transformer: https://arxiv.org/abs/1909.11065. This is an official implementation of semantic segmentation for HRNet: https://arxiv.org/abs/1908.07919

OCR Train and Val on Cityscapes #113

Open dreamPoet opened 4 years ago

dreamPoet commented 4 years ago

Hi,

I have trained OCR on Cityscapes many times with different hyperparameters, but I still cannot reach the reported mIoU of 81.60% (the highest I got was 81.54%). I have changed the batch size, number of GPUs, epochs, learning rate, etc. Are there any other tips for reaching the released number?

Thx.

PkuRainBow commented 4 years ago

Sorry for the inconvenience.

First, your reproduced performance of 81.54% is already comparable with our previous best result (based on this code base) on the Cityscapes validation set. The performance variance on Cityscapes is beyond the scope of this work, and we would encourage you to try our HRNet + OCR on other benchmarks.

Besides, the HRNet + OCR performance reported in our OCR paper is also lower than 81.6% on the Cityscapes validation set, as we conducted all of those experiments on our other code base, openseg.pytorch. The performance reported for this code base is a little higher mainly due to more training iterations and a larger batch size.

We hope these comments help!

sde123 commented 4 years ago

@PkuRainBow Thank you very much for your work, but I have a question: NUM_CLASSES is set to 19 in experiment/seg_hrnet_w48_train_512x1024_sgd_lr1e-2_wd5e-4_bs_12_epoch484.yaml, yet the total number of classes in the Cityscapes dataset is 34. Could you please tell me why it is 19 in your code?

PkuRainBow commented 4 years ago

Please check Section 2.2 of the original Cityscapes paper: it is the official (default) scheme to evaluate only 19 of the 34 categories for the semantic segmentation task. We paste the original description from Section 2.2 below:

> We defined 30 visual classes for annotation, which are grouped into eight categories: flat, construction, nature, vehicle, sky, object, human, and void. Classes were selected based on their frequency, relevance from an application standpoint, practical considerations regarding the annotation effort, as well as to facilitate compatibility with existing datasets, e.g. [7, 19, 75]. Classes that are too rare are excluded from our benchmark, leaving 19 classes for evaluation, see Fig. 1 for details. We plan to release our annotation tool upon publication of the dataset.
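Concretely, the 15 raw label IDs outside these 19 classes are mapped to an ignore label and skipped during training and evaluation. A minimal sketch of the standard 34-to-19 mapping (the repo's dataset code implements this; the helper below is illustrative, not the repo's actual function):

```python
import numpy as np

IGNORE_LABEL = 255  # raw IDs outside the 19 evaluated classes map here

# standard Cityscapes raw label ID -> train ID for the 19 evaluated classes
LABEL_TO_TRAINID = {
    7: 0,    # road
    8: 1,    # sidewalk
    11: 2,   # building
    12: 3,   # wall
    13: 4,   # fence
    17: 5,   # pole
    19: 6,   # traffic light
    20: 7,   # traffic sign
    21: 8,   # vegetation
    22: 9,   # terrain
    23: 10,  # sky
    24: 11,  # person
    25: 12,  # rider
    26: 13,  # car
    27: 14,  # truck
    28: 15,  # bus
    31: 16,  # train
    32: 17,  # motorcycle
    33: 18,  # bicycle
}

def convert_label(label: np.ndarray) -> np.ndarray:
    """Map a raw Cityscapes label map (34 IDs) to the 19 train IDs."""
    out = np.full_like(label, IGNORE_LABEL)
    for raw_id, train_id in LABEL_TO_TRAINID.items():
        out[label == raw_id] = train_id
    return out
```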

dreamPoet commented 4 years ago

Thanks. BTW, I find that no log file is created during training. Is there any code I can uncomment to produce log files with tensorboardX?
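Something like the following minimal tensorboardX loop is what I have in mind (the log directory and tag name are just examples):

```python
from tensorboardX import SummaryWriter

writer = SummaryWriter(log_dir='log/cityscapes_ocr')  # illustrative path

for step in range(100):        # stand-in for the real training loop
    loss = 1.0 / (step + 1)    # stand-in for the computed loss value
    writer.add_scalar('train/loss', loss, step)

writer.close()  # flush events so TensorBoard can read them
```

The events could then be viewed with `tensorboard --logdir log/`.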

Margrate commented 2 years ago

> I have trained OCR on Cityscapes many times with different hyperparameters, but I still cannot reach the reported mIoU of 81.60% (the highest I got was 81.54%). I have changed the batch size, number of GPUs, epochs, learning rate, etc.

Could you share the training config that reached 81.54% mIoU?

PkuRainBow commented 2 years ago

@Margrate Please try "HRNet + OCR + RMI" (https://github.com/openseg-group/openseg.pytorch/blob/pytorch-1.7/MODEL_ZOO.md#cityscapes); you should be able to reach around 82.6% mIoU.
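Schematically, RMI (Region Mutual Information) adds a region-wise term on top of the usual pixel-wise cross-entropy. A rough sketch of the combination, assuming an `rmi_term` module is supplied (it is a stand-in for the actual implementation in openseg.pytorch, whose interface and weighting may differ):

```python
import torch
import torch.nn as nn

class CrossEntropyPlusRMI(nn.Module):
    """Pixel-wise cross-entropy plus a region-wise RMI term.

    `rmi_term` is a stand-in for the RMI loss implemented in
    openseg.pytorch; its real interface and weighting may differ.
    """
    def __init__(self, rmi_term: nn.Module, rmi_weight: float = 0.5):
        super().__init__()
        self.ce = nn.CrossEntropyLoss(ignore_index=255)
        self.rmi_term = rmi_term
        self.rmi_weight = rmi_weight

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # logits: (N, 19, H, W); target: (N, H, W) with 255 = ignore
        return self.ce(logits, target) + self.rmi_weight * self.rmi_term(logits, target)
```

In practice you would plug in the RMI loss from openseg.pytorch rather than writing the term yourself.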