Haochen-Wang409 / U2PL

[CVPR'22 & IJCV'24] Semi-Supervised Semantic Segmentation Using Unreliable Pseudo-Labels & Using Unreliable Pseudo-Labels for Label-Efficient Semantic Segmentation
Apache License 2.0

No such file: data/cityscapes/gtFine/train/bremen/bremen_000149_000019_gtFine_labelTrainIds.png #13

Closed zhjw0927 closed 2 years ago

zhjw0927 commented 2 years ago

I arranged the Cityscapes data directory structure as described under "Prepare data" in the README, but an error is raised that the file cannot be found. I found that the file path gets changed in BaseDataset: it appends "gtFine_labelTrainIds.png" to the path, and this causes the error. What should I do to fix this bug? Thank you!

Haochen-Wang409 commented 2 years ago

Hi, how about data/cityscapes/gtFine/train/bremen/bremen_000149_000019_gtFine_color.png? Does this file exist?

zhjw0927 commented 2 years ago

Yes, it does exist

Haochen-Wang409 commented 2 years ago

Please refer to this link or this link; converting your RGB-mode labels to grayscale (GRAY) mode may help.
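For reference, the labelId-to-trainId conversion (what `cityscapesscripts`' `createTrainIdLabelImgs.py` produces) can be sketched as below. This is a minimal illustration, not the repo's code; the mapping follows the official Cityscapes label definitions, and saving the result next to the other gtFine files is left to the caller:

```python
import numpy as np

# Official Cityscapes labelId -> trainId mapping: the 19 evaluation
# classes keep ids 0..18, everything else becomes 255 (ignore_label).
ID_TO_TRAINID = {
    7: 0, 8: 1, 11: 2, 12: 3, 13: 4, 17: 5, 19: 6, 20: 7, 21: 8, 22: 9,
    23: 10, 24: 11, 25: 12, 26: 13, 27: 14, 28: 15, 31: 16, 32: 17, 33: 18,
}

def labelid_to_trainid(label: np.ndarray) -> np.ndarray:
    """Convert a *_gtFine_labelIds.png array to trainId values."""
    out = np.full_like(label, 255)  # unmapped ids -> ignore label
    for label_id, train_id in ID_TO_TRAINID.items():
        out[label == label_id] = train_id
    return out
```

Writing `out` back as a single-channel `*_gtFine_labelTrainIds.png` (e.g. via PIL) gives BaseDataset the file it is looking for.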

zhjw0927 commented 2 years ago

Ok, thanks. I will try it first.

HeimingX commented 2 years ago

> Please refer to this link or this link; converting your RGB-mode labels to grayscale (GRAY) mode may help.

Hi, thanks for the guidance. I used the code to build the *_labelTrainIds.png files, but supervised training with the released repo produces really poor performance, e.g., with 744 labels: reported 74.43, reproduced 63.81.

About converting RGB-mode labels to grayscale mode, I do not know where I should make the change. Do you mean changing the image loader?

Looking forward to your help, many thanks~

Haochen-Wang409 commented 2 years ago

If _labelTrainIds.png files are built, there is no need to change img_loader.

As for the poor performance of the supervised baseline, did you use sh eval.sh to evaluate it? All Cityscapes results should be evaluated in a sliding-window manner using eval.sh.
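For readers hitting the same gap: sliding-window evaluation crops the image into overlapping windows, runs the model on each, and averages the per-pixel class scores before taking the argmax. A minimal sketch of the idea (an illustration, not the repo's eval.sh, which additionally handles rescaling; `predict` stands in for the network):

```python
import numpy as np

def slide_inference(image, predict, crop=769, stride=513):
    """Average per-pixel class scores over overlapping crops.

    image:   (H, W, C) array
    predict: callable mapping a (h, w, C) crop -> (h, w, num_classes) scores
    """
    H, W = image.shape[:2]
    num_classes = predict(image[:crop, :crop]).shape[-1]
    scores = np.zeros((H, W, num_classes))
    counts = np.zeros((H, W, 1))  # how many windows covered each pixel
    for y in range(0, max(H - crop, 0) + 1, stride):
        for x in range(0, max(W - crop, 0) + 1, stride):
            scores[y:y + crop, x:x + crop] += predict(image[y:y + crop, x:x + crop])
            counts[y:y + crop, x:x + crop] += 1
    return scores / counts  # averaged scores; argmax(-1) gives the prediction
```

Evaluating with center crops only (instead of sliding windows) on Cityscapes' 1024x2048 images discards most of the image, which is why eval.sh matters here.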

HeimingX commented 2 years ago

> If _labelTrainIds.png files are built, there is no need to change img_loader.
>
> As for the poor performance of the supervised baseline, did you use sh eval.sh to evaluate it? All Cityscapes results should be evaluated in a sliding-window manner using eval.sh.

Hi, thanks for the response. I tried running eval.sh and obtained 67.66 mIoU in the 744-label setting, which still lags well behind the reported number (74.43).

Since I never changed the main parts of the supervised training and evaluation code, the poor performance is probably attributable to how the *_labelTrainIds.png files were built. Currently, I use this code to create them. Could you please help me double-check whether this is a correct way to prepare the dataset? Or do you have any suggestions about such a bad result on the supervised baseline? Thanks a lot.

Haochen-Wang409 commented 2 years ago

How about downloading labels here?

HeimingX commented 2 years ago

> How about downloading labels here?

Hi, thanks a lot. I will have a try and give you a response ASAP.

HeimingX commented 2 years ago

Hi, I have rerun the supervised training on Cityscapes using the released GT labels, and the results are as follows:

| label num | 1/16 (186) | 1/8 (372) | 1/4 (744) |
| --- | --- | --- | --- |
| reported | 65.74 | 72.53 | 74.43 |
| reproduced | 64.04 | 68.12 | 73.97 |

It seems that the released GT labels solve the problem mentioned above, but there is still a performance margin relative to the reported numbers, especially in the 1/8 setting (the results are obtained by running eval.sh). I wonder whether this is due to optimization randomness or something else? Looking forward to your reply, thanks a lot.

Haochen-Wang409 commented 2 years ago

Could you provide the config.yaml used when training under the 1/8 partition protocol?

HeimingX commented 2 years ago

Hi, sorry for the late response. The config used for the 1/8 setting is listed below (data_root is changed to match my local path):

dataset: # Required.
  type: cityscapes
  train:
    data_root: semantic_seg/cityscapes
    data_list: ../../../../data/splits/cityscapes/372/labeled.txt
    flip: True
    GaussianBlur: False
    rand_resize: [0.5, 2.0]
    #rand_rotation: [-10.0, 10.0]
    crop:
      type: rand
      size: [769, 769] # crop image with HxW size
  val:
    data_root: semantic_seg/cityscapes
    data_list: ../../../../data/splits/cityscapes/val.txt
    crop:
      type: center
      size: [769, 769] # crop image with HxW size
  batch_size: 4
  n_sup: 372
  noise_std: 0.1
  workers: 2
  mean: [123.675, 116.28, 103.53]
  std: [58.395, 57.12, 57.375]
  ignore_label: 255

trainer: # Required.
  epochs: 200
  start_epochs: 0
  eval_on: True
  optimizer:
    type: SGD
    kwargs:
      lr: 0.01  # 4GPUs
      momentum: 0.9
      weight_decay: 0.0005
  lr_scheduler:
    mode: poly
    kwargs:
      power: 0.9

saver:
  main_dir: output/semi_seg/U2PL/city/372/sup_bs4_ep200
  auto_resume: False #True
  snapshot_dir: checkpoints
  pretrain: ''

criterion:
  type: ohem
  kwargs:
    thresh: 0.7
    min_kept: 100000

net: # Required.
  num_classes: 19
  sync_bn: True  # !!! debug !!!
  ema_decay: 0.99
  aux_loss:
    aux_plane: 1024
    loss_weight: 0.4
  encoder:
    type: u2pl.models.resnet.resnet101
    kwargs:
      multi_grid: True
      zero_init_residual: True
      fpn: True
      replace_stride_with_dilation: [False, True, True]  #layer0...1 is fixed, layer2...4
  decoder:
    type: u2pl.models.decoder.dec_deeplabv3_plus
    kwargs:
      rep_head: False
      inner_planes: 256
      dilations: [12, 24, 36]
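For reference, the lr_scheduler block in the config above (mode: poly, power: 0.9) corresponds to the standard polynomial decay used by DeepLab-style trainers, lr = base_lr * (1 - iter / max_iter) ** power. A minimal sketch (the function name is my own):

```python
def poly_lr(base_lr: float, cur_iter: int, max_iter: int, power: float = 0.9) -> float:
    """Polynomial learning-rate decay: base_lr * (1 - t)^power, t in [0, 1]."""
    return base_lr * (1.0 - cur_iter / max_iter) ** power
```

With base_lr 0.01 and power 0.9 the rate starts at 0.01 and decays smoothly to 0 at the final iteration.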

HeimingX commented 2 years ago

I further ran U2PL on Cityscapes, and the results are as follows (all results were obtained with eval.sh):

| label num | 1/16 (186) | 1/8 (372) | 1/4 (744) | 1/2 (1488) |
| --- | --- | --- | --- | --- |
| reported | 70.30 | 74.37 | 76.47 | 79.05 |
| reproduced | 70.52 | 72.35 | 75.79 | 78.61 |

It seems that the result in the 1/8 setting is also worse than the reported one; the corresponding config is as follows:

dataset: # Required.
  type: cityscapes_semi
  train:
    # data_root: ../../../../data/cityscapes
    data_root: semantic_seg/cityscapes
    data_list: ../../../../data/splits/cityscapes/372/labeled.txt
    flip: True
    GaussianBlur: False
    rand_resize: [0.5, 2.0]
    #rand_rotation: [-10.0, 10.0]
    crop:
      type: rand
      size: [769, 769] # crop image with HxW size
  val:
    # data_root: ../../../../data/cityscapes
    data_root: semantic_seg/cityscapes
    data_list: ../../../../data/splits/cityscapes/val.txt
    crop:
      type: center
      size: [769, 769] # crop image with HxW size
  batch_size: 2  # 1 for debug
  n_sup: 372
  noise_std: 0.1
  workers: 2
  mean: [123.675, 116.28, 103.53]
  std: [58.395, 57.12, 57.375]
  ignore_label: 255

trainer: # Required.
  epochs: 200
  eval_on: True
  sup_only_epoch: 0
  optimizer:
    type: SGD
    kwargs:
      lr: 0.01  # 8GPUs
      momentum: 0.9
      weight_decay: 0.0005
  lr_scheduler:
    mode: poly
    kwargs:
      power: 0.9
  unsupervised:
    TTA: False
    drop_percent: 80
    apply_aug: cutmix
  contrastive:
    negative_high_entropy: True
    low_rank: 3
    high_rank: 20
    current_class_threshold: 0.3
    current_class_negative_threshold: 1
    unsupervised_entropy_ignore: 80
    low_entropy_threshold: 20
    num_negatives: 50
    num_queries: 256
    temperature: 0.5

saver:
  main_dir: output/semi_seg/U2PL/city/372/u2pl_bs2_ep200
  auto_resume: True
  snapshot_dir: checkpoints
  pretrain: ''

criterion:
  type: ohem
  kwargs:
    thresh: 0.7
    min_kept: 100000

net: # Required.
  num_classes: 19
  sync_bn: True
  ema_decay: 0.99
  aux_loss:
    aux_plane: 1024
    loss_weight: 0.4
  encoder:
    type: u2pl.models.resnet.resnet101
    kwargs:
      multi_grid: True
      zero_init_residual: True
      fpn: True
      replace_stride_with_dilation: [False, True, True]  #layer0...1 is fixed, layer2...4
  decoder:
    type: u2pl.models.decoder.dec_deeplabv3_plus
    kwargs:
      inner_planes: 256
      dilations: [12, 24, 36]
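For reference, apply_aug: cutmix in the unsupervised block above pastes a random rectangle from one image (and its pseudo-label) into another. A minimal sketch of the mechanic, with helper names of my own choosing:

```python
import random

import numpy as np

def rand_bbox(h, w, lam, rng):
    """Sample a cut box whose area fraction is roughly (1 - lam)."""
    cut = (1.0 - lam) ** 0.5
    ch, cw = int(h * cut), int(w * cut)
    cy, cx = rng.randrange(h), rng.randrange(w)  # box center
    return (max(cy - ch // 2, 0), min(cy + ch // 2, h),
            max(cx - cw // 2, 0), min(cx + cw // 2, w))

def cutmix(img_a, img_b, lbl_a, lbl_b, lam=0.5, rng=random):
    """Paste a random rectangle of img_b (and its label) into img_a."""
    y1, y2, x1, x2 = rand_bbox(img_a.shape[0], img_a.shape[1], lam, rng)
    img, lbl = img_a.copy(), lbl_a.copy()
    img[y1:y2, x1:x2] = img_b[y1:y2, x1:x2]
    lbl[y1:y2, x1:x2] = lbl_b[y1:y2, x1:x2]
    return img, lbl
```

In the semi-supervised setting the pasted labels are the teacher's pseudo-labels, so the same box must be applied to both the image and its pseudo-label map, as the sketch does.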

Haochen-Wang409 commented 2 years ago

Have you reproduced the results for the 1/8 split in the fully-supervised manner? I notice that your latest question was about the semi-supervised setting. Maybe you could try different random seeds, since the config.yaml seems to be correct.
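If randomness is the suspect, seeding all RNGs at startup helps narrow it down. A hedged sketch (seed_everything is my own helper, not part of this repo; note that even with cuDNN determinism, results are not guaranteed to match across different GPU models):

```python
import os
import random

import numpy as np

def seed_everything(seed: int = 0) -> None:
    """Seed Python, NumPy, and (if available) PyTorch for repeatable runs."""
    random.seed(seed)
    np.random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)
        torch.backends.cudnn.deterministic = True  # slower but repeatable
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # torch not installed; stdlib/NumPy seeding still applies
```

Running the same split with a few different seeds gives a sense of the run-to-run variance before blaming the data preparation.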

By the way, what GPU, and which versions of CUDA and PyTorch, did you use for training?

HeimingX commented 2 years ago

Hi, the result for the 1/8 split under the fully-supervised setting was presented in an earlier post.

I posted the semi-supervised results because I just found that the poor performance also happens in the 1/8 setting with U2PL.

The results were obtained under the following environment:

GPU: Tesla V100 (32G), CUDA: 10.2, PyTorch: 1.7.1

Haochen-Wang409 commented 2 years ago

Hi, we fixed a bug in the lr for the SupOnly manner. Please fetch the latest code and try again~

Haochen-Wang409 commented 2 years ago

Hi, @HeimingX We have re-trained our SupOnly baseline for the 1/8 partition protocol on Cityscapes after fixing the bug. The performance reaches 72.73 after eval.sh.

HeimingX commented 2 years ago

Hi, thanks for the prompt response.

I also re-ran SupOnly on the 1/8 partition on Cityscapes (with the lr bug fixed) and obtained 69.89 after eval.sh. It is so strange...

Haochen-Wang409 commented 2 years ago

How about trying PyTorch 1.8.1 with CUDA 11.2? I have no idea why you got such poor performance...

Haochen-Wang409 commented 2 years ago

Hi, @HeimingX Here is the log of training the SupOnly baseline under the 1/8 partition. Hope it helps you~

HeimingX commented 2 years ago

Hi Haochen, thanks for the nice help and I will have a check.

Haochen-Wang409 commented 2 years ago

Hi, @HeimingX Maybe you could try using the GTs here, since the dataset might be the only remaining difference.

HeimingX commented 2 years ago

Thanks, I will have a try.

Cheers

Haochen-Wang409 commented 2 years ago

Hi @HeimingX We found that we uploaded the wrong split for 1/8 Cityscapes. Please fetch the latest code and try again.

HeimingX commented 2 years ago

Hi, @Haochen-Wang409 Thanks for the update, I will have a try.