hustvl / WeakTr

WeakTr: Exploring Plain Vision Transformer for Weakly-supervised Semantic Segmentation

Inference with the checkpoint from the ReadMe does not reach the performance in the table #17

Closed HAL-42 closed 1 year ago

HAL-42 commented 1 year ago

Hi authors! I downloaded the checkpoint from the ReadMe with 78.4 mIoU on VOC val and, following the steps in the evaluation docs, ran Multi-scale Evaluation and then CRF post-processing. The commands and program outputs (screenshots omitted) gave: Multi-scale Evaluation: 73.0%; CRF post-processing: 74.0%. These results do not match the ReadMe.

Note: since the ReadMe does not provide a variant.yml, I generated one following the train docs and changed the number of classes to match the parameter shapes in the checkpoint.

algorithm_kwargs:
  batch_size: 4
  eval_freq: 1
  num_epochs: 100
  start_epoch: 0
amp: false
clip_kwargs:
  patch_size: 120
  start_value: 1.2
dataset_kwargs:
  ann_dir: WeakTr_CAMlb_wCRF
  batch_size: 4
  crop_size: 480
  dataset: pascal_voc
  image_size: 520
  max_ratio: null
  normalization: deit
  num_workers: 2
  split: train
gradientclipping: true
inference_kwargs:
  im_size: 520
  window_size: 480
  window_stride: 320
layer_decay: 1.0
log_dir: exp_wt/WeakTr_CAMlb_wCRF
net_kwargs:
  backbone: deit_small_patch16_224
  d_model: 384
  decoder:
    drop_path_rate: 0.0
    dropout: 0.1
    n_cls: 60  # 21 or 60; does not affect the result.
    n_layers: 2
    name: mask_transformer
  distilled: false
  drop_path_rate: 0.1
  dropout: 0.0
  image_size: !!python/tuple
  - 480
  - 480
  n_cls: 60  # changed from 21 to 60, otherwise the model cannot be loaded.
  n_heads: 6
  n_layers: 12
  normalization: deit
  patch_size: 16
optimizer_kwargs:
  clip_grad: null
  enc_lr: 0.1
  epochs: 100
  iter_max: 264600
  iter_warmup: 0
  lr: 0.0001
  min_lr: 1.0e-05
  momentum: 0.9
  opt: sgd
  poly_power: 0.9
  poly_step_size: 1
  sched: polynomial
  weight_decay: 0.0
resume: false
version: normal
world_batch_size: 4
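
For reference, the class count can be inferred from the checkpoint itself before editing variant.yml; a minimal sketch in Python (the key names cls_emb/head are an assumption based on Segmenter-style decoders):

import torch

# Load the checkpoint from the ReadMe on CPU and unwrap the state dict;
# some checkpoints nest the weights under a "model" key.
ckpt = torch.load("WeakTr_OnlineRetraining_ViT.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)

# Print class-related parameters; the leading dimension of a
# class-embedding tensor encodes n_cls.
for name, param in state_dict.items():
    if "cls_emb" in name or "head" in name:
        print(name, tuple(param.shape))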

Could the authors please help me think through possible causes? Many thanks (◍•ᴗ•◍)

Yingyue-L commented 1 year ago

The variant.yml file is as follows:

algorithm_kwargs:
  batch_size: 4
  eval_freq: 1
  num_epochs: 100
  start_epoch: 0
amp: false
clip_kwargs:
  patch_size: 120
  start_value: 1.2
dataset_kwargs:
  ann_dir: WeakTr_CAMlb_wCRF
  batch_size: 4
  crop_size: 480
  dataset: pascal_voc
  image_size: 520
  max_ratio: null
  normalization: vit
  num_workers: 2
  split: train
gradientclipping: true
inference_kwargs:
  im_size: 520
  window_size: 480
  window_stride: 320
layer_decay: 1.0
log_dir: start1.2_patch120_seg_vit_small_patch16_384_voc_weaktr
net_kwargs:
  backbone: vit_small_patch16_384
  d_model: 384
  decoder:
    drop_path_rate: 0.0
    dropout: 0.1
    n_cls: 21
    n_layers: 2
    name: mask_transformer
  distilled: false
  drop_path_rate: 0.1
  dropout: 0.0
  image_size: !!python/tuple
  - 480
  - 480
  n_cls: 60
  n_heads: 6
  n_layers: 12
  normalization: vit
  patch_size: 16
optimizer_kwargs:
  clip_grad: null
  enc_lr: 0.1
  epochs: 100
  iter_max: 264600
  iter_warmup: 0
  lr: 0.0001
  min_lr: 1.0e-05
  momentum: 0.9
  opt: sgd
  poly_power: 0.9
  poly_step_size: 1
  sched: polynomial
  weight_decay: 0.0
resume: false
version: normal
world_batch_size: 4
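
Compared with the config in the original post, the substantive differences are the backbone (vit_small_patch16_384 instead of deit_small_patch16_224), the normalization (vit instead of deit), the decoder's n_cls (21 instead of 60), and the log_dir. A minimal sketch for surfacing such differences between two variant.yml files (filenames hypothetical):

import yaml

# Flatten nested config dicts so differing leaves can be listed side by side.
def flatten(d, prefix=""):
    for k, v in d.items():
        key = f"{prefix}{k}"
        if isinstance(v, dict):
            yield from flatten(v, key + ".")
        else:
            yield key, v

# unsafe_load is required because variant.yml contains a !!python/tuple tag.
mine = dict(flatten(yaml.unsafe_load(open("variant_mine.yml"))))
theirs = dict(flatten(yaml.unsafe_load(open("variant_author.yml"))))

for key in sorted(set(mine) | set(theirs)):
    if mine.get(key) != theirs.get(key):
        print(f"{key}: {mine.get(key)!r} -> {theirs.get(key)!r}")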

I'm sorry, there was something wrong with the checkpoint WeakTr_OnlineRetraining_ViT.pth; I have just updated it. Please download it again to reproduce the result.
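
To verify that the re-downloaded file actually replaced the stale copy, comparing checksums is a quick sanity check; a minimal sketch (paths hypothetical):

import hashlib

# Hash a file in chunks so large checkpoints don't need to fit in memory.
def md5(path, chunk=1 << 20):
    h = hashlib.md5()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

print("old:", md5("WeakTr_OnlineRetraining_ViT_old.pth"))
print("new:", md5("WeakTr_OnlineRetraining_ViT.pth"))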

HAL-42 commented 1 year ago

I tried it, and it does reach the performance in the table now. Thank you!