ZhengyuXia opened this issue 1 year ago (Open)
@ZhengyuXia I am seeing similar numbers:
[08/02 08:49:23 d2.engine.defaults]: Evaluation results for ade20k_sem_seg_val in csv format:
[08/02 08:49:23 d2.evaluation.testing]: copypaste: Task: sem_seg
[08/02 08:49:23 d2.evaluation.testing]: copypaste: mIoU,fwIoU,mACC,pACC
[08/02 08:49:23 d2.evaluation.testing]: copypaste: 45.5368,70.6117,59.3918,81.6061
I have run it three times, and the results are all similar.
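To compare the runs, the `copypaste` lines can be parsed into a metrics dict; a minimal sketch (the helper name is mine, not a detectron2 API):

```python
def parse_copypaste(header_line, value_line):
    """Turn a pair of detectron2 'copypaste' log lines into a {metric: value} dict."""
    def fields(line):
        # everything after the 'copypaste:' tag, split on commas
        return line.split("copypaste:")[1].strip().split(",")
    return dict(zip(fields(header_line), (float(v) for v in fields(value_line))))

header = "[08/02 08:49:23 d2.evaluation.testing]: copypaste: mIoU,fwIoU,mACC,pACC"
values = "[08/02 08:49:23 d2.evaluation.testing]: copypaste: 45.5368,70.6117,59.3918,81.6061"
print(parse_copypaste(header, values))
# -> {'mIoU': 45.5368, 'fwIoU': 70.6117, 'mACC': 59.3918, 'pACC': 81.6061}
```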
@hhaAndroid
I rolled back the Python version from 3.8 to 3.7, and the performance increased by ~0.4% mIoU. I also enabled "SyncBN" in the config file, which gave an additional ~0.5% mIoU. So far, my best reproduction result is 47.6%, but it is still ~1% lower than the paper's result.
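For anyone else reproducing this, the SyncBN change is just setting `NORM` under the ResNet backbone config; a sketch of the fragment I mean (key layout assumed from detectron2-style configs, and note SyncBN needs multi-GPU training):

```yaml
MODEL:
  RESNETS:
    NORM: "SyncBN"  # the provided yaml ships with this commented out
```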
ade_48.7log.txt
Hi, above is my log file for the 48.7 result, for your reference. ADE20K is a small dataset, so the performance may not be very stable. I will also check the code to see if something is wrong.
@FengLi-ust
Thanks for uploading the log file. I roughly checked the settings in this file and found several differences.
The NORM setting in the log file is FrozenBN:
```yaml
RESNETS:
  DEFORM_MODULATED: False
  DEFORM_NUM_GROUPS: 1
  DEFORM_ON_PER_STAGE: [False, False, False, False]
  DEPTH: 50
  NORM: FrozenBN
```
But NORM is commented out in the provided yaml file:
```yaml
RESNETS:
  DEPTH: 50
  STEM_TYPE: "basic"  # not used
  STEM_OUT_CHANNELS: 64
  STRIDE_IN_1X1: False
  OUT_FEATURES: ["res2", "res3", "res4", "res5"]
  # NORM: "SyncBN"
```
CLASS_WEIGHT is 2.0 and DEC_LAYERS is 10 in the log file:
```yaml
MASK_FORMER:
  BOX_LOSS: True
  BOX_WEIGHT: 5.0
  CLASS_WEIGHT: 2.0
  DEC_LAYERS: 10
```
But CLASS_WEIGHT is 4.0 and DEC_LAYERS is 9 in the yaml file:
```yaml
MaskDINO:
  TRANSFORMER_DECODER_NAME: "MaskDINODecoder"
  DEEP_SUPERVISION: True
  NO_OBJECT_WEIGHT: 0.1
  CLASS_WEIGHT: 4.0
  MASK_WEIGHT: 5.0
  DICE_WEIGHT: 5.0
  HIDDEN_DIM: 256
  NUM_OBJECT_QUERIES: 100
  NHEADS: 8
  DROPOUT: 0.0
  DIM_FEEDFORWARD: 2048
  ENC_LAYERS: 0
  PRE_NORM: False
  ENFORCE_INPUT_PROJ: False
  SIZE_DIVISIBILITY: 32
  DEC_LAYERS: 9  # 9 decoder layers, add one for the loss on learnable query
```
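To avoid eyeballing such mismatches, both configs can be loaded as nested dicts (e.g. via `yaml.safe_load`) and diffed recursively; a minimal stdlib sketch, with the mismatched values from above hard-coded as toy dicts:

```python
def diff_configs(a, b, prefix=""):
    """Recursively compare two nested config dicts, returning (key_path, a_val, b_val) mismatches."""
    out = []
    for key in sorted(set(a) | set(b)):
        path = f"{prefix}.{key}" if prefix else key
        va, vb = a.get(key), b.get(key)
        if isinstance(va, dict) and isinstance(vb, dict):
            out += diff_configs(va, vb, path)  # descend into nested sections
        elif va != vb:
            out.append((path, va, vb))
    return out

log_cfg  = {"MASK_FORMER": {"CLASS_WEIGHT": 2.0, "DEC_LAYERS": 10}}
yaml_cfg = {"MASK_FORMER": {"CLASS_WEIGHT": 4.0, "DEC_LAYERS": 9}}
for path, a, b in diff_configs(log_cfg, yaml_cfg):
    print(f"{path}: log={a} yaml={b}")
# -> MASK_FORMER.CLASS_WEIGHT: log=2.0 yaml=4.0
# -> MASK_FORMER.DEC_LAYERS: log=10 yaml=9
```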
I tried all or some of these settings, but my best performance is ~47.1% mIoU, still ~1.6% lower.
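A note on trying these combinations: detectron2-style training scripts usually accept dotted `KEY VALUE` override pairs after the config file, so the log-file settings can be tested without editing the yaml. A hypothetical launch line (exact key paths depend on this repo's config schema, so verify them against your yaml first):

```shell
python train_net.py --num-gpus 8 \
  --config-file configs/ade20k/semantic-segmentation/maskdino_R50_bs16_160ksteplr.yaml \
  MODEL.MaskDINO.CLASS_WEIGHT 2.0 \
  MODEL.MaskDINO.DEC_LAYERS 10
```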
Hi, thanks for the excellent work. I'm trying to reproduce the semantic segmentation results (ResNet-50 backbone + ADE20K), but my performance is 46.6%, which is much lower than yours by 2.1%. I've run the experiment three times, and all results were around 46.6%.
The config file I used is configs/ade20k/semantic-segmentation/maskdino_R50_bs16_160ksteplr.yaml, which indicates a training schedule of 160K iterations, as mentioned in the paper. However, the model available for download is maskdino_r50_50ep_100q_celoss_hid1024_3s_semantic_ade20k48.7miou.pth, whose name indicates 50 training epochs. It seems the training configs differ between them, so I'm wondering whether I missed something when training this model?