linyq2117 / CLIP-ES


The DeepLab config #7

Closed Big-Brother-Pikachu closed 1 year ago

Big-Brother-Pikachu commented 1 year ago

Hi, thanks for sharing this wonderful work! I wonder if you could provide the .yaml file like https://github.com/CVI-SZU/CLIMS/blob/segmentation/deeplabv2/configs/voc12_coco_pretrained.yaml? I have problems reproducing the results in this table, especially for the imagenet-pretrained one. Looking forward to your reply!

linyq2117 commented 1 year ago

Hi, thanks for your interest.

The config we used is similar to the one in deeplab-pytorch; the differences are described in our paper (Appendix D). Specifically, you can set LR: 2e-4 for the imagenet-pretrained model and LR: 2.5e-5 for the coco-pretrained model on the VOC dataset. Note that there is some randomness in the deeplab-pytorch repo, so the result may fluctuate slightly. You can run it multiple times to find a better result.
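If you want runs that are easier to compare across retries, one common option (a general PyTorch practice, not something this repo requires) is to fix the random seeds before training. A minimal sketch, assuming a standard PyTorch setup:

```python
import random

import numpy as np
import torch


def set_seed(seed: int = 0) -> None:
    """Fix the Python, NumPy, and PyTorch RNGs.

    Note: this reduces but does not eliminate run-to-run variance;
    some CUDA kernels remain nondeterministic.
    """
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
```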

Big-Brother-Pikachu commented 1 year ago

Hi, thanks for your quick reply!

I did follow the instructions in Appendix E. We get 73.5 with the coco-pretrained model but only 69.7 with the imagenet-pretrained model, and we get a NaN loss if we don't add the balanced cross-entropy loss for the imagenet-pretrained model.

I think the coco results are good now, but the imagenet results still have a gap. Our config for imagenet is as follows:

```yaml
EXP:
  ID: voc12_imagenet_from_clipes
  OUTPUT_DIR: data

DATASET:
  NAME: vocaug
  ROOT: ./datasets/VOCdevkit
  LABELS: ./data/datasets/voc12/labels.txt
  N_CLASSES: 21
  IGNORE_LABEL: 255
  SCALES: [0.5, 0.75, 1.0, 1.25, 1.5]
  SPLIT:
    TRAIN: train_clipes
    VAL: val
    TEST: test

DATALOADER:
  NUM_WORKERS: 0

IMAGE:
  MEAN:
    R: 122.675
    G: 116.669
    B: 104.008
  SIZE:
    BASE: # None
    TRAIN: 321
    TEST: 513

MODEL:
  NAME: DeepLabV2_ResNet101_MSC
  N_BLOCKS: [3, 4, 23, 3]
  ATROUS_RATES: [6, 12, 18, 24]
  INIT_MODEL: data/models/imagenet/deeplabv1_resnet101/caffemodel/deeplabv1_resnet101-imagenet.pth

SOLVER:
  BATCH_SIZE:
    TRAIN: 5
    TEST: 1
  ITER_MAX: 30000
  ITER_SIZE: 2
  ITER_SAVE: 5000
  ITER_TB: 100
  LR_DECAY: 10
  LR: 2e-4
  MOMENTUM: 0.9
  OPTIMIZER: sgd
  POLY_POWER: 0.9
  WEIGHT_DECAY: 5.0e-4
  AVERAGE_LOSS: 20

CRF:
  ITER_MAX: 10
  POS_W: 3
  POS_XY_STD: 1
  BI_W: 4
  BI_XY_STD: 67
  BI_RGB_STD: 3
```
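For reference, the SOLVER entries LR: 2e-4, ITER_MAX: 30000, and POLY_POWER: 0.9 above follow the usual DeepLab-style polynomial learning-rate decay. A minimal sketch of that schedule (the standard formulation, written here as a hypothetical standalone helper rather than the repo's own scheduler class):

```python
def poly_lr(base_lr: float, cur_iter: int, max_iter: int, power: float = 0.9) -> float:
    """Polynomial decay: lr = base_lr * (1 - cur_iter / max_iter) ** power."""
    return base_lr * (1.0 - cur_iter / max_iter) ** power


# e.g. with LR: 2e-4, ITER_MAX: 30000, POLY_POWER: 0.9:
# poly_lr(2e-4, 15000, 30000)  # ~1.07e-4 halfway through training
```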

Could you point out where our problem might be? Thank you!

linyq2117 commented 1 year ago

As mentioned before, the results from deeplab-pytorch are not stable and training sometimes even ends in a NaN loss; a simple retry can solve it. By the way, the focus of our method is to generate high-quality pseudo segmentation masks. Training a segmentation model only serves to validate their quality, so the masks can be combined with any existing segmentation network. You can try other repos if you are not satisfied with deeplab-pytorch.

Big-Brother-Pikachu commented 1 year ago

OK, got it. Thanks for your patient reply! I will close this issue now.

HwiJeong-Lee commented 1 year ago

@Big-Brother-Pikachu Hi, I'm now trying to reproduce the voc12 result with the coco-pretrained model, but I fail to reach 73.5 mIoU as you did. I have tried several times but can only get about 71 mIoU.

Since I can reproduce the CAM performance reported in Table 1 of the paper, I think the problem is in my DeepLab v2 setup, although I did follow the instructions in Appendix E.

So, I wonder if you could provide the coco-pretrained voc12.yaml. I would also appreciate it if you could let me know whether you needed anything else to reproduce the result.

I'm looking forward to your reply!

Big-Brother-Pikachu commented 1 year ago

@HwiJeong-Lee Hi, our config for coco is as follows:

```yaml
EXP:
  ID: voc12_coco_from_clipes
  OUTPUT_DIR: data

DATASET:
  NAME: vocaug
  ROOT: ../stable-diffusion/datasets/VOCdevkit
  LABELS: ./data/datasets/voc12/labels.txt
  N_CLASSES: 21
  IGNORE_LABEL: 255
  SCALES: [0.5, 0.75, 1.0, 1.25, 1.5]
  SPLIT:
    TRAIN: train_clipes
    VAL: val
    TEST: test

DATALOADER:
  NUM_WORKERS: 0

IMAGE:
  MEAN:
    R: 122.675
    G: 116.669
    B: 104.008
  SIZE:
    BASE: # None
    TRAIN: 321
    TEST: 513

MODEL:
  NAME: DeepLabV2_ResNet101_MSC
  N_BLOCKS: [3, 4, 23, 3]
  ATROUS_RATES: [6, 12, 18, 24]
  INIT_MODEL: data/models/coco/deeplabv1_resnet101/caffemodel/deeplabv1_resnet101-coco.pth

SOLVER:
  BATCH_SIZE:
    TRAIN: 5
    TEST: 1
  ITER_MAX: 20000
  ITER_SIZE: 2
  ITER_SAVE: 5000
  ITER_TB: 100
  LR_DECAY: 10
  LR: 2.5e-5
  MOMENTUM: 0.9
  OPTIMIZER: sgd
  POLY_POWER: 0.9
  WEIGHT_DECAY: 5.0e-4
  AVERAGE_LOSS: 20

CRF:
  ITER_MAX: 10
  POS_W: 3
  POS_XY_STD: 1
  BI_W: 4
  BI_XY_STD: 67
  BI_RGB_STD: 3
```

We did not change main.py for the coco-pretrained model, but we used the balanced cross-entropy loss for the imagenet-pretrained model.
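By "balanced cross-entropy" I mean weighting the pixel-wise cross-entropy by class frequency. A minimal sketch of one common variant (batch-level inverse-frequency weights; the exact weighting we used may differ):

```python
import torch
import torch.nn.functional as F


def balanced_cross_entropy(logits, labels, n_classes=21, ignore_index=255):
    """Cross-entropy with per-class weights inversely proportional to the
    class frequency in the current batch.

    logits: (N, C, H, W) float tensor; labels: (N, H, W) long tensor.
    Hypothetical variant for illustration only.
    """
    valid = labels[labels != ignore_index]
    counts = torch.bincount(valid, minlength=n_classes).float()
    weights = valid.numel() / (n_classes * counts.clamp(min=1.0))
    weights = weights.to(logits.device)
    return F.cross_entropy(logits, labels, weight=weights,
                           ignore_index=ignore_index)
```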

HwiJeong-Lee commented 1 year ago

@Big-Brother-Pikachu Thanks for your reply!