bowenc0221 / panoptic-deeplab

This is a PyTorch re-implementation of our CVPR 2020 paper "Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation" (https://arxiv.org/abs/1911.10194)
Apache License 2.0

Question about the implementation differences between the two codebases (D2 & PyTorch) #70

Closed lxa9867 closed 3 years ago

lxa9867 commented 3 years ago

Hello Bowen, thank you for sharing this awesome work. I have a question about the code implementation: is there any difference, apart from the dataset handling (data augmentation), between the two codebases? I am trying to use the old codebase to train a model on COCO with the parameters used in D2, but I got a very bad result. Could you please give me some advice (I used the updated MAX_RESIZE_VALUE of 960)? Thank you!

This is my config. Some settings are modified; please ignore the args containing "PRE". Those implement ViP-DeepLab and are not activated when training on COCO. The rest is consistent with your old-version configs.

```yaml
MODEL:
  META_ARCHITECTURE: "panoptic_deeplab"
  BN_MOMENTUM: 0.01
  BACKBONE:
    META: "resnet"
    NAME: "resnet50"
    DILATION: (False, False, False)
    PRETRAINED: True
  DECODER:
    IN_CHANNELS: 2048
    FEATURE_KEY: "res5"
    INSTANCE_IN_CHANNELS: 2048
    INSTANCE_FEATURE_KEY: "res5"
    PRE_IN_CHANNELS: 4096
    PRE_FEATURE_KEY: "res5_new"
    DECODER_CHANNELS: 256
    ATROUS_RATES: (3, 6, 9)
  PANOPTIC_DEEPLAB:
    LOW_LEVEL_CHANNELS: (1024, 512, 256)
    LOW_LEVEL_KEY: ["res4", "res3", "res2"]
    LOW_LEVEL_CHANNELS_PROJECT: (128, 64, 32)
    AUX: False
    TCF: False
    ADD: True
    TCF_SIMPLE: False
    INSTANCE:
      ENABLE: True
      LOW_LEVEL_CHANNELS: (1024, 512, 256)
      LOW_LEVEL_KEY: ["res4", "res3", "res2"]
      LOW_LEVEL_CHANNELS_PROJECT: (128, 64, 32)
      DECODER_CHANNELS: 128
      HEAD_CHANNELS: 32
      ASPP_CHANNELS: 256
      NUM_CLASSES: (1, 2)
      CLASS_KEY: ["center", "offset"]
      PRE_BRANCH: False
    PRE:
      LOW_LEVEL_CHANNELS: (2048, 512, 256)
      LOW_LEVEL_KEY: ["res4_new", "res3", "res2"]
      LOW_LEVEL_CHANNELS_PROJECT: (128, 64, 32)
      DECODER_CHANNELS: 128
      HEAD_CHANNELS: 32
      ASPP_CHANNELS: 256
      NUM_CLASSES: (2,)
      CLASS_KEY: ["offset_pre"]
DATALOADER:
  TRAIN_SHUFFLE: True
DATASET:
  ROOT: "/xiangli/coco/"
  DATASET: "coco_panoptic"
  NUM_CLASSES: 133
  TRAIN_SPLIT: 'train2017'
  TEST_SPLIT: 'val2017'
  CROP_SIZE: (640, 640)
  MIRROR: True
  MIN_SCALE: 0.5
  MAX_SCALE: 1.5
  SCALE_STEP_SIZE: 0.25
  MEAN: (0.485, 0.456, 0.406)
  STD: (0.229, 0.224, 0.225)
  SEMANTIC_ONLY: False
  IGNORE_STUFF_IN_OFFSET: True
  SMALL_INSTANCE_AREA: 4096
  SMALL_INSTANCE_WEIGHT: 3
  MIN_RESIZE_VALUE: 640
  MAX_RESIZE_VALUE: 960
  RESIZE_FACTOR: 32
SOLVER:
  BASE_LR: 0.0005
  WEIGHT_DECAY: 0.0
  WEIGHT_DECAY_NORM: 0.0
  BIAS_LR_FACTOR: 1.0
  WEIGHT_DECAY_BIAS: 0.0
  OPTIMIZER: "adam"
  LR_SCHEDULER_NAME: "WarmupPolyLR"
  WARMUP_ITERS: 0
LOSS:
  SEMANTIC:
    NAME: "cross_entropy"
    IGNORE: 255
    TOP_K_PERCENT: 0.2
    WEIGHT: 1.0
  CENTER:
    NAME: "mse"
    WEIGHT: 200.0
  OFFSET:
    NAME: "l1"
    WEIGHT: 0.01
  OFFSET_PRE:
    NAME: "l1"
    WEIGHT: 0.01
TRAIN:
  IMS_PER_BATCH: 48
  MAX_ITER: 200000
DEBUG:
  DEBUG: True
  DEBUG_FREQ: 100
  TARGET_KEYS: ('semantic', 'center', 'offset')
  OUTPUT_KEYS: ('semantic', 'center', 'offset')
TEST:
  CROP_SIZE: (960, 960)
  DEBUG: True
  EVAL_INSTANCE: True
  EVAL_PANOPTIC: True
POST_PROCESSING:
  CENTER_THRESHOLD: 0.1
  NMS_KERNEL: 7
  TOP_K_INSTANCE: 200
  STUFF_AREA: 2048
OUTPUT_DIR: "./output/coco_pretrain"
GPUS: (0, 1, 2, 3, 4, 5, 6, 7)
VAL_FREQ: 20000
WORKERS: 1
```
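When porting hyperparameters between the two codebases, it can help to diff the flattened key/value pairs of the two YAML configs instead of eyeballing them. A minimal sketch (the file paths are hypothetical, and both configs are assumed to be plain YAML):

```python
import yaml

def flatten(d, prefix=""):
    """Flatten nested config dicts into dotted key/value pairs."""
    out = {}
    for k, v in d.items():
        key = f"{prefix}.{k}" if prefix else str(k)
        if isinstance(v, dict):
            out.update(flatten(v, key))
        else:
            out[key] = v
    return out

# Hypothetical paths; substitute your own config files.
with open("configs/original_coco.yaml") as f:
    a = flatten(yaml.safe_load(f))
with open("configs/d2_coco.yaml") as f:
    b = flatten(yaml.safe_load(f))

# Print every key whose value differs (or that exists in only one config).
for key in sorted(set(a) | set(b)):
    if a.get(key) != b.get(key):
        print(f"{key}: original={a.get(key)!r}  d2={b.get(key)!r}")
```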

bowenc0221 commented 3 years ago

Please check the config in the log file mentioned in this issue. However, I haven't tried hard to reproduce the COCO results with the original code, so I would recommend building your project on the D2 codebase.

One difference is that the D2 version uses an even image crop size, whereas the original PyTorch codebase uses an odd crop size, following TensorFlow DeepLab.
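For context, here is a minimal sketch (the helper below is illustrative and not from either codebase) of why the odd-crop convention exists: TensorFlow DeepLab picks crop sizes of the form `stride * k + 1` (e.g., 641 rather than 640) so the input grid and the downsampled feature grid stay corner-aligned.

```python
# Illustrative sketch of the crop-size parity difference; not code from
# either codebase. Under the TF DeepLab convention, a crop of size
# stride * k + 1 maps to exactly k + 1 aligned feature positions,
# while an even crop like 640 does not divide cleanly.

def aligned_feature_extent(crop_size: int, output_stride: int = 16) -> float:
    """Feature-grid extent for an input of `crop_size` at `output_stride`."""
    return (crop_size - 1) / output_stride + 1

for crop in (640, 641):
    print(crop, aligned_feature_extent(crop))
# 640 -> 40.9375  (even crop: grids do not align exactly)
# 641 -> 41.0     (odd crop: corner-aligned, as in TF DeepLab)
```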