bowenc0221 / panoptic-deeplab

This is a PyTorch re-implementation of our CVPR 2020 paper "Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation" (https://arxiv.org/abs/1911.10194)
Apache License 2.0

Question about the implementation differences between the two codebases (D2 & PyTorch) #70

Closed lxa9867 closed 3 years ago

lxa9867 commented 3 years ago

Hello Bowen, thank you for sharing this awesome work. I have a question about the code implementation: is there any difference, apart from the dataset handling (data augmentation), between the two codebases? I am trying to use the old codebase to train a model on COCO with the parameters used in D2, but I got a very bad result. Could you please give me some advice (I used the updated MAX_RESIZE_VALUE of 960)? Thank you!

This is my config. Some settings are modified; please ignore the args containing "PRE". Those implement ViP-DeepLab and are not activated when training on COCO. The rest is consistent with your old-version configs.

```yaml
MODEL:
  META_ARCHITECTURE: "panoptic_deeplab"
  BN_MOMENTUM: 0.01
  BACKBONE:
    META: "resnet"
    NAME: "resnet50"
    DILATION: (False, False, False)
    PRETRAINED: True
  DECODER:
    IN_CHANNELS: 2048
    FEATURE_KEY: "res5"
    INSTANCE_IN_CHANNELS: 2048
    INSTANCE_FEATURE_KEY: "res5"
    PRE_IN_CHANNELS: 4096
    PRE_FEATURE_KEY: "res5_new"
    DECODER_CHANNELS: 256
    ATROUS_RATES: (3, 6, 9)
  PANOPTIC_DEEPLAB:
    LOW_LEVEL_CHANNELS: (1024, 512, 256)
    LOW_LEVEL_KEY: ["res4", "res3", "res2"]
    LOW_LEVEL_CHANNELS_PROJECT: (128, 64, 32)
    AUX: False
    TCF: False
    ADD: True
    TCF_SIMPLE: False
    INSTANCE:
      ENABLE: True
      LOW_LEVEL_CHANNELS: (1024, 512, 256)
      LOW_LEVEL_KEY: ["res4", "res3", "res2"]
      LOW_LEVEL_CHANNELS_PROJECT: (128, 64, 32)
      DECODER_CHANNELS: 128
      HEAD_CHANNELS: 32
      ASPP_CHANNELS: 256
      NUM_CLASSES: (1, 2)
      CLASS_KEY: ["center", "offset"]
      PRE_BRANCH: False
    PRE:
      LOW_LEVEL_CHANNELS: (2048, 512, 256)
      LOW_LEVEL_KEY: ["res4_new", "res3", "res2"]
      LOW_LEVEL_CHANNELS_PROJECT: (128, 64, 32)
      DECODER_CHANNELS: 128
      HEAD_CHANNELS: 32
      ASPP_CHANNELS: 256
      NUM_CLASSES: (2,)
      CLASS_KEY: ["offset_pre"]
DATALOADER:
  TRAIN_SHUFFLE: True
DATASET:
  ROOT: "/xiangli/coco/"
  DATASET: "coco_panoptic"
  NUM_CLASSES: 133
  TRAIN_SPLIT: 'train2017'
  TEST_SPLIT: 'val2017'
  CROP_SIZE: (640, 640)
  MIRROR: True
  MIN_SCALE: 0.5
  MAX_SCALE: 1.5
  SCALE_STEP_SIZE: 0.25
  MEAN: (0.485, 0.456, 0.406)
  STD: (0.229, 0.224, 0.225)
  SEMANTIC_ONLY: False
  IGNORE_STUFF_IN_OFFSET: True
  SMALL_INSTANCE_AREA: 4096
  SMALL_INSTANCE_WEIGHT: 3
  MIN_RESIZE_VALUE: 640
  MAX_RESIZE_VALUE: 960
  RESIZE_FACTOR: 32
SOLVER:
  BASE_LR: 0.0005
  WEIGHT_DECAY: 0.0
  WEIGHT_DECAY_NORM: 0.0
  BIAS_LR_FACTOR: 1.0
  WEIGHT_DECAY_BIAS: 0.0
  OPTIMIZER: "adam"
  LR_SCHEDULER_NAME: "WarmupPolyLR"
  WARMUP_ITERS: 0
LOSS:
  SEMANTIC:
    NAME: "cross_entropy"
    IGNORE: 255
    TOP_K_PERCENT: 0.2
    WEIGHT: 1.0
  CENTER:
    NAME: "mse"
    WEIGHT: 200.0
  OFFSET:
    NAME: "l1"
    WEIGHT: 0.01
  OFFSET_PRE:
    NAME: "l1"
    WEIGHT: 0.01
TRAIN:
  IMS_PER_BATCH: 48
  MAX_ITER: 200000
DEBUG:
  DEBUG: True
  DEBUG_FREQ: 100
  TARGET_KEYS: ('semantic', 'center', 'offset')
  OUTPUT_KEYS: ('semantic', 'center', 'offset')
TEST:
  CROP_SIZE: (960, 960)
  DEBUG: True
  EVAL_INSTANCE: True
  EVAL_PANOPTIC: True
POST_PROCESSING:
  CENTER_THRESHOLD: 0.1
  NMS_KERNEL: 7
  TOP_K_INSTANCE: 200
  STUFF_AREA: 2048
OUTPUT_DIR: "./output/coco_pretrain"
GPUS: (0, 1, 2, 3, 4, 5, 6, 7)
VAL_FREQ: 20000
WORKERS: 1
```
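When porting hyperparameters between the two codebases, it can help to diff the flattened key/value pairs of the two YAML configs instead of eyeballing them. A minimal sketch (the file paths are hypothetical, and both configs are assumed to be plain YAML):

```python
import yaml

def flatten(d, prefix=""):
    """Flatten nested config dicts into dotted key/value pairs."""
    out = {}
    for k, v in d.items():
        key = f"{prefix}.{k}" if prefix else str(k)
        if isinstance(v, dict):
            out.update(flatten(v, key))
        else:
            out[key] = v
    return out

# Hypothetical paths; substitute your own config files.
with open("configs/original_coco.yaml") as f:
    a = flatten(yaml.safe_load(f))
with open("configs/d2_coco.yaml") as f:
    b = flatten(yaml.safe_load(f))

# Print every key whose value differs (or that exists in only one config).
for key in sorted(set(a) | set(b)):
    if a.get(key) != b.get(key):
        print(f"{key}: original={a.get(key)!r}  d2={b.get(key)!r}")
```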

bowenc0221 commented 3 years ago

Please check the config in the log file mentioned in this issue. However, I haven't tried hard to reproduce the COCO results with the original code, so I would recommend building your project on the D2 codebase.

One difference is that the D2 version uses an even image crop size, whereas the original PyTorch codebase uses an odd crop size, following TensorFlow DeepLab.
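For context, here is a minimal sketch (the helper below is illustrative and not from either codebase) of why the odd-crop convention exists: TensorFlow DeepLab picks crop sizes of the form `stride * k + 1` (e.g., 641 rather than 640) so the input grid and the downsampled feature grid stay corner-aligned.

```python
# Illustrative sketch of the crop-size parity difference; not code from
# either codebase. Under the TF DeepLab convention, a crop of size
# stride * k + 1 maps to exactly k + 1 aligned feature positions,
# while an even crop like 640 does not divide cleanly.

def aligned_feature_extent(crop_size: int, output_stride: int = 16) -> float:
    """Feature-grid extent for an input of `crop_size` at `output_stride`."""
    return (crop_size - 1) / output_stride + 1

for crop in (640, 641):
    print(crop, aligned_feature_extent(crop))
# 640 -> 40.9375  (even crop: grids do not align exactly)
# 641 -> 41.0     (odd crop: corner-aligned, as in TF DeepLab)
```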