About Phase 2 reproducibility

likelion-hyeonjun commented 1 year ago

I've successfully reproduced results for solov2-resnet50 and Mask2Former-swinsmall. However, I'm currently training a model using the configuration file solov2_r101_dcn_fpn_3x_coco_mal.py from the mmdet/config/MALMask folder, and the performance is not promising.

Could you please confirm if the training configuration in mmdet/configs/MALMASK/solov2_r101_dcn_fpn_3x_coco_mal.py is accurate?

Here is the config file I am using. Even training with the GT annotations (instances_train2017.json) does not give promising results.

"solov2_r101_dcn_fpn_3x_coco_mal.py"

_base_ = '../GTMask/solov2_r50_fpn_3x_coco.py'

model = dict(
    backbone=dict(
        depth=101,
        init_cfg=dict(checkpoint='torchvision://resnet101'),
        dcn=dict(type='DCNv2', deformable_groups=1, fallback_on_stride=False),
        stage_with_dcn=(False, True, True, True)),
    mask_head=dict(
        mask_feature_head=dict(conv_cfg=dict(type='DCNv2')),
        dcn_cfg=dict(type='DCNv2'),
        dcn_apply_to_all_conv=True))

lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=2000,
    warmup_ratio=1.0 / 10,
    step=[27, 33])

data = dict(
    train=dict(ann_file='/data/coco/annotations/instances_train2017.json'),
    val=dict(ann_file='/data/coco/annotations/instances_val2017.json'),
    test=dict(
        ann_file='data/coco/annotations/image_info_test-dev2017.json',
        img_prefix='data/coco/test2017/'))

voidrank commented 1 year ago

Hi @likelion-hyeonjun

Could you share the results that are not promising, or any logs?

likelion-hyeonjun commented 1 year ago

The log below is from the training process using phase 2 with pseudo mask produced by MAL (SOLOv2 ResNet 101 DCN).

Evaluate annotation type *segm*
DONE (t=23.82s).
Accumulating evaluation results...
DONE (t=4.59s).
2023-10-28 04:21:31,184 - mmdet - INFO -
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.001
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 0.002
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.001
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.007
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300 ] = 0.007
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.007
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.015
2023-10-28 04:21:31,643 - mmdet - INFO - Exp name: solov2_r101_dcn_fpn_3x_coco_mal.py
2023-10-28 04:21:31,644 - mmdet - INFO - Epoch(val) [4][625] bbox_mAP: 0.0000, bbox_mAP_50: 0.0000, bbox_mAP_75: 0.0000, bbox_mAP_s: 0.0000, bbox_mAP_m: 0.0000, bbox_mAP_l: 0.0000, bbox_mAP_copypaste: 0.0000 0.0000 0.0000 0.0000 0.0000 0.0000, segm_mAP: 0.0008, segm_mAP_50: 0.0021, segm_mAP_75: 0.0004, segm_mAP_s: 0.0000, segm_mAP_m: 0.0000, segm_mAP_l: 0.0012, segm_mAP_copypaste: 0.0008 0.0021 0.0004 0.0000 0.0000 0.0012

After 4 epochs, the segmentation mAP is still only around 0.001. Training has not yet run the full 36 epochs, but the trend does not look promising. For comparison, the SOLOv2 ResNet-50 model reaches an mAP of around 0.17 (17 mAP) after just one epoch.
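To compare the trend across epochs (or against the ResNet-50 run), the per-epoch segm_mAP can be pulled out of the training log with a few lines of Python. This is only a minimal sketch: the helper name `segm_map_by_epoch` and the regular expression are assumptions based on the `Epoch(val)` line format in the log above, and are not part of the repository.

```python
import re

# Assumed format, taken from the log above:
#   "... Epoch(val) [4][625] bbox_mAP: 0.0000, ..., segm_mAP: 0.0008, ..."
VAL_LINE = re.compile(r"Epoch\(val\) \[(\d+)\]\[\d+\].*?segm_mAP: ([0-9.]+)")


def segm_map_by_epoch(log_text):
    """Return {epoch: segm_mAP} for every validation line found in log_text."""
    return {int(m.group(1)): float(m.group(2))
            for m in VAL_LINE.finditer(log_text)}


# Shortened copy of the validation line from the log above.
sample = ("2023-10-28 04:21:31,644 - mmdet - INFO - Epoch(val) [4][625] "
          "bbox_mAP: 0.0000, segm_mAP: 0.0008, segm_mAP_50: 0.0021")
print(segm_map_by_epoch(sample))  # {4: 0.0008}
```

Running this over the full log files of both runs would give two small dictionaries that can be compared epoch by epoch.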

Additionally, I've verified that the provided configuration for the SOLOv2 ResNeXt-101 DCN model reproduces well, so the issue appears to be specific to the ResNet-101 DCN configuration.

voidrank commented 1 year ago

Maybe you can try increasing the number of warmup iterations. That sometimes helps.
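For illustration, here is roughly what a longer warmup changes. This is a sketch under assumptions: `linear_warmup_lr` is a hypothetical helper that mirrors the usual linear warmup rule (learning rate ramps from `warmup_ratio * base_lr` up to `base_lr`), the base learning rate of 0.01 is assumed, and `warmup_iters=5000` is an arbitrary example value, not a tested recommendation.

```python
def linear_warmup_lr(base_lr, cur_iter, warmup_iters, warmup_ratio):
    """Learning rate at cur_iter under linear warmup (hypothetical helper)."""
    if cur_iter >= warmup_iters:
        return base_lr
    # k shrinks linearly from (1 - warmup_ratio) to 0 over the warmup period.
    k = (1 - cur_iter / warmup_iters) * (1 - warmup_ratio)
    return base_lr * (1 - k)


# Same schedule as the config in this thread, with a longer warmup.
# warmup_iters=5000 is an example value only.
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=5000,
    warmup_ratio=1.0 / 10,
    step=[27, 33])

base_lr = 0.01  # assumed; check the optimizer setting in the base config
for warmup_iters in (2000, 5000):
    lr = linear_warmup_lr(base_lr, 1000, warmup_iters, 1.0 / 10)
    print(f"warmup_iters={warmup_iters}: lr at iter 1000 = {lr:.4f}")
```

With the longer warmup, the learning rate at any early iteration is lower, which can help when the DCN layers diverge at the start of training.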

likelion-hyeonjun commented 1 year ago

Thank you for the advice. I will run further experiments with adjusted warmup iterations.

likelion-hyeonjun commented 1 year ago

Additionally, I have a question regarding the LVIS dataset.

The Phase 2 LVIS configuration files you provided for reproduction are mask_rcnn_x101_32x4d_fpn_sample1e-3_mstrain_1x_lvis_v1_mal.py and mask_rcnn_r101_fpn_sample1e-3_mstrain_1x_lvis_v1_mal.py. Both appear to use mask_rcnn_r50_fpn_sample1e-3_mstrain_1x_lvis_v1.py as their base file. However, I could not find this base file in the code you've uploaded, which makes it difficult to reproduce the results. Would it be possible for you to also upload mask_rcnn_r50_fpn_sample1e-3_mstrain_1x_lvis_v1.py?

I'm sorry for bothering you, and thank you for taking the time to reply.

voidrank commented 1 year ago

Sorry, I didn't save the original file. You could just look at the corresponding r50 config in mmdetection.
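After copying an r50 config into place, a quick way to check that every `_base_` reference resolves is to scan the config files before launching training. The following is a rough sketch using only the standard library; the helper name `missing_bases` is hypothetical, and it assumes `_base_` is assigned a plain string or a list of strings, with paths relative to the config file.

```python
import ast
from pathlib import Path


def missing_bases(config_path):
    """Return the _base_ entries of config_path that do not exist on disk."""
    config_path = Path(config_path)
    tree = ast.parse(config_path.read_text())
    bases = []
    for node in tree.body:
        # Look for a top-level assignment of the form: _base_ = ...
        if isinstance(node, ast.Assign) and any(
                isinstance(t, ast.Name) and t.id == '_base_'
                for t in node.targets):
            value = ast.literal_eval(node.value)
            bases = [value] if isinstance(value, str) else list(value)
    # _base_ paths are resolved relative to the config file's directory.
    return [b for b in bases if not (config_path.parent / b).exists()]
```

An empty list means all base files were found; any returned path still needs to be supplied.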

NVlabs / mask-auto-labeler

About Phase 2 reproducibility #19