[Reimplementation] Mask2former result

david95j2 commented 1 year ago

Prerequisite

[X] I have searched Issues and Discussions but cannot get the expected help.
[X] I have read the FAQ documentation but cannot get the expected help.
[X] The bug has not been fixed in the latest version (master) or latest version (3.x).

💬 Describe the reimplementation questions

An error occurred while training with mask2former_swin-s-p4-w7-224_lsj_8x2_50e_mask-panoptic.py on a single-class dataset. Please tell me if I missed anything.

Instructions To Reproduce the Issue:

_base_ = ['../mask2former/mask2former_swin-t-p4-w7-224_lsj_8x2_50e_coco-panoptic.py']
pretrained = 'https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_small_patch4_window7_224.pth'  # noqa

num_things_classes = 1
num_stuff_classes = 0
num_classes = num_things_classes + num_stuff_classes

depths = [2, 2, 18, 2]
model = dict(
    backbone=dict(
        depths=depths, 
        init_cfg=dict(type='Pretrained',checkpoint=pretrained) # origin
    ),
    panoptic_head=dict(
        num_things_classes=num_things_classes,
        num_stuff_classes=num_stuff_classes,
        loss_cls=dict(class_weight=[1.0] * num_classes + [0.1]) 
    )
)

evaluation = dict(
    interval=1,
    metric=['PQ', 'bbox', 'segm'],
    dynamic_intervals=[(365001, 368750)])

dataset_type = 'CocoPanopticDataset'
data_root = 'data/crack/'

data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='CocoPanopticDataset',
        ann_file=data_root+'annotations/panoptic_train2022.json',
        img_prefix=data_root+'train2022/',
        seg_prefix=data_root+'annotations/panoptic_train2022/'
    ),
    val=dict(
        type='CocoPanopticDataset',
        ann_file=data_root+'annotations/panoptic_val2022.json',
        img_prefix=data_root+'val2022/',
        seg_prefix=data_root+'annotations/panoptic_val2022/',
        ins_ann_file=data_root+'annotations/instances_val2022.json'
    ),
    test=dict(
        type='CocoPanopticDataset',
        ann_file=data_root+'annotations/panoptic_val2022.json',
        img_prefix=data_root+'val2022/',
        seg_prefix=data_root+'annotations/panoptic_val2022/',
        ins_ann_file=data_root+'annotations/instances_val2022.json'
    )
)

# set all layers in backbone to lr_mult=0.1
# set all norm layers, position_embedding,
# query_embedding, level_embedding to decay_mult=0.0
backbone_norm_multi = dict(lr_mult=0.1, decay_mult=0.0)
backbone_embed_multi = dict(lr_mult=0.1, decay_mult=0.0)
embed_multi = dict(lr_mult=1.0, decay_mult=0.0)
custom_keys = {
    'backbone': dict(lr_mult=0.1, decay_mult=1.0),
    'backbone.patch_embed.norm': backbone_norm_multi,
    'backbone.norm': backbone_norm_multi,
    'absolute_pos_embed': backbone_embed_multi,
    'relative_position_bias_table': backbone_embed_multi,
    'query_embed': embed_multi,
    'query_feat': embed_multi,
    'level_embed': embed_multi
}
custom_keys.update({
    f'backbone.stages.{stage_id}.blocks.{block_id}.norm': backbone_norm_multi
    for stage_id, num_blocks in enumerate(depths)
    for block_id in range(num_blocks)
})
custom_keys.update({
    f'backbone.stages.{stage_id}.downsample.norm': backbone_norm_multi
    for stage_id in range(len(depths) - 1)
})
# optimizer
optimizer = dict(
    paramwise_cfg=dict(custom_keys=custom_keys, norm_decay_mult=0.0))
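
As a sanity check on the paramwise settings above, the following standalone sketch rebuilds the same custom_keys dictionary in plain Python, without importing MMDetection, and counts the entries. The depths value is taken from the config; the helper name build_custom_keys is only illustrative and is not an MMDetection API.

```python
# Standalone sketch: rebuild the custom_keys dict from the config above
# to inspect which parameter-name patterns it contains.
# build_custom_keys is an illustrative helper, not part of MMDetection.


def build_custom_keys(depths):
    backbone_norm_multi = dict(lr_mult=0.1, decay_mult=0.0)
    backbone_embed_multi = dict(lr_mult=0.1, decay_mult=0.0)
    embed_multi = dict(lr_mult=1.0, decay_mult=0.0)
    keys = {
        'backbone': dict(lr_mult=0.1, decay_mult=1.0),
        'backbone.patch_embed.norm': backbone_norm_multi,
        'backbone.norm': backbone_norm_multi,
        'absolute_pos_embed': backbone_embed_multi,
        'relative_position_bias_table': backbone_embed_multi,
        'query_embed': embed_multi,
        'query_feat': embed_multi,
        'level_embed': embed_multi,
    }
    # One entry per block norm in every backbone stage.
    keys.update({
        f'backbone.stages.{stage_id}.blocks.{block_id}.norm': backbone_norm_multi
        for stage_id, num_blocks in enumerate(depths)
        for block_id in range(num_blocks)
    })
    # One entry per downsample norm; the last stage has no downsample layer.
    keys.update({
        f'backbone.stages.{stage_id}.downsample.norm': backbone_norm_multi
        for stage_id in range(len(depths) - 1)
    })
    return keys


custom_keys = build_custom_keys([2, 2, 18, 2])
# 8 fixed entries + 24 block norms + 3 downsample norms
print(len(custom_keys))
```

Changing depths without regenerating these keys would leave some block norms without the intended decay_mult=0.0 entry, which is why the config recomputes them from depths.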

Environment

sys.platform: linux
Python: 3.6.9 (default, Mar 10 2023, 16:46:00) [GCC 8.4.0]
CUDA available: True
GPU 0: Tesla V100-PCIE-32GB
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.1, V10.1.168
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.10.0+cu102
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX512
  - CUDA Runtime 10.2
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
  - CuDNN 7.6.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

TorchVision: 0.11.0+cu102
OpenCV: 4.7.0
MMCV: 1.4.8
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.2
MMDetection: 2.28.2+e9cae2d

Expected results

My Error log

2023-03-22 17:24:05,811 - mmdet - INFO - workflow: [('train', 5000)], max: 368750 iters
2023-03-22 17:24:05,812 - mmdet - INFO - Checkpoints will be saved to /mnt/4T_mnt/joo/temp_repo4/mmdetection/work_dirs/mask2former_swin-s-p4-w7-224_lsj_8x2_50e_coco-panoptic by HardDiskBackend.
[                                                  ] 0/4268, elapsed: 0s, ETA:Traceback (most recent call last):
  File "tools/train.py", line 250, in <module>
    main()
  File "tools/train.py", line 246, in main
    meta=meta)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/mmdet/apis/train.py", line 246, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/mmcv/runner/iter_based_runner.py", line 134, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/mmcv/runner/iter_based_runner.py", line 67, in train
    self.call_hook('after_train_iter')
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/mmcv/runner/base_runner.py", line 309, in call_hook
    getattr(hook, fn_name)(self)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/mmcv/runner/hooks/evaluation.py", line 262, in after_train_iter
    self._do_evaluate(runner)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/mmdet/core/evaluation/eval_hooks.py", line 60, in _do_evaluate
    results = single_gpu_test(runner.model, self.dataloader, show=False)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/mmdet/apis/test.py", line 29, in single_gpu_test
    result = model(return_loss=False, rescale=True, **data)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/mmcv/parallel/data_parallel.py", line 50, in forward
    return super().forward(*inputs, **kwargs)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/mmcv/runner/fp16_utils.py", line 109, in new_func
    return old_func(*args, **kwargs)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/mmdet/models/detectors/base.py", line 174, in forward
    return self.forward_test(img, img_metas, **kwargs)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/mmdet/models/detectors/base.py", line 147, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/mnt/4T_mnt/joo/temp_repo4/mmdetection/mask2former/lib/python3.6/site-packages/mmdet/models/detectors/maskformer.py", line 173, in simple_test
    mask_results[label].append(mask)
IndexError: list index out of range

Additional information

No response

hhaAndroid commented 1 year ago

@david95j2 You need to set metainfo on the dataset. Example: https://github.com/open-mmlab/mmyolo/blob/main/configs/yolov5/yolov5_s-v61_fast_1xb12-40e_cat.py#L31

david95j2 commented 1 year ago

@hhaAndroid Sorry, my mistake.

The cause of the problem above is that I did not set num_things_classes and num_stuff_classes for panoptic_fusion_head in the config file.

num_things_classes = 1
num_stuff_classes = 0
num_classes = num_things_classes + num_stuff_classes

model = dict(
    backbone=dict(
        embed_dims=192,
        num_heads=[6, 12, 24, 48],
        init_cfg=dict(type='Pretrained', checkpoint=pretrained),
        # init_cfg=None
        ),
    panoptic_head=dict(
        num_queries=200,
        in_channels=[192, 384, 768, 1536],
        num_things_classes=num_things_classes,
        num_stuff_classes=num_stuff_classes,
        loss_cls=dict(class_weight=[1.0] * num_classes + [0.1])
        ),
    panoptic_fusion_head=dict(
        num_things_classes=num_things_classes,
        num_stuff_classes=num_stuff_classes)
    )
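
The IndexError in the first traceback is consistent with this explanation. The sketch below is a simplified stand-in, not the actual MMDetection code. It assumes that the detector allocates one mask list per thing class (as the mask_results[label].append(mask) line in the traceback suggests), while a fusion head left at the COCO defaults of the base config can still emit labels for 80 thing classes. The function name collect_masks and the label values are hypothetical.

```python
def collect_masks(labels, num_things_classes):
    """Group mask placeholders by predicted label, one list per thing class."""
    mask_results = [[] for _ in range(num_things_classes)]
    for j, label in enumerate(labels):
        # A string stands in for the binary mask of query j.
        mask_results[label].append(f'mask_{j}')
    return mask_results


# Labels as a fusion head configured for a single class would emit them.
print(collect_masks([0, 0, 0], num_things_classes=1))

# Labels as a fusion head still configured for 80 classes might emit them
# (hypothetical values): index 17 does not exist in a 1-element list.
try:
    collect_masks([0, 17, 42], num_things_classes=1)
except IndexError as exc:
    print('IndexError:', exc)
```

Setting the same num_things_classes and num_stuff_classes on both panoptic_head and panoptic_fusion_head keeps the label range and the list length in agreement.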

But I have a new problem...

Run method

python3 tools/train.py configs/custom/mask2former_swin-s-p4-w7-224_lsj_8x2_50e_coco-panoptic.py

Error Message

My result is empty, so the following is printed.

mmdet/datasets/coco_panoptic.py

print(f"\n\nlen(results) : {len(results)}\n\n") # debug
result_files, tmp_dir = self.format_results(results, jsonfile_prefix)
print(f"result_files : {result_files}") # debug

> len(results) : 4268
> result_files : {}

Traceback (most recent call last):
  File "tools/train.py", line 250, in <module>
    main()
  File "tools/train.py", line 239, in main
    train_detector(
  File "/home/jylee/vsc/test/mmdetection/mmdet/apis/train.py", line 246, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/home/jylee/vsc/test/mmdetection/seg/lib/python3.8/site-packages/mmcv/runner/iter_based_runner.py", line 144, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/home/jylee/vsc/test/mmdetection/seg/lib/python3.8/site-packages/mmcv/runner/iter_based_runner.py", line 70, in train
    self.call_hook('after_train_iter')
  File "/home/jylee/vsc/test/mmdetection/seg/lib/python3.8/site-packages/mmcv/runner/base_runner.py", line 317, in call_hook
    getattr(hook, fn_name)(self)
  File "/home/jylee/vsc/test/mmdetection/seg/lib/python3.8/site-packages/mmcv/runner/hooks/evaluation.py", line 266, in after_train_iter
    self._do_evaluate(runner)
  File "/home/jylee/vsc/test/mmdetection/mmdet/core/evaluation/eval_hooks.py", line 135, in _do_evaluate
    key_score = self.evaluate(runner, results)
  File "/home/jylee/vsc/test/mmdetection/seg/lib/python3.8/site-packages/mmcv/runner/hooks/evaluation.py", line 367, in evaluate
    eval_res = self.dataloader.dataset.evaluate(
  File "/home/jylee/vsc/test/mmdetection/mmdet/datasets/coco_panoptic.py", line 625, in evaluate
    eval_pan_results = self.evaluate_pan_json(
  File "/home/jylee/vsc/test/mmdetection/mmdet/datasets/coco_panoptic.py", line 530, in evaluate_pan_json
    pred_json = mmcv.load(result_files['panoptic'])
KeyError: 'panoptic'

Additional information

The type of my results is tuple. I don't know why the panoptic segmentation model outputs results in this format.

print(f"results : {results}") # debug
result_files, tmp_dir = self.format_results(results, jsonfile_prefix)

> results : ([array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
            ......
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]], dtype=float32)],
  [[{'size': [462, 991], 'counts': b'bSo='}, 
    {'size': [462, 991], 'counts': b'bSo='},
    {'size': [462, 991], 'counts': b'bSo='},
             .......
    {'size': [462, 991], 'counts': b'bSo='},
    {'size': [462, 991], 'counts': b'bSo='}]])
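
One possible explanation, to be verified against the installed source rather than taken as confirmed: in MMDetection 2.x, MaskFormer.simple_test appears to return only the instance results, a (bbox_results, mask_results) tuple per image, when num_stuff_classes == 0, and CocoPanopticDataset.format_results appears to write only the files whose keys ('pan_results', 'ins_results') it finds in results[0]. That would match the empty result_files printed above. The toy function below imitates that key probe; toy_format_results and the file names are made up for illustration.

```python
def toy_format_results(results):
    """Imitate a format step that only handles keys it finds in results[0]."""
    result_files = {}
    first = results[0]
    if 'pan_results' in first:
        result_files['panoptic'] = 'toy.panoptic.json'
    if 'ins_results' in first:
        result_files['bbox'] = 'toy.bbox.json'
        result_files['segm'] = 'toy.segm.json'
    return result_files


# Tuple-style result per image: (bbox_results, mask_results), as in the dump above.
tuple_style = [([[0.0, 0.0, 0.0, 0.0, 0.0]], [[{'size': [462, 991]}]])]
# Dict-style result per image, as produced for the COCO panoptic config.
dict_style = [{'pan_results': [[133, 133]], 'ins_results': ([], [])}]

print(toy_format_results(tuple_style))
print(toy_format_results(dict_style))
```

With an empty dict, a later result_files['panoptic'] lookup fails with the KeyError shown in the traceback.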

With the COCO dataset, rather than my custom dataset, I get the following results:

{'pan_results': array([[133, 133, 133, ..., 133, 133, 133],
       [133, 133, 133, ..., 133, 133, 133],
       [133, 133, 133, ..., 133, 133, 133],
       ...,
       [133, 133, 133, ..., 133, 133, 133],
       [133, 133, 133, ..., 133, 133, 133],
       [133, 133, 133, ..., 133, 133, 133]], dtype=int32), 'ins_results': ([array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32),
      .....,
[], [], [], []])}

How can I get the COCO result format on my custom dataset?
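
If the goal is simply to evaluate a dataset that has no stuff classes, one option is to evaluate it as instance segmentation instead of forcing the panoptic format. This is an assumption modelled on the instance-segmentation Mask2Former configs that ship with MMDetection 2.x, and it is not confirmed in this thread. The fragment below only shows the fields that would change: the annotation file instances_train2022.json and the class name 'crack' are hypothetical, and each dataset entry would additionally need a pipeline that loads instance masks rather than panoptic maps.

```python
# Config fragment (sketch, not a complete config): treat the single-class
# dataset as instance segmentation. Paths follow the layout used above.
dataset_type = 'CocoDataset'
data_root = 'data/crack/'
classes = ('crack',)  # hypothetical class name

# Skip the panoptic fusion step at test time.
model = dict(test_cfg=dict(panoptic_on=False))

data = dict(
    _delete_=True,  # drop the inherited panoptic dataset settings
    samples_per_gpu=2,
    workers_per_gpu=2,
    # NOTE: pipeline=... must be added to each entry below; omitted here.
    train=dict(
        type=dataset_type,
        classes=classes,
        ann_file=data_root + 'annotations/instances_train2022.json',  # hypothetical file
        img_prefix=data_root + 'train2022/'),
    val=dict(
        type=dataset_type,
        classes=classes,
        ann_file=data_root + 'annotations/instances_val2022.json',
        img_prefix=data_root + 'val2022/'),
    test=dict(
        type=dataset_type,
        classes=classes,
        ann_file=data_root + 'annotations/instances_val2022.json',
        img_prefix=data_root + 'val2022/'))

# Instance metrics only; 'PQ' requires panoptic-format results.
evaluation = dict(metric=['bbox', 'segm'])
```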

david95j2 commented 1 year ago

How do I modify the config to get results like these?

{'pan_results': array([[133, 133, 133, ..., 133, 133, 133],
       [133, 133, 133, ..., 133, 133, 133],
       [133, 133, 133, ..., 133, 133, 133],
       ...,
       [133, 133, 133, ..., 133, 133, 133],
       [133, 133, 133, ..., 133, 133, 133],
       [133, 133, 133, ..., 133, 133, 133]], dtype=int32), 'ins_results': ([array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32),
      .....,
[], [], [], []])}

HarshitSheoran commented 2 months ago

Hi @david95j2

I am encountering the exact same problem. I am using the CocoPanopticDataset class and still get it. I also want the results to look like this:

{'pan_results': array([[133, 133, 133, ..., 133, 133, 133], [133, 133, 133, ..., 133, 133, 133], [133, 133, 133, ..., 133, 133, 133], ..., [133, 133, 133, ..., 133, 133, 133], [133, 133, 133, ..., 133, 133, 133], [133, 133, 133, ..., 133, 133, 133]], dtype=int32), 'ins_results': ([array([], shape=(0, 5), dtype=float32), array([], shape=(0, 5), dtype=float32),

I am using mmdet 2.28

Did you manage to solve it?
