czczup / ViT-Adapter

[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
https://arxiv.org/abs/2205.08534
Apache License 2.0

image_demo Potsdam datasets #45

Open AlanCSU opened 1 year ago

AlanCSU commented 1 year ago

First of all, thank you for your contributions to image segmentation and for open-sourcing the code. However, I ran into a few small problems when running inference on the Potsdam dataset. The command I ran is:

python segmentation/image_demo.py segmentation/configs/potsdam/mask2former_beit_adapter_large_512_80k_potsdam_ss.py D:\PyCharm_Projects\ViT-Adapter-main\segmentation\pretrained_model\beit_large_patch16_224_pt22k_ft22k.pth D:/downloads/Potsdam/Potsdam/myOutputs/images/2_10_0_0.png

When running it, the following messages are printed:

unexpected key in source state_dict: model

missing keys in source state_dict: backbone.cls_token, backbone.level_embed, backbone.patch_embed.proj.weight, backbone.patch_embed.proj.bias, backbone.blocks.0.gamma_1, backbone.blocks.0.gamma_2, ...

followed by a long list of weights starting with backbone and decode_head that are reported as not loaded.

In addition, there is also this problem:

Traceback (most recent call last):
  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\image_demo.py", line 58, in <module>
    main()
  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\image_demo.py", line 45, in main
    result = inference_segmentor(model, args.img)
  File "d:\downloads\mmsegmentation-master\mmsegmentation-master\mmseg\apis\inference.py", line 102, in inference_segmentor
    result = model(return_loss=False, rescale=True, **data)
  File "E:\conda\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\conda\lib\site-packages\mmcv\runner\fp16_utils.py", line 116, in new_func
    return old_func(*args, **kwargs)
  File "d:\downloads\mmsegmentation-master\mmsegmentation-master\mmseg\models\segmentors\base.py", line 110, in forward
    return self.forward_test(img, img_metas, **kwargs)
  File "d:\downloads\mmsegmentation-master\mmsegmentation-master\mmseg\models\segmentors\base.py", line 94, in forward_test
    return self.aug_test(imgs, img_metas, **kwargs)
  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\segmentors\encoder_decoder_mask2former.py", line 276, in aug_test
    seg_logit = self.inference(imgs[0], img_metas[0], rescale)
  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\segmentors\encoder_decoder_mask2former.py", line 240, in inference
    seg_logit = self.slide_inference(img, img_meta, rescale)
  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\segmentors\encoder_decoder_mask2former.py", line 180, in slide_inference
    crop_seg_logit = self.encode_decode(crop_img, img_meta)
  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\segmentors\encoder_decoder_mask2former.py", line 73, in encode_decode
    x = self.extract_feat(img)
  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\segmentors\encoder_decoder_mask2former.py", line 65, in extract_feat
    x = self.backbone(img)
  File "E:\conda\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\backbones\beit_adapter.py", line 115, in forward
    x, c, cls = layer(x, c, cls, self.blocks[indexes[0]:indexes[-1] + 1],
  File "E:\conda\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\backbones\adapter_modules.py", line 222, in forward
    x = blk(x, H, W)
  File "E:\conda\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\backbones\base\beit.py", line 186, in forward
    x = _inner_forward(x)
  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\backbones\base\beit.py", line 179, in _inner_forward
    x = x + self.drop_path(self.gamma_1 * self.attn(self.norm1(x), rel_pos_bias=rel_pos_bias))
  File "E:\conda\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\mmseg_custom\models\backbones\base\beit.py", line 136, in forward
    attn = attn + relative_position_bias.unsqueeze(0)
RuntimeError: The size of tensor a (257) must match the size of tensor b (1025) at non-singleton dimension 3

If you have time to give me some pointers, I would be very grateful.

czczup commented 1 year ago

Hi, the weight-loading messages are harmless.

RuntimeError: The size of tensor a (257) must match the size of tensor b (1025) at non-singleton dimension 3

This error is reported because the input image resolution is wrong. May I ask: is the image you used for inference from the dataset, or your own data? The configuration I wrote in the Potsdam config is:

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(2048, 512),
        # img_ratios=[0.5, 0.75, 1.0, 1.25, 1.5, 1.75],
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='ResizeToMultiple', size_divisor=32),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]

This requires the image's aspect ratio to be less than 2048/512 = 4 so that the short side is guaranteed to become 512 (because BEiT requires a fixed 512x512 input). So you just need to change img_scale=(2048, 512) to img_scale=(9999, 512).
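For reference, the numbers in the error message are consistent with this: with 16-pixel patches plus one class token (standard BEiT-Large/16 settings, my assumption here), a 512x512 input yields 32*32 + 1 = 1025 tokens, which is what the relative position bias table expects, while an image whose short side came out as 256 yields only 16*16 + 1 = 257 tokens. A minimal sanity check:

# Minimal sanity check of the token counts in the RuntimeError,
# assuming patch size 16 and one extra class token (BEiT-Large/16).
def num_tokens(side, patch_size=16, num_cls_tokens=1):
    return (side // patch_size) ** 2 + num_cls_tokens

print(num_tokens(512))  # 1025 -> what the 512x512 config expects
print(num_tokens(256))  # 257  -> what a too-small input produces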

AlanCSU commented 1 year ago

OK, thank you. The inference problem is solved, but now some problems have come up during training:

I found that during training, running python setup.py build install under detection\ops fails. The error is as follows:

(base) D:\PyCharm_Projects\ViT-Adapter-main\detection\ops>python setup.py build install
running build
running build_py
running build_ext
E:\conda\lib\site-packages\torch\utils\cpp_extension.py:305: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
  warnings.warn(f'Error checking compiler version for {compiler}: {error}')
building 'MultiScaleDeformableAttention' extension
Emitting ninja build file D:\PyCharm_Projects\ViT-Adapter-main\detection\ops\build\temp.win-amd64-cpython-39\Release\build.ninja...
Compiling objects...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
1.10.2.git.kitware.jobserver-1
E:\VS2019\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\link.exe /nologo /INCREMENTAL:NO /LTCG /DLL /MANIFEST:EMBED,ID=2 /MANIFESTUAC:NO /LIBPATH:E:\conda\lib\site-packages\torch\lib "/LIBPATH:C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\lib/x64" /LIBPATH:E:\conda\libs /LIBPATH:E:\conda /LIBPATH:E:\conda\PCbuild\amd64 /LIBPATH:E:\VS2019\VC\Tools\MSVC\14.29.30133\ATLMFC\lib\x64 /LIBPATH:E:\VS2019\VC\Tools\MSVC\14.29.30133\lib\x64 "/LIBPATH:C:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\lib\um\x64" "/LIBPATH:E:\Windows Kits\10\lib\10.0.19041.0\ucrt\x64" "/LIBPATH:E:\Windows Kits\10\lib\10.0.19041.0\um\x64" c10.lib torch.lib torch_cpu.lib torch_python.lib cudart.lib c10_cuda.lib torch_cuda_cu.lib torch_cuda_cpp.lib /EXPORT:PyInit_MultiScaleDeformableAttention D:\PyCharm_Projects\ViT-Adapter-main\detection\ops\build\temp.win-amd64-cpython-39\Release\PyCharm_Projects\ViT-Adapter-main\detection\ops\src\cpu\ms_deform_attn_cpu.obj D:\PyCharm_Projects\ViT-Adapter-main\detection\ops\build\temp.win-amd64-cpython-39\Release\PyCharm_Projects\ViT-Adapter-main\detection\ops\src\cuda\ms_deform_attn_cuda.obj D:\PyCharm_Projects\ViT-Adapter-main\detection\ops\build\temp.win-amd64-cpython-39\Release\PyCharm_Projects\ViT-Adapter-main\detection\ops\src\vision.obj /OUT:build\lib.win-amd64-cpython-39\MultiScaleDeformableAttention.cp39-win_amd64.pyd /IMPLIB:D:\PyCharm_Projects\ViT-Adapter-main\detection\ops\build\temp.win-amd64-cpython-39\Release\PyCharm_Projects\ViT-Adapter-main\detection\ops\src\cpu\MultiScaleDeformableAttention.cp39-win_amd64.lib
LINK : fatal error LNK1181: cannot open input file "D:\PyCharm_Projects\ViT-Adapter-main\detection\ops\build\temp.win-amd64-cpython-39\Release\PyCharm_Projects\ViT-Adapter-main\detection\ops\src\cuda\ms_deform_attn_cuda.obj"
error: command 'E:\VS2019\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64\link.exe' failed with exit code 1181

If possible, I'd like to ask for your help once more. Thank you.

czczup commented 1 year ago

Since your system appears to be Windows, compiling the deformable attention may be problematic. You can refer to this issue and make some modifications.

I think the deformable attention is not compiled successfully. You can try this: replace line 11

import MultiScaleDeformableAttention as MSDA

in ms_deform_attn_func.py with

from mmcv.ops.multi_scale_deform_attn import ext_module as MSDA

and then run python test.py again.

That is, directly use the deformable attention that mmcv has already compiled.
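Before editing the file, a quick way to confirm that the pre-built mmcv op is actually available in your environment (a small check of my own, not from the repo):

# Quick check: mmcv's pre-built deformable-attention extension should
# expose both the forward and backward kernels.
from mmcv.ops.multi_scale_deform_attn import ext_module as MSDA
print(hasattr(MSDA, 'ms_deform_attn_forward'))   # expect: True
print(hasattr(MSDA, 'ms_deform_attn_backward'))  # expect: True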

AlanCSU commented 1 year ago

> Since your system appears to be Windows, compiling the deformable attention may be problematic. You can refer to this issue and make some modifications.
>
> I think the deformable attention is not compiled successfully. You can try this: replace line 11
>
> import MultiScaleDeformableAttention as MSDA
>
> in ms_deform_attn_func.py with
>
> from mmcv.ops.multi_scale_deform_attn import ext_module as MSDA
>
> and then run python test.py again.
>
> That is, directly use the deformable attention that mmcv has already compiled.

My system is Windows. After following the approach you provided, I ran into the same problem as the person who originally proposed it. I am wondering whether the MSDA.ms_deform_attn_backward op failed to compile. The error is as follows:

  File "D:\PyCharm_Projects\ViT-Adapter-main\segmentation\ops\functions\ms_deform_attn_func.py", line 45, in backward
    grad_value, grad_sampling_loc, grad_attn_weight = \
        MSDA.ms_deform_attn_backward(
            value, value_spatial_shapes, value_level_start_index,
            sampling_locations, attention_weights, grad_output, ctx.im2col_step)
TypeError: ms_deform_attn_backward(): incompatible function arguments. The following argument types are supported:
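For what it's worth, this TypeError matches a real difference in calling conventions: in mmcv 1.x, ext_module.ms_deform_attn_backward takes the gradient tensors as pre-allocated output arguments and returns nothing, whereas the standalone MSDA extension returns them. A sketch of the backward call adapted the way mmcv's own MultiScaleDeformableAttnFunction does it (untested against this repo's ms_deform_attn_func.py):

# Sketch: the MSDA.ms_deform_attn_backward call in backward() rewritten
# for mmcv's output-argument convention (mirrors mmcv.ops.multi_scale_deform_attn).
grad_value = torch.zeros_like(value)
grad_sampling_loc = torch.zeros_like(sampling_locations)
grad_attn_weight = torch.zeros_like(attention_weights)
MSDA.ms_deform_attn_backward(
    value, value_spatial_shapes, value_level_start_index,
    sampling_locations, attention_weights, grad_output.contiguous(),
    grad_value, grad_sampling_loc, grad_attn_weight,
    im2col_step=ctx.im2col_step)
# the rest of backward() (returning the gradients) stays unchanged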

wobukun123 commented 1 year ago

Hi, when reproducing the code I ran into the same problems as you:

import MultiScaleDeformableAttention as MSDA reports: packages/MultiScaleDeformableAttention-1.0-py3.7-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor7optionsEv

from mmcv.ops.multi_scale_deform_attn import ext_module as MSDA reports: ms_deform_attn_backward(): incompatible function arguments. The following argument types are supported

Have you solved them by now?

37s commented 8 months ago

> Hi, when reproducing the code I ran into the same problems as you:
>
> import MultiScaleDeformableAttention as MSDA reports: packages/MultiScaleDeformableAttention-1.0-py3.7-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor7optionsEv
>
> from mmcv.ops.multi_scale_deform_attn import ext_module as MSDA reports: ms_deform_attn_backward(): incompatible function arguments. The following argument types are supported
>
> Have you solved them by now?

Have you solved this by now?