czczup / ViT-Adapter

[ICLR 2023 Spotlight] Vision Transformer Adapter for Dense Predictions
https://arxiv.org/abs/2205.08534
Apache License 2.0

ImportError #41

Open ccqlx opened 1 year ago

ccqlx commented 1 year ago

Traceback (most recent call last):
  File "/home/ccq/Data/liu/ViT-Adapter-main/segmentation/train.py", line 11, in <module>
    import mmseg_custom  # noqa: F401,F403
  File "/home/ccq/Data/liu/ViT-Adapter-main/segmentation/mmseg_custom/__init__.py", line 3, in <module>
    from .models import *  # noqa: F401,F403
  File "/home/ccq/Data/liu/ViT-Adapter-main/segmentation/mmseg_custom/models/__init__.py", line 2, in <module>
    from .backbones import *  # noqa: F401,F403
  File "/home/ccq/Data/liu/ViT-Adapter-main/segmentation/mmseg_custom/models/backbones/__init__.py", line 2, in <module>
    from .beit_adapter import BEiTAdapter
  File "/home/ccq/Data/liu/ViT-Adapter-main/segmentation/mmseg_custom/models/backbones/beit_adapter.py", line 9, in <module>
    from detection.ops.modules import MSDeformAttn
  File "/home/ccq/Data/liu/ViT-Adapter-main/detection/ops/modules/__init__.py", line 9, in <module>
    from .ms_deform_attn import MSDeformAttn
  File "/home/ccq/Data/liu/ViT-Adapter-main/detection/ops/modules/ms_deform_attn.py", line 19, in <module>
    from ..functions import MSDeformAttnFunction
  File "/home/ccq/Data/liu/ViT-Adapter-main/detection/ops/functions/__init__.py", line 9, in <module>
    from .ms_deform_attn_func import MSDeformAttnFunction
  File "/home/ccq/Data/liu/ViT-Adapter-main/detection/ops/functions/ms_deform_attn_func.py", line 11, in <module>
    import MultiScaleDeformableAttention as MSDA
ImportError: /home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/MultiScaleDeformableAttention-1.0-py3.8-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at10TensorBase8data_ptrIdEEPT_v

Hello, what is causing this error? I would appreciate a solution, thanks.

czczup commented 1 year ago

Hi, which CUDA version are you using? You can run nvcc -V to print it.
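
For reference, the undefined symbol _ZNK2at10TensorBase8data_ptrIdEEPT_v demangles to at::TensorBase::data_ptr<double>() const, which usually means the compiled MultiScaleDeformableAttention extension does not match the PyTorch build it is being loaded into. Besides nvcc -V, it can help to compare the PyTorch build itself; a generic check, not from the repository:

import torch

# The compiled op must be built against this exact PyTorch/CUDA combination;
# a mismatch between build time and run time shows up as "undefined symbol".
print(torch.__version__)          # PyTorch version string
print(torch.version.cuda)         # CUDA version PyTorch was built with
print(torch.cuda.is_available())  # whether the GPU runtime is usable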

ccqlx commented 1 year ago

@czczup The CUDA version is 11.1:

(ViT-Adapter-main) ccq@ccq:~/Data/ViT-Adapter-main$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Tue_Sep_15_19:10:02_PDT_2020
Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0

czczup commented 1 year ago

@ccqlx You could run test.py to check whether deformable attention was installed successfully.

Run it like this:

cd detection/ops/
python test.py
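
For context, test.py compares the compiled CUDA kernel against a pure-PyTorch reference and then runs torch.autograd.gradcheck on it (as the output later in this thread shows), so a clean run exercises both the forward and the backward of the op. A toy illustration of that kind of numerical gradient check, unrelated to the repository's code:

import torch
from torch.autograd import gradcheck

# gradcheck compares analytical gradients with numerically estimated ones in
# double precision; test.py applies the same idea to MSDeformAttnFunction.
x = torch.rand(3, 5, dtype=torch.float64, requires_grad=True)
ok = gradcheck(torch.nn.functional.silu, (x,), eps=1e-6, atol=1e-4)
print('gradient check passed:', ok)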

ccqlx commented 1 year ago

@czczup I gave it a try and it showed the following:

(ViT-Adapter-main) ccq@ccq:~/Data/ViT-Adapter-main/detection/ops$ python test.py
Traceback (most recent call last):
  File "test.py", line 12, in <module>
    from functions.ms_deform_attn_func import (MSDeformAttnFunction,
  File "/home/ccq/Data/ViT-Adapter-main/detection/ops/functions/__init__.py", line 9, in <module>
    from .ms_deform_attn_func import MSDeformAttnFunction
  File "/home/ccq/Data/ViT-Adapter-main/detection/ops/functions/ms_deform_attn_func.py", line 11, in <module>
    import MultiScaleDeformableAttention as MSDA
ImportError: /home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/MultiScaleDeformableAttention-1.0-py3.8-linux-x86_64.egg/MultiScaleDeformableAttention.cpython-38-x86_64-linux-gnu.so: undefined symbol: _ZNK2at10TensorBase8data_ptrIdEEPT_v

czczup commented 1 year ago

I think deformable attention was not compiled successfully. You can try this: replace line 11

import MultiScaleDeformableAttention as MSDA

in the ms_deform_attn_func.py with

from mmcv.ops.multi_scale_deform_attn import ext_module as MSDA

and then run python test.py again.
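
One way to apply this change while still preferring the locally compiled op when it is available is a guarded import; the try/except below is a sketch, not code from the repository:

try:
    # Locally compiled CUDA extension built from detection/ops.
    import MultiScaleDeformableAttention as MSDA
except ImportError:
    # Fall back to the pre-compiled kernel shipped with mmcv.
    from mmcv.ops.multi_scale_deform_attn import ext_module as MSDA

Note that mmcv's kernel accepts the same forward arguments (the forward checks below pass), but its backward has a different calling convention, which becomes relevant later in this thread.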

ccqlx commented 1 year ago

It showed the following:

(ViT-Adapter-main) ccq@ccq:~/Data/ViT-Adapter-main/detection/ops$ python test.py
* True check_forward_equal_with_pytorch_double: max_abs_err 8.67e-19 max_rel_err 2.35e-16
* True check_forward_equal_with_pytorch_float: max_abs_err 4.66e-10 max_rel_err 1.13e-07
Traceback (most recent call last):
  File "test.py", line 109, in <module>
    check_gradient_numerical(channels, True, True, True)
  File "test.py", line 96, in check_gradient_numerical
    gradok = gradcheck(
  File "/home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/torch/autograd/gradcheck.py", line 1245, in gradcheck
    return _gradcheck_helper(args)
  File "/home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/torch/autograd/gradcheck.py", line 1258, in _gradcheck_helper
    _gradcheck_real_imag(gradcheck_fn, func, func_out, tupled_inputs, outputs, eps,
  File "/home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/torch/autograd/gradcheck.py", line 930, in _gradcheck_real_imag
    gradcheck_fn(func, func_out, tupled_inputs, outputs, eps,
  File "/home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/torch/autograd/gradcheck.py", line 974, in _slow_gradcheck
    analytical = _check_analytical_jacobian_attributes(tupled_inputs, o, nondet_tol, check_grad_dtypes)
  File "/home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/torch/autograd/gradcheck.py", line 516, in _check_analytical_jacobian_attributes
    vjps1 = _compute_analytical_jacobian_rows(vjp_fn, output.clone())
  File "/home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/torch/autograd/gradcheck.py", line 608, in _compute_analytical_jacobian_rows
    grad_inputs = vjp_fn(grad_out_base)
  File "/home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/torch/autograd/gradcheck.py", line 509, in vjp_fn
    return torch.autograd.grad(output, diff_input_list, grad_output,
  File "/home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/torch/autograd/__init__.py", line 226, in grad
    return Variable._execution_engine.run_backward(
  File "/home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/torch/autograd/function.py", line 87, in apply
    return self._forward_cls.backward(self, args)  # type: ignore[attr-defined]
  File "/home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/torch/autograd/function.py", line 204, in wrapper
    outputs = fn(ctx, args)
  File "/home/ccq/anaconda3/envs/ViT-Adapter-main/lib/python3.8/site-packages/torch/cuda/amp/autocast_mode.py", line 236, in decorate_bwd
    return bwd(args, **kwargs)
  File "/home/ccq/Data/ViT-Adapter-main/detection/ops/functions/ms_deform_attn_func.py", line 43, in backward
    MSDA.ms_deform_attn_backward(
TypeError: ms_deform_attn_backward(): incompatible function arguments. The following argument types are supported:
    1. (value: at::Tensor, value_spatial_shapes: at::Tensor, value_level_start_index: at::Tensor, sampling_locations: at::Tensor, attention_weights: at::Tensor, grad_output: at::Tensor, grad_value: at::Tensor, grad_sampling_loc: at::Tensor, grad_attn_weight: at::Tensor, im2col_step: int) -> None

Invoked with: [long tensor argument dump omitted]

czczup commented 1 year ago

I think now you can try to run inference with a checkpoint

wobukun123 commented 1 year ago

Hi, I ran into the same problem. Have you solved it?

XinzheGeng commented 10 months ago

I think now you can try to run inference with a checkpoint

@czczup Hello, I modified the code as described and ran python test.py, but I ran into the same problem:

TypeError: ms_deform_attn_backward(): incompatible function arguments. The following argument types are supported:

  1. (value: at::Tensor, value_spatial_shapes: at::Tensor, value_level_start_index: at::Tensor, sampling_locations: at::Tensor, attention_weights: at::Tensor, grad_output: at::Tensor, grad_value: at::Tensor, grad_sampling_loc: at::Tensor, grad_attn_weight: at::Tensor, im2col_step: int) -> None

In this situation I can load pre-trained weights and run inference, but the error is raised during training, when backpropagation is needed. How can this be solved?
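
For what it's worth, the signature listed in this error message is the one exposed by mmcv's pre-compiled kernel: it expects pre-allocated gradient tensors to be passed in and filled in place, with im2col_step as the last argument, whereas the backward in ms_deform_attn_func.py (which follows Deformable-DETR's original interface) passes fewer arguments and expects the gradients to be returned. That is why inference works after the import swap but backpropagation fails. Below is a sketch of a backward adapted to mmcv's convention, modelled on mmcv's own MultiScaleDeformableAttnFunction.backward; the class name is made up, and the forward is assumed to be the repository's existing one, which saves the tensors and im2col_step on ctx:

import torch
from torch.autograd.function import once_differentiable
from mmcv.ops.multi_scale_deform_attn import ext_module as MSDA


class MSDeformAttnFunctionSketch(torch.autograd.Function):
    # forward is assumed to be identical to the repository's
    # MSDeformAttnFunction.forward (it calls the kernel's forward and saves
    # value, spatial shapes, level start index, sampling locations and
    # attention weights with ctx.save_for_backward).

    @staticmethod
    @once_differentiable
    def backward(ctx, grad_output):
        value, value_spatial_shapes, value_level_start_index, \
            sampling_locations, attention_weights = ctx.saved_tensors
        # mmcv's kernel fills pre-allocated gradient tensors in place
        # instead of returning them.
        grad_value = torch.zeros_like(value)
        grad_sampling_loc = torch.zeros_like(sampling_locations)
        grad_attn_weight = torch.zeros_like(attention_weights)
        MSDA.ms_deform_attn_backward(
            value, value_spatial_shapes, value_level_start_index,
            sampling_locations, attention_weights,
            grad_output.contiguous(),
            grad_value, grad_sampling_loc, grad_attn_weight,
            im2col_step=ctx.im2col_step)
        # One gradient per forward input: (value, value_spatial_shapes,
        # value_level_start_index, sampling_locations, attention_weights,
        # im2col_step).
        return grad_value, None, None, grad_sampling_loc, grad_attn_weight, None

If you go this way, it is worth re-running detection/ops/test.py afterwards so the gradient check confirms that training will receive correct gradients.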

xunzha commented 7 months ago

(Quoting XinzheGeng's comment above about the ms_deform_attn_backward() TypeError.)

Have you solved it? I have the same problem.

XinzheGeng commented 7 months ago

(Quoting the exchange above.)

Solved: I reinstalled the container and the virtual environment following the README. The CUDA versions have to match exactly for the installation to succeed.

ILoveAkizukiKanna commented 3 months ago

(Quoting the exchange above.)

Hello, after you reinstalled the environment, did you manage to compile the deformable attention module on Windows, or did you use the pre-compiled deformable attention from mmcv?

XinzheGeng commented 3 months ago

(Quoting the exchange above.)

I compiled it successfully on Linux; I haven't tried Windows.

Githia commented 3 weeks ago

(Quoting ccqlx's python test.py output above, which ends in the same ms_deform_attn_backward() TypeError.)

Hello, I ran into the same problem while reproducing the ViT-Adapter model, and it has troubled me for days. Have you solved it? Could you tell me how?

Githia commented 3 weeks ago

(Quoting the exchange above.)

Hello, has anyone compiled the deformable attention module on Windows? Does the compilation fail there? I am running into the same problem.