MzeroMiko / VMamba

VMamba: Visual State Space Models, code is based on mamba
MIT License

Got the following issue with selective scan (cd kernels/selective_scan && pip install .) #297

Closed. chira98 closed this issue 2 months ago

chira98 commented 2 months ago
Processing /userhomes/17/CD/VMamba/kernels/selective_scan
  Preparing metadata (setup.py) ... done
Requirement already satisfied: torch in /userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages (from selective_scan==0.0.2) (2.2.1)
Requirement already satisfied: packaging in /userhomes/7/.local/lib/python3.10/site-packages (from selective_scan==0.0.2) (24.1)
Requirement already satisfied: ninja in /userhomes/17/.local/lib/python3.10/site-packages (from selective_scan==0.0.2) (1.11.1.1)
Requirement already satisfied: einops in /userhomes/17/.local/lib/python3.10/site-packages (from selective_scan==0.0.2) (0.8.0)
Requirement already satisfied: filelock in /userhomes/17/.local/lib/python3.10/site-packages (from torch->selective_scan==0.0.2) (3.16.0)
Requirement already satisfied: typing-extensions>=4.8.0 in /userhomes/17/.local/lib/python3.10/site-packages (from torch->selective_scan==0.0.2) (4.12.2)
Requirement already satisfied: sympy in /userhomes/17/.local/lib/python3.10/site-packages (from torch->selective_scan==0.0.2) (1.13.2)
Requirement already satisfied: networkx in /userhomes/17/.local/lib/python3.10/site-packages (from torch->selective_scan==0.0.2) (3.3)
Requirement already satisfied: jinja2 in /userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages (from torch->selective_scan==0.0.2) (3.1.4)
Requirement already satisfied: fsspec in /userhomes/17/.local/lib/python3.10/site-packages (from torch->selective_scan==0.0.2) (2024.9.0)
Requirement already satisfied: MarkupSafe>=2.0 in /userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages (from jinja2->torch->selective_scan==0.0.2) (2.1.3)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /userhomes/17/.local/lib/python3.10/site-packages (from sympy->torch->selective_scan==0.0.2) (1.3.0)
Building wheels for collected packages: selective_scan
  Building wheel for selective_scan (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [56 lines of output]

      A module that was compiled using NumPy 1.x cannot be run in
      NumPy 2.1.1 as it may crash. To support both 1.x and 2.x
      versions of NumPy, modules must be compiled with NumPy 2.0.
      Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

      If you are a user of the module, the easiest solution will be to
      downgrade to 'numpy<2' or try to upgrade the affected module.
      We expect that some modules will need time to support NumPy 2.

      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/userhomes/17/CD/VMamba/kernels/selective_scan/setup.py", line 17, in <module>
          import torch
        File "/userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages/torch/__init__.py", line 1477, in <module>
          from .functional import *  # noqa: F403
        File "/userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages/torch/functional.py", line 9, in <module>
          import torch.nn.functional as F
        File "/userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages/torch/nn/__init__.py", line 1, in <module>
          from .modules import *  # noqa: F403
        File "/userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages/torch/nn/modules/__init__.py", line 35, in <module>
          from .transformer import TransformerEncoder, TransformerDecoder, \
        File "/userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages/torch/nn/modules/transformer.py", line 20, in <module>
          device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
      /userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: _ARRAY_API not found (Triggered internally at /opt/conda/conda-bld/pytorch_1708025847130/work/torch/csrc/utils/tensor_numpy.cpp:84.)
        device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),

      torch.__version__  = 2.2.1

      CUDA_HOME = /userhomes/17/anaconda3/envs/vmamba

      CUDA version:  12.1
      running bdist_wheel
      /userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py:500: UserWarning: Attempted to use ninja as the BuildExtension backend but we could not find ninja.. Falling back to using the slow distutils backend.
        warnings.warn(msg.format('we could not find ninja.'))
      running build
      running build_ext
      /userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages/torch/utils/cpp_extension.py:425: UserWarning: There are no g++ version bounds defined for CUDA version 12.1
        warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
      building 'selective_scan_cuda_oflex' extension
      creating build/temp.linux-x86_64-cpython-310
      creating build/temp.linux-x86_64-cpython-310/csrc
      creating build/temp.linux-x86_64-cpython-310/csrc/selective_scan
      creating build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex
      /userhomes/17/anaconda3/envs/vmamba/bin/nvcc -I/userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan -I/userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages/torch/include -I/userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages/torch/include/TH -I/userhomes/17/anaconda3/envs/vmamba/lib/python3.10/site-packages/torch/include/THC -I/userhomes/17/anaconda3/envs/vmamba/include -I/userhomes/17/anaconda3/envs/vmamba/include/python3.10 -c csrc/selective_scan/cusoflex/selective_scan_core_bwd.cu -o build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_core_bwd.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options '-fPIC' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --ptxas-options=-v -lineinfo -gencode arch=compute_70,code=sm_70 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=selective_scan_cuda_oflex -D_GLIBCXX_USE_CXX11_ABI=0
      /usr/include/cub/detail/device_synchronize.cuh(53): error: identifier "__cudaDeviceSynchronizeDeprecationAvoidance" is undefined
            result = __cudaDeviceSynchronizeDeprecationAvoidance();
                     ^

      1 error detected in the compilation of "csrc/selective_scan/cusoflex/selective_scan_core_bwd.cu".
      error: command '/userhomes/17/anaconda3/envs/vmamba/bin/nvcc' failed with exit code 255
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for selective_scan
  Running setup.py clean for selective_scan
Failed to build selective_scan
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (selective_scan)

I found similar issues and proposed solutions, but none of them worked for me. I am using the following versions: CUDA 12.1, Python 3.10.13, PyTorch 2.2.1, torchvision 0.17.1, torchaudio 2.2.1.

Please help me resolve this issue. Thank you!

MzeroMiko commented 2 months ago

Can you downgrade your NumPy to 1.x?
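
The downgrade suggested here matches the hint printed in the build log ("downgrade to 'numpy<2'"), which is usually done with `pip install "numpy<2"` in the active environment. As a rough sketch, the check below reports whether an installed NumPy version is still in the 1.x series. The helper names `numpy_major` and `is_numpy_1x` are made up for illustration and are not part of VMamba, NumPy, or PyTorch.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def numpy_major(version_string: str) -> int:
    """Return the major component of a version string such as '2.1.1'."""
    match = re.match(r"\s*(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_string!r}")
    return int(match.group(1))


def is_numpy_1x(version_string: str) -> bool:
    """True when the version belongs to the NumPy 1.x series."""
    return numpy_major(version_string) == 1


if __name__ == "__main__":
    try:
        installed = version("numpy")  # reads package metadata, does not import numpy
    except PackageNotFoundError:
        print("numpy is not installed in this environment")
    else:
        verdict = "1.x, OK" if is_numpy_1x(installed) else 'not 1.x, try: pip install "numpy<2"'
        print(f"numpy {installed}: {verdict}")
```

Reading the version from package metadata avoids importing NumPy itself, so the check also works in an environment where the import would trigger the warning shown in the log.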

chira98 commented 2 months ago

@MzeroMiko Thank you for the response. I still get the following error:

Processing /userhomes/17/CD/VMamba/kernels/selective_scan
  Preparing metadata (setup.py) ... done
Requirement already satisfied: torch in /userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages (from selective_scan==0.0.2) (2.2.1)
Requirement already satisfied: packaging in /userhomes/17/.local/lib/python3.10/site-packages (from selective_scan==0.0.2) (24.1)
Requirement already satisfied: ninja in /userhomes/17/.local/lib/python3.10/site-packages (from selective_scan==0.0.2) (1.11.1.1)
Requirement already satisfied: einops in /userhomes/17/.local/lib/python3.10/site-packages (from selective_scan==0.0.2) (0.8.0)
Requirement already satisfied: filelock in /userhomes/17/.local/lib/python3.10/site-packages (from torch->selective_scan==0.0.2) (3.16.0)
Requirement already satisfied: typing-extensions>=4.8.0 in /userhomes/17/.local/lib/python3.10/site-packages (from torch->selective_scan==0.0.2) (4.12.2)
Requirement already satisfied: sympy in /userhomes/17/.local/lib/python3.10/site-packages (from torch->selective_scan==0.0.2) (1.13.2)
Requirement already satisfied: networkx in /userhomes/17/.local/lib/python3.10/site-packages (from torch->selective_scan==0.0.2) (3.3)
Requirement already satisfied: jinja2 in /userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages (from torch->selective_scan==0.0.2) (3.1.4)
Requirement already satisfied: fsspec in /userhomes/17/.local/lib/python3.10/site-packages (from torch->selective_scan==0.0.2) (2024.9.0)
Requirement already satisfied: MarkupSafe>=2.0 in /userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages (from jinja2->torch->selective_scan==0.0.2) (2.1.3)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in /userhomes/17/.local/lib/python3.10/site-packages (from sympy->torch->selective_scan==0.0.2) (1.3.0)
Building wheels for collected packages: selective_scan
  Building wheel for selective_scan (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [106 lines of output]

      torch.__version__  = 2.2.1

      CUDA_HOME = /userhomes/17/anaconda3/envs/vmamba_env

      CUDA version:  12.1
      running bdist_wheel
      running build
      running build_ext
      /userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/utils/cpp_extension.py:425: UserWarning: There are no g++ version bounds defined for CUDA version 12.1
        warnings.warn(f'There are no {compiler_name} version bounds defined for CUDA version {cuda_str_version}')
      building 'selective_scan_cuda_oflex' extension
      creating /userhomes/17/CD/VMamba/kernels/selective_scan/build
      creating /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310
      creating /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc
      creating /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan
      creating /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex
      Emitting ninja build file /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/build.ninja...
      Compiling objects...
      Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)
      [1/3] /userhomes/17/anaconda3/envs/vmamba_env/bin/nvcc --generate-dependencies-with-compile --dependency-output /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_core_fwd.o.d -I/userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/TH -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/THC -I/userhomes/17/anaconda3/envs/vmamba_env/include -I/userhomes/17/anaconda3/envs/vmamba_env/include/python3.10 -c -c /userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan/cusoflex/selective_scan_core_fwd.cu -o /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_core_fwd.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --ptxas-options=-v -lineinfo -gencode arch=compute_70,code=sm_70 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=selective_scan_cuda_oflex -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin gcc
      FAILED: /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_core_fwd.o
      /userhomes/17/anaconda3/envs/vmamba_env/bin/nvcc --generate-dependencies-with-compile --dependency-output /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_core_fwd.o.d -I/userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/TH -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/THC -I/userhomes/17/anaconda3/envs/vmamba_env/include -I/userhomes/17/anaconda3/envs/vmamba_env/include/python3.10 -c -c /userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan/cusoflex/selective_scan_core_fwd.cu -o /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_core_fwd.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --ptxas-options=-v -lineinfo -gencode arch=compute_70,code=sm_70 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=selective_scan_cuda_oflex -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin gcc
      /usr/include/cub/detail/device_synchronize.cuh(53): error: identifier "__cudaDeviceSynchronizeDeprecationAvoidance" is undefined
            result = __cudaDeviceSynchronizeDeprecationAvoidance();
                     ^

      1 error detected in the compilation of "/userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan/cusoflex/selective_scan_core_fwd.cu".
      [2/3] /userhomes/17/anaconda3/envs/vmamba_env/bin/nvcc --generate-dependencies-with-compile --dependency-output /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_core_bwd.o.d -I/userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/TH -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/THC -I/userhomes/17/anaconda3/envs/vmamba_env/include -I/userhomes/17/anaconda3/envs/vmamba_env/include/python3.10 -c -c /userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan/cusoflex/selective_scan_core_bwd.cu -o /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_core_bwd.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --ptxas-options=-v -lineinfo -gencode arch=compute_70,code=sm_70 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=selective_scan_cuda_oflex -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin gcc
      FAILED: /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_core_bwd.o
      /userhomes/17/anaconda3/envs/vmamba_env/bin/nvcc --generate-dependencies-with-compile --dependency-output /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_core_bwd.o.d -I/userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/TH -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/THC -I/userhomes/17/anaconda3/envs/vmamba_env/include -I/userhomes/17/anaconda3/envs/vmamba_env/include/python3.10 -c -c /userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan/cusoflex/selective_scan_core_bwd.cu -o /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_core_bwd.o -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --ptxas-options=-v -lineinfo -gencode arch=compute_70,code=sm_70 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_90,code=sm_90 --threads 4 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=selective_scan_cuda_oflex -D_GLIBCXX_USE_CXX11_ABI=0 -ccbin gcc
      /usr/include/cub/detail/device_synchronize.cuh(53): error: identifier "__cudaDeviceSynchronizeDeprecationAvoidance" is undefined
            result = __cudaDeviceSynchronizeDeprecationAvoidance();
                     ^

      1 error detected in the compilation of "/userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan/cusoflex/selective_scan_core_bwd.cu".
      [3/3] g++ -MMD -MF /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_oflex.o.d -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /userhomes/17/anaconda3/envs/vmamba_env/include -fPIC -O2 -isystem /userhomes/17/anaconda3/envs/vmamba_env/include -fPIC -I/userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/TH -I/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/include/THC -I/userhomes/17/anaconda3/envs/vmamba_env/include -I/userhomes/17/anaconda3/envs/vmamba_env/include/python3.10 -c -c /userhomes/17/CD/VMamba/kernels/selective_scan/csrc/selective_scan/cusoflex/selective_scan_oflex.cpp -o /userhomes/17/CD/VMamba/kernels/selective_scan/build/temp.linux-x86_64-cpython-310/csrc/selective_scan/cusoflex/selective_scan_oflex.o -O3 -std=c++17 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=selective_scan_cuda_oflex -D_GLIBCXX_USE_CXX11_ABI=0
      ninja: build stopped: subcommand failed.
      Traceback (most recent call last):
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2096, in _run_ninja_build
          subprocess.run(
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/subprocess.py", line 526, in run
          raise CalledProcessError(retcode, process.args,
      subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

      The above exception was the direct cause of the following exception:

      Traceback (most recent call last):
        File "<string>", line 2, in <module>
        File "<pip-setuptools-caller>", line 34, in <module>
        File "/userhomes/17/CD/VMamba/kernels/selective_scan/setup.py", line 143, in <module>
          setup(
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/__init__.py", line 117, in setup
          return distutils.core.setup(**attrs)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 184, in setup
          return run_commands(dist)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
          dist.run_commands()
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 954, in run_commands
          self.run_command(cmd)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/dist.py", line 950, in run_command
          super().run_command(command)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
          cmd_obj.run()
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/wheel/_bdist_wheel.py", line 378, in run
          self.run_command("build")
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
          self.distribution.run_command(command)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/dist.py", line 950, in run_command
          super().run_command(command)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
          cmd_obj.run()
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 135, in run
          self.run_command(cmd_name)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
          self.distribution.run_command(command)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/dist.py", line 950, in run_command
          super().run_command(command)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 973, in run_command
          cmd_obj.run()
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 98, in run
          _build_ext.run(self)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 359, in run
          self.build_extensions()
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 871, in build_extensions
          build_ext.build_extensions(self)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 476, in build_extensions
          self._build_extensions_serial()
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 502, in _build_extensions_serial
          self.build_extension(ext)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/command/build_ext.py", line 263, in build_extension
          _build_ext.build_extension(self, ext)
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/setuptools/_distutils/command/build_ext.py", line 557, in build_extension
          objects = self.compiler.compile(
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 684, in unix_wrap_ninja_compile
          _write_ninja_file_and_compile_objects(
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1774, in _write_ninja_file_and_compile_objects
          _run_ninja_build(
        File "/userhomes/17/anaconda3/envs/vmamba_env/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 2112, in _run_ninja_build
          raise RuntimeError(message) from e
      RuntimeError: Error compiling objects for extension
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for selective_scan
  Running setup.py clean for selective_scan
Failed to build selective_scan
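
Both logs report CUDA_HOME inside the conda environment, while the failing header is /usr/include/cub/detail/device_synchronize.cuh, which lies outside it. That may mean the system-wide CUB headers are being picked up in place of the ones shipped with the CUDA 12.1 toolkit. This is an inference from the log, not something the thread confirms. A minimal sketch for spotting such paths in a saved build log follows; the function name `headers_outside_cuda_home` and the regular expression are assumptions written for this illustration.

```python
import re
from pathlib import PurePosixPath
from typing import List

# nvcc reports errors as "<file>(<line>): error: <message>"
NVCC_ERROR = re.compile(r"^\s*(?P<path>/[^\s(]+)\((?P<line>\d+)\): error: (?P<msg>.*)$")


def headers_outside_cuda_home(log_text: str, cuda_home: str) -> List[str]:
    """List files named in nvcc error lines that are not located under cuda_home."""
    home = PurePosixPath(cuda_home)
    outside: List[str] = []
    for line in log_text.splitlines():
        match = NVCC_ERROR.match(line)
        if match is None:
            continue
        path = PurePosixPath(match.group("path"))
        # A file is "inside" CUDA_HOME when CUDA_HOME is one of its parent directories.
        if home not in path.parents and str(path) not in outside:
            outside.append(str(path))
    return outside


if __name__ == "__main__":
    sample = (
        "      /usr/include/cub/detail/device_synchronize.cuh(53): error: "
        'identifier "__cudaDeviceSynchronizeDeprecationAvoidance" is undefined\n'
    )
    print(headers_outside_cuda_home(sample, "/userhomes/17/anaconda3/envs/vmamba_env"))
```

If a path under /usr/include shows up, one thing to check is whether a system CUDA or CUB package of a different version is installed alongside the conda toolkit.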