OpenGVLab / VideoMamba

VideoMamba: State Space Model for Efficient Video Understanding
https://arxiv.org/abs/2403.06977
Apache License 2.0

Error compiling objects for extension pip install -e causal-conv1d #43

Closed cxylotus closed 2 months ago

cxylotus commented 2 months ago

I ran into some problems while compiling, I think. I installed CUDA in conda. [image] [image]

More info about the error (note that I do have g++ 11):

[image] [image] [image]

cxylotus commented 2 months ago

Sorry for my poor English, and thanks to the authors for the great work! I'm a newbie.

Andy1621 commented 2 months ago

Hi! You can also install it with pip install causal-conv1d, since I did not make any changes to it.

Andy1621 commented 2 months ago

Also, feel free to ask questions in Chinese~

cxylotus commented 2 months ago

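Thanks for the reply. I can install causal-conv1d successfully with pip, but compiling mamba still fails with a similar error. (This is the first time an author has replied to me online so quickly; I'm really touched QAQ)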

Andy1621 commented 2 months ago

What is your environment? Here is mine:

Python: 3.10.4 (main, Mar 31 2022, 08:41:55) [GCC 7.5.0]
CUDA available: False
MUSA available: False
numpy_random_seed: 2147483648
GCC: gcc (GCC) 7.3.0
PyTorch: 2.1.1+cu118
PyTorch compiling details: PyTorch built with:
  - GCC 9.3
  - C++ Version: 201703
  - Intel(R) oneAPI Math Kernel Library Version 2022.2-Product Build 20220804 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v3.1.1 (Git Hash 64f6bcbcbab628e96f33a62c3e975f8535a7bde4)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - LAPACK is enabled (usually provided by MKL)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.8, CUDNN_VERSION=8.7.0, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -D_GLIBCXX_USE_CXX11_ABI=0 -fabi-version=11 -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOROCTRACER -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=old-style-cast -Wno-invalid-partial-specialization -Wno-unused-private-field -Wno-aligned-allocation-unavailable -Wno-missing-braces -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_DISABLE_GPU_ASSERTS=ON, TORCH_VERSION=2.1.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=1, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.16.1+cu118
LMDeploy: 0.3.0+
transformers: 4.36.1
gradio: 4.21.0
fastapi: 0.110.0
pydantic: 2.6.0
triton: 2.1.0

cxylotus commented 2 months ago

Thanks. After fiddling with it for a long time, I found that it was indeed a CUDA compilation problem, and it is solved now (my bad, I'm a newbie). But now, when I run mamba/tests/ops/test_selective_scan.py, I get the following error(?):

    raise GradcheckError(
torch.autograd.gradcheck.GradcheckError: Jacobian mismatch for output 0 with respect to input 0,
numerical:tensor([[ 0.0000, -610.3516, -488.2812, ..., 0.0000, 0.0000, 0.0000],
        [ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
        [ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
        ...,
        [ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
        [ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
        [ 0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000]], device='cuda:0')
analytical:tensor([[ 3.8972e+01, -3.3363e+02, -1.5434e+02, ..., -4.3753e-05, 2.9214e-05, 8.9537e-06],
        [ 0.0000e+00, 0.0000e+00, 0.0000e+00, ..., -8.8612e-07, -1.8853e-06, 4.7473e-06],
        [ 0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 1.5964e-08, -9.7847e-09, -6.3036e-09],
        ...,
        [ 0.0000e+00, 0.0000e+00, 0.0000e+00, ..., 0.0000e+00, 0.0000e+00, 0.0000e+00],
        [-0.0000e+00, -0.0000e+00, -0.0000e+00, ..., -0.0000e+00, -0.0000e+00, -0.0000e+00],
        [ 0.0000e+00, 0.0000e+00, 0.0000e+00, ..., -4.1011e+00, -1.9691e+00, 3.1464e+00]], device='cuda:0')
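For readers unfamiliar with this error: torch.autograd.gradcheck compares a numerical Jacobian, obtained by finite differences, with the analytical Jacobian from backward, and raises GradcheckError when they disagree. The PyTorch documentation notes that the default settings are designed for double-precision inputs. The thread does not establish why this particular test failed, so the following is only a minimal standard-library sketch of the comparison itself, not a diagnosis. The helper names (f32, central_difference, square) are hypothetical, and float32 is emulated by rounding through struct.

```python
import struct


def f32(value):
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def central_difference(func, x, eps, rnd=float):
    """Estimate d func / dx at x as (f(x + eps) - f(x - eps)) / (2 * eps).

    `rnd` is applied after each arithmetic step to emulate a working
    precision; the default `float` leaves values in float64.
    """
    x_plus = rnd(x + eps)
    x_minus = rnd(x - eps)
    delta = rnd(rnd(func(x_plus)) - rnd(func(x_minus)))
    return rnd(delta / (2 * eps))


def square(x):
    return x * x


x0 = 3.0
eps = 1e-6              # same step size as gradcheck's default eps
analytical = 2 * x0     # exact derivative of x**2 at x0

numerical_f64 = central_difference(square, x0, eps)
numerical_f32 = central_difference(square, x0, eps, rnd=f32)

print("analytical       :", analytical)
print("numerical float64:", numerical_f64)
print("numerical float32:", numerical_f32)
```

In float64 the finite-difference estimate agrees closely with the analytical value, while the emulated float32 estimate is visibly off because a step of 1e-6 is close to the spacing of representable float32 values near 3.0 and 9.0. Whether anything similar applies to test_selective_scan.py is an open question that the thread leaves to the mamba authors.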

cxylotus commented 2 months ago

My environment:

sys.platform: linux
Python: 3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
CUDA available: True
numpy_random_seed: 2147483648
GPU 0,1,2,3,4,5,6,7: NVIDIA GeForce RTX 2080 Ti
CUDA_HOME: /home/xxx/anaconda3/envs/mamba
NVCC: Cuda compilation tools, release 11.8, V11.8.89
GCC: gcc (Ubuntu 11.4.0-2ubuntu1~20.04) 11.4.0
PyTorch: 2.1.1
PyTorch compiling details: PyTorch built with:

TorchVision: 0.16.1
OpenCV: 4.9.0
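When comparing environments like the two reports above, it can help to collect the build-relevant fields with one script. The following is a minimal sketch and not the tool that produced either report. The function names collect_env and version_line are hypothetical, the field names simply mirror the reports, and the script assumes that gcc and nvcc (if installed) are on PATH. It degrades gracefully when PyTorch or the extensions are missing.

```python
import importlib.util
import os
import platform
import shutil
import subprocess
import sys


def version_line(tool, marker=None):
    """Return one line of `<tool> --version`, or a note if the tool is unavailable.

    If `marker` is given, prefer the first line containing it;
    otherwise fall back to the first line of output.
    """
    path = shutil.which(tool)
    if path is None:
        return "not found"
    try:
        out = subprocess.run(
            [path, "--version"], capture_output=True, text=True, timeout=30
        ).stdout
    except (OSError, subprocess.SubprocessError) as exc:
        return f"failed to run: {exc}"
    lines = [line.strip() for line in out.splitlines() if line.strip()]
    if not lines:
        return "no output"
    if marker:
        for line in lines:
            if marker in line:
                return line
    return lines[0]


def collect_env():
    """Gather a best-effort dict of facts relevant to building CUDA extensions."""
    info = {
        "sys.platform": sys.platform,
        "Python": platform.python_version(),
        "CUDA_HOME": os.environ.get("CUDA_HOME", "not set"),
        "NVCC": version_line("nvcc", marker="release"),
        "GCC": version_line("gcc"),
    }
    try:
        import torch  # optional: only reported if importable

        info["PyTorch"] = torch.__version__
        info["PyTorch CUDA"] = str(torch.version.cuda)
        info["CUDA available"] = str(torch.cuda.is_available())
    except Exception as exc:  # missing or broken install
        info["PyTorch"] = f"not importable ({type(exc).__name__})"
    # Import names of the two packages discussed in this thread.
    for module in ("causal_conv1d", "mamba_ssm"):
        found = importlib.util.find_spec(module) is not None
        info[module] = "installed" if found else "not installed"
    return info


if __name__ == "__main__":
    for key, value in collect_env().items():
        print(f"{key}: {value}")
```

Pasting this output into an issue gives maintainers the compiler, CUDA toolkit, and PyTorch versions in one place.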

Andy1621 commented 2 months ago

You may need to ask the mamba authors about this bug and open an issue there; I have not changed the logic of test_selective_scan.py.

cxylotus commented 2 months ago

OK, thank you for your patient answers. (bow.jpg) (I saw functions such as test bimamba in there, so I assumed you had modified it.)

cxylotus commented 2 months ago

Sorry to bother you again (*´▽`)ノ. When you run test_selective_scan.py with your code, do you get normal results? Also, did you rely on test_selective_scan.py passing to confirm that your modifications are correct?

Andy1621 commented 2 months ago

I have not run this test.

cxylotus commented 2 months ago

OK, sorry for bothering you, and thanks once again for your patient answers! (bow)

Uaena-cf commented 3 weeks ago

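> Thanks for the reply. I can install causal-conv1d successfully with pip, but compiling mamba still fails with a similar error. (This is the first time an author has replied to me online so quickly; I'm really touched QAQ)

Hello, sorry to bother you. I ran into this problem too when running pip install -e mamba. How did you solve it? Thank you very much!!!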
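The thread only records that the cause was a CUDA compilation problem and does not say what the fix was. One thing that can be checked quickly in such cases is whether the nvcc release matches the CUDA version the PyTorch wheel was built with. The sketch below works on version strings only, and the function names (nvcc_release, torch_wheel_cuda, toolkits_match) are hypothetical. The sample strings are copied from the two environment reports in this thread, which come from different machines and are combined here purely for illustration.

```python
import re


def nvcc_release(nvcc_version_output):
    """Extract (major, minor) from nvcc's 'release X.Y' text, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_version_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def torch_wheel_cuda(torch_version):
    """Extract (major, minor) from a '+cuXYZ' suffix, e.g. '2.1.1+cu118' -> (11, 8).

    Returns None when the version string carries no such suffix.
    """
    match = re.search(r"\+cu(\d+)(\d)$", torch_version)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkits_match(nvcc_version_output, torch_version):
    """True/False if both versions are known, None if either is unknown."""
    nvcc = nvcc_release(nvcc_version_output)
    wheel = torch_wheel_cuda(torch_version)
    if nvcc is None or wheel is None:
        return None
    return nvcc == wheel


# Strings taken from the environment reports earlier in this thread.
nvcc_line = "Cuda compilation tools, release 11.8, V11.8.89"
print(toolkits_match(nvcc_line, "2.1.1+cu118"))  # both versions known
print(toolkits_match(nvcc_line, "2.1.1"))        # no +cu suffix, so unknown
```

When PyTorch is importable, torch.version.cuda reports the same information directly and avoids parsing the version suffix. A match does not guarantee a successful build; it only rules out one possible source of trouble.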