wanghao9610 / OV-DINO

Official implementation of OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion
https://wanghao9610.github.io/OV-DINO
Apache License 2.0

ImportError: Cannot import 'detrex._C', therefore 'MultiScaleDeformableAttention' is not available. detrex is not compiled successfully, please build following the instructions! #23

Closed SummerLam closed 3 months ago

SummerLam commented 3 months ago

When I tried to run sh scripts/demo.sh, I got this error about failing to import detrex._C. I followed the online instructions for fixing this issue, which say to set the CUDA_HOME path. Although my CUDA_HOME was not None, I ran the command (export CUDA_HOME=/usr/local/cuda) anyway. However, it didn't resolve the issue, so I would like to ask here what else can be done to fix the error.

wanghao9610 commented 3 months ago

You need to rebuild the detrex package. You can refer to the following code:

# Suppose you have set root_dir env_var and have installed the requirements on ovdino conda env.
cd $root_dir/ovdino
conda activate ovdino
python -m pip install -e detectron2-717ab9
pip install -e ./

If the steps above fail, you may need to create a new env following the installation guideline.
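After rebuilding, it can help to confirm that the compiled extension actually loads before running the demo again. The following is a minimal sketch, not part of the OV-DINO or detrex code: the helper name `check_compiled_extension` is hypothetical, and catching `OSError` and `RuntimeError` in addition to `ImportError` is an assumption about how a broken shared library may fail to load.

```python
import importlib


def check_compiled_extension(module_name: str = "detrex._C"):
    """Try to import a compiled extension module and report whether it loaded.

    Hypothetical helper for illustration; returns (ok, message).
    """
    try:
        importlib.import_module(module_name)
    except (ImportError, OSError, RuntimeError) as exc:
        # ImportError also covers ModuleNotFoundError; OSError and RuntimeError
        # are caught on the assumption that a mis-built .so may raise them.
        return False, f"{module_name} failed to import: {exc}"
    return True, f"{module_name} imported successfully"


if __name__ == "__main__":
    ok, message = check_compiled_extension()
    print(message)
```

If the message reports a failure, the build step did not produce a usable `detrex._C`, and the build log is the next place to look.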

SummerLam commented 3 months ago

I got this error when I ran the commands python -m pip install -e detectron2-717ab9 and pip install -e ./

So this is probably the reason? I have recreated a new env, but I encounter the same error whenever I run these two commands. Are there any solutions for it?

wanghao9610 commented 3 months ago

@SummerLam I need more info to debug your issue, please give me more error information.

wanghao9610 commented 3 months ago

It looks like the gcc version is too low. You can check it with the following command:

which gcc; gcc --version

Try to install gcc by conda:

conda install gcc=9 gxx=9 -c conda-forge -y # Optional: install gcc9

Then recompile the detectron2 and detrex packages.
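The gcc check can also be scripted. This is a rough sketch under stated assumptions: the threshold `ASSUMED_MIN_GCC_MAJOR = 9` is only inferred from the gcc=9 suggestion above and is not a documented requirement, the helper names are hypothetical, and the script uses `gcc -dumpversion` (a standard gcc option) rather than `gcc --version` because its output is easier to parse.

```python
import re
import shutil
import subprocess

# Assumption: inferred from the "gcc=9" suggestion in this thread, not a documented minimum.
ASSUMED_MIN_GCC_MAJOR = 9


def parse_gcc_major(dumpversion_output: str):
    """Return the major version from `gcc -dumpversion` output, or None if unparseable."""
    match = re.match(r"\s*(\d+)", dumpversion_output)
    return int(match.group(1)) if match else None


def gcc_major_version(gcc: str = "gcc"):
    """Locate gcc on PATH and return its major version, or None if it is not found."""
    path = shutil.which(gcc)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-dumpversion"], capture_output=True, text=True, check=False
    )
    return parse_gcc_major(result.stdout)


if __name__ == "__main__":
    major = gcc_major_version()
    if major is None:
        print("gcc was not found on PATH")
    elif major < ASSUMED_MIN_GCC_MAJOR:
        print(f"gcc {major} is older than the assumed minimum {ASSUMED_MIN_GCC_MAJOR}")
    else:
        print(f"gcc {major} looks recent enough")
```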

SummerLam commented 3 months ago

Requirement already satisfied: portalocker in /opt/conda/envs/ovdino/lib/python3.10/site-packages (from iopath<0.1.10,>=0.1.7->detectron2==0.6) (2.10.1)

This is the captured error message after running the command python -m pip install -e detectron2-717ab9

wanghao9610 commented 3 months ago

The error indicates that your CUDA is not installed correctly; you can get help from the issue.

SummerLam commented 3 months ago

OK, thank you so much for the information and your help. I will read through it.

SummerLam commented 3 months ago

Hi, I have followed the steps here https://github.com/wanghao9610/OV-DINO/issues/14#issuecomment-2277355677. I tried specifying the CUDA_HOME path again and also tried running the command conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia -y. However, the issue still exists.

wanghao9610 commented 3 months ago

The figure shows that your installed CUDA version is 12.1, while your torch CUDA version is 11.6; the two CUDA versions do not match. You need to install CUDA 11.6 manually, or install a newer PyTorch version built for CUDA 12.1 (not recommended, as it may introduce more issues).
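A mismatch like this can be detected without reading screenshots. The sketch below is illustrative only: it compares the release reported by `nvcc --version` under the CUDA_HOME that PyTorch's extension builder resolves (`torch.utils.cpp_extension.CUDA_HOME`) against `torch.version.cuda`. Comparing major.minor exactly is an assumption and may be stricter than the build actually requires; the helper names are hypothetical.

```python
import os
import re
import subprocess


def parse_nvcc_release(nvcc_output: str):
    """Extract 'major.minor' from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def cuda_versions_match(toolkit_version, torch_cuda_version):
    """Compare major.minor of the toolkit with the CUDA version PyTorch was built for."""
    if toolkit_version is None or torch_cuda_version is None:
        return False
    return toolkit_version.split(".")[:2] == torch_cuda_version.split(".")[:2]


def main():
    try:
        import torch
        from torch.utils.cpp_extension import CUDA_HOME
    except ImportError:
        print("PyTorch is not importable in this environment")
        return
    toolkit = None
    nvcc = os.path.join(CUDA_HOME, "bin", "nvcc") if CUDA_HOME else None
    if nvcc and os.path.exists(nvcc):
        out = subprocess.run(
            [nvcc, "--version"], capture_output=True, text=True, check=False
        ).stdout
        toolkit = parse_nvcc_release(out)
    print("CUDA_HOME:", CUDA_HOME)
    print("toolkit release:", toolkit)
    print("torch built for CUDA:", torch.version.cuda)
    print("match:", cuda_versions_match(toolkit, torch.version.cuda))


if __name__ == "__main__":
    main()
```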

SummerLam commented 3 months ago

I see, thank you so much for your kind assistance.

SummerLam commented 3 months ago

Hi, I think I am facing the same issue again even though I have CUDA 11.6 installed. What else did I miss?

wanghao9610 commented 3 months ago

Have you compiled detectron2 and detrex successfully? The error info shows that your detrex was not built successfully, as the deformable_conv operator is not installed.

Cloud65000 commented 3 months ago

I encountered this problem after running "python3 setup.py build --force develop".

Cloud65000 commented 3 months ago

The content of the error is shown like this:

on2/layers/csrc/vision.cpp:111
Exception raised from registerLibrary at ../aten/src/ATen/core/dispatch/Dispatcher.cpp:180 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x57 (0x7f50e0223617 in /usr/local/python3.9.18/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #1: c10::detail::torchCheckFail(char const, char const, unsigned int, std::string const&) + 0x64 (0x7f50e01de98d in /usr/local/python3.9.18/lib/python3.9/site-packages/torch/lib/libc10.so)
frame #2: c10::Dispatcher::registerLibrary(std::string, std::string) + 0x3c5 (0x7f507944cb05 in /usr/local/python3.9.18/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #3: torch::Library::Library(torch::Library::Kind, std::string, c10::optional, char const*, unsigned int) + 0x40c (0x7f507948c2dc in /usr/local/python3.9.18/lib/python3.9/site-packages/torch/lib/libtorch_cpu.so)
frame #4: + 0x1b90b (0x7f5020fd990b in /home/cxc/OV-DINO/OV-DINO/ovdino/detrex/_C.cpython-39-x86_64-linux-gnu.so)
frame #5: + 0x11b9a (0x7f50e4fceb9a in /lib64/ld-linux-x86-64.so.2)
frame #6: + 0x11ca1 (0x7f50e4fceca1 in /lib64/ld-linux-x86-64.so.2)
frame #7: _dl_catch_exception + 0xe5 (0x7f50e4b54ba5 in /lib/x86_64-linux-gnu/libc.so.6)
frame #8: + 0x160cf (0x7f50e4fd30cf in /lib64/ld-linux-x86-64.so.2)
frame #9: _dl_catch_exception + 0x88 (0x7f50e4b54b48 in /lib/x86_64-linux-gnu/libc.so.6)
frame #10: + 0x1560a (0x7f50e4fd260a in /lib64/ld-linux-x86-64.so.2)
frame #11: + 0x134c (0x7f50e49cc34c in /lib/x86_64-linux-gnu/libdl.so.2)
frame #12: _dl_catch_exception + 0x88 (0x7f50e4b54b48 in /lib/x86_64-linux-gnu/libc.so.6)
frame #13: _dl_catch_error + 0x33 (0x7f50e4b54c13 in /lib/x86_64-linux-gnu/libc.so.6)
frame #14: + 0x1b59 (0x7f50e49ccb59 in /lib/x86_64-linux-gnu/libdl.so.2)
frame #15: dlopen + 0x4a (0x7f50e49cc3da in /lib/x86_64-linux-gnu/libdl.so.2)

The image of the error has been uploading for a very long time. I'm not sure whether you can see the picture.
wanghao9610 commented 3 months ago

1) I can't see the picture you uploaded.
2) The error suggests that you didn't activate the environment properly, or are you not using conda? If you haven't installed Anaconda, install it first.
3) The most likely cause is a broken PyTorch installation. First test the PyTorch installation:

python -c "import torch; print(torch.__version__); print(torch.cuda.is_available())"
SummerLam commented 3 months ago

The issue has been fixed. After setting the CUDA path to cuda-11.6, we had to run the command pip install -e ./ again, and then it worked.
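For anyone repeating this fix, the two steps (point CUDA_HOME at the 11.6 toolkit, then reinstall in editable mode) can be sketched as one script. This is only an illustration: the helper names are hypothetical, and the toolkit location `/usr/local/cuda-11.6` and the project path in the usage comment are assumed, not taken from the thread.

```python
import os
import subprocess
import sys


def build_env(cuda_home: str, base_env=None):
    """Return a copy of the environment with CUDA_HOME set and its bin dir first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_HOME"] = cuda_home
    path_parts = [os.path.join(cuda_home, "bin")]
    if env.get("PATH"):
        path_parts.append(env["PATH"])
    env["PATH"] = os.pathsep.join(path_parts)
    return env


def rebuild(project_dir: str, cuda_home: str) -> int:
    """Re-run `pip install -e ./` inside project_dir with CUDA_HOME set; return the exit code."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "install", "-e", "./"],
        cwd=project_dir,
        env=build_env(cuda_home),
        check=False,
    )
    return result.returncode


# Example call (both paths are assumptions, adjust to your machine):
# rebuild("/path/to/OV-DINO/ovdino", "/usr/local/cuda-11.6")
```

Passing the environment explicitly keeps the change local to the build command, so the shell's own CUDA_HOME is left untouched.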

SummerLam commented 3 months ago

Thank you for your help and attention

Cloud65000 commented 3 months ago

I didn't use conda. I'm in a Docker container with CUDA 12.1 and torch 2.1. I'll switch to conda, strictly follow the README, and try again.