Scalsol / mega.pytorch

Memory Enhanced Global-Local Aggregation for Video Object Detection, CVPR2020

failed running python setup.py build develop for mega.pytorch #116

Open LilyDaytoy opened 1 year ago

LilyDaytoy commented 1 year ago

I followed INSTALL.md, but the command python setup.py build develop fails.

My nvcc --version reports CUDA 10.1. I tried each of the following installs, and the build fails with all of them:

conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.1 -c pytorch
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=10.1 -c pytorch

LilyDaytoy commented 1 year ago

raise RuntimeError(message)
RuntimeError: Error compiling objects for extension

LilyDaytoy commented 1 year ago

This is my error message:

python setup.py build develop
running build
running build_py
running build_ext
building 'mega_core._C' extension
/mnt/lustre/share/gcc/gcc-5.3.0/bin -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/mnt/lustre/wxpeng/MSG/mega.pytorch/mega_core/csrc -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include/TH -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include/THC -I/mnt/lustre/share/cuda-10.0/include -I/mnt/lustre/wxpeng/anaconda3/envs/mega/include/python3.7m -c /mnt/lustre/wxpeng/MSG/mega.pytorch/mega_core/csrc/cpu/ROIAlign_cpu.cpp -o build/temp.linux-x86_64-3.7/mnt/lustre/wxpeng/MSG/mega.pytorch/mega_core/csrc/cpu/ROIAlign_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
error: command '/mnt/lustre/share/gcc/gcc-5.3.0/bin' failed: Permission denied

Could you help me check what is going on? Thanks a lot!

Scalsol commented 1 year ago

The current codebase only works with pytorch 1.3.0 (or lower), as mentioned in INSTALL.md, so you may want to try an older version of pytorch.

LilyDaytoy commented 1 year ago

Hi! I tried exactly conda install pytorch=1.3.0 torchvision cudatoolkit=10.0 -c pytorch, and I also changed my nvcc version to 10.0, but the build still fails like this:

python setup.py build develop
running build
running build_py
running build_ext
building 'mega_core._C' extension
/mnt/lustre/share/gcc/gcc-5.3.0/bin -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DWITH_CUDA -I/mnt/lustre/wxpeng/MSG/mega.pytorch/mega_core/csrc -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include/torch/csrc/api/include -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include/TH -I/mnt/lustre/wxpeng/anaconda3/envs/mega/lib/python3.7/site-packages/torch/include/THC -I/mnt/lustre/share/cuda-10.0/include -I/mnt/lustre/wxpeng/anaconda3/envs/mega/include/python3.7m -c /mnt/lustre/wxpeng/MSG/mega.pytorch/mega_core/csrc/cpu/ROIAlign_cpu.cpp -o build/temp.linux-x86_64-3.7/mnt/lustre/wxpeng/MSG/mega.pytorch/mega_core/csrc/cpu/ROIAlign_cpu.o -DTORCH_API_INCLUDE_EXTENSION_H -DTORCH_EXTENSION_NAME=_C -D_GLIBCXX_USE_CXX11_ABI=0 -std=c++11
error: command '/mnt/lustre/share/gcc/gcc-5.3.0/bin' failed: Permission denied

LilyDaytoy commented 1 year ago

My versions: nvcc 10.0, gcc 5.3.0, pytorch 1.3.0.

Also, for

cd cocoapi/PythonAPI
python setup.py build_ext install

I hit the same error:

error: command '/mnt/lustre/share/gcc/gcc-5.3.0/bin' failed: Permission denied

so I installed pycocotools from conda-forge instead:

conda install -c conda-forge pycocotools

Is that OK?

Scalsol commented 1 year ago

It seems to be a permission issue with your gcc directory. Maybe try updating the folder permissions with chmod.
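The path in the error message ends in /bin, which looks like a directory rather than a compiler binary, and trying to execute a directory typically fails with "Permission denied". Here is a minimal sketch for checking that, assuming the build takes its compiler from the CC environment variable (as distutils does on Unix). The function name diagnose_compiler is made up for illustration and is not part of mega.pytorch.

```python
import os
import shutil


def diagnose_compiler(cc_value):
    """Classify a CC-style value.

    Returns one of: 'unset', 'directory', 'executable',
    'not executable', 'not found'.
    (Hypothetical helper, not part of mega.pytorch or setuptools.)
    """
    if not cc_value:
        return "unset"
    # CC may carry extra flags ("gcc -pthread"); only the first token is the program.
    program = cc_value.split()[0]
    if os.path.isdir(program):
        # Executing a directory fails, usually reported as "Permission denied".
        return "directory"
    # shutil.which resolves bare names through PATH and checks the execute bit.
    if shutil.which(program) is not None:
        return "executable"
    return "not executable" if os.path.exists(program) else "not found"


if __name__ == "__main__":
    cc = os.environ.get("CC", "")
    print("CC =", repr(cc), "->", diagnose_compiler(cc))
```

If this reports "directory", pointing CC at the compiler binary itself (for example CC=gcc, provided the intended gcc is on PATH) is likely a better fix than changing permissions on the folder.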

LilyDaytoy commented 1 year ago

Ohh, thanks a lot! The error was actually caused by my gcc dir. I added CC=gcc before python setup.py build develop, and the problem is solved.

But I ran into another problem. When running inference with the model, I get AttributeError: module 'torch.cuda' has no attribute 'amp' (apex/apex/transformer/amp/grad_scaler.py, line 21). I searched online, and people say amp is only available from pytorch 1.6 onward, but the repo only supports pytorch 1.3 and lower. Is there any way to solve this problem? Thanks for your patience :D
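The two version constraints mentioned in this thread can be checked side by side. The sketch below uses the thresholds as stated here (pytorch 1.3 or lower for this repo, 1.6 or higher for torch.cuda.amp); the helper names parse_version and check_constraints are invented for illustration.

```python
import re


def parse_version(version):
    """Return (major, minor) from a string such as '1.3.0' or '1.7.0+cu101'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_constraints(version):
    """Evaluate both constraints from the thread for one pytorch version.

    Thresholds are taken from the discussion above, not from any official
    compatibility table.
    """
    major_minor = parse_version(version)
    return {
        "repo_supported": major_minor <= (1, 3),  # per the maintainer's reply
        "has_cuda_amp": major_minor >= (1, 6),    # torch.cuda.amp arrived in 1.6
    }


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed")
    else:
        print(torch.__version__, check_constraints(torch.__version__))
        print("torch.cuda.amp present:", hasattr(torch.cuda, "amp"))
```

Under these thresholds no single pytorch version satisfies both checks, which suggests the installed apex build is newer than what a pytorch 1.3 environment can support.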