VISION-SJTU / USOT

[ICCV2021] Learning to Track Objects from Unlabeled Videos

ImportError: Can not compile Precise RoI Pooling library. #2

Closed chenrxi closed 2 years ago

chenrxi commented 2 years ago

Sorry, how can I solve this bug?

zhengjilai commented 2 years ago

This error happens if you fail to compile the Precise RoI Pooling library (./lib/models/prroi_pool).

This library is taken directly from PreciseRoIPooling, so you may refer to that repository for more details. Under the environments we recommend (CUDA-10, CUDA-11), the library should work for both training and testing without any preprocessing or offline compilation. It will be imported successfully when it is referenced in the code, as long as your environment is set up correctly. So I advise you to recheck your environment carefully.
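Since the library is built when it is first imported rather than offline, the machine presumably needs a full CUDA toolkit (the nvcc compiler and the headers), not only the CUDA runtime shipped with PyTorch. A minimal sketch of such an environment check follows. It uses only the Python standard library, and it assumes a Linux-style toolkit layout (`<root>/bin/nvcc`, `<root>/include/cuda_runtime_api.h`). The function names `candidate_cuda_roots` and `check_cuda_root` are hypothetical helpers, not part of USOT or PreciseRoIPooling.

```python
import os
import shutil
from pathlib import Path


def candidate_cuda_roots(env=None):
    """Return plausible CUDA toolkit roots, most specific first."""
    env = os.environ if env is None else env
    roots = []
    # Environment variables commonly used to point at a toolkit install.
    for var in ("CUDA_HOME", "CUDA_PATH"):
        if env.get(var):
            roots.append(Path(env[var]))
    # nvcc normally lives in <root>/bin/nvcc, so step up two levels.
    nvcc = shutil.which("nvcc", path=env.get("PATH"))
    if nvcc:
        roots.append(Path(nvcc).resolve().parent.parent)
    # Conventional default location on Linux (assumption).
    roots.append(Path("/usr/local/cuda"))
    return roots


def check_cuda_root(root):
    """Report whether the compiler and the runtime header exist under root."""
    root = Path(root)
    return {
        "nvcc": (root / "bin" / "nvcc").is_file(),
        "cuda_runtime_api.h": (root / "include" / "cuda_runtime_api.h").is_file(),
    }


if __name__ == "__main__":
    for root in candidate_cuda_roots():
        print(root, check_cuda_root(root))
```

If every candidate root reports `False` for either entry, the toolkit is probably missing or not on the expected path. Some installs (for example conda packages) may place headers elsewhere, so a negative result here is a hint, not proof.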

chenrxi commented 2 years ago

Did you conduct the experiments on 4 3090 GPUs?

zhengjilai commented 2 years ago

Yes. We trained our model on 4 * 3090 (with CUDA-11.1). Experiments on 2080Ti (CUDA 10.0/10.2) also ran well. Nothing unexpected happened with the Precise RoI Pooling library. The models trained on 3090 can also be tested and evaluated on 2080Ti machines.

zhengjilai commented 2 years ago

This issue may be closely related to your question. I believe this library (Precise RoI Pooling) works well on 3090 with pytorch 1.7. Make sure your environment is set up correctly. Don't forget to revise the cudatoolkit version in ./preprocessing/install_model.sh.

laisimiao commented 2 years ago

@zhengjilai my pytorch version is 1.8, but I have this error on 3090 with cuda11.1

/anaconda3/envs/xxx/lib/python3.7/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:10: fatal error: cuda_runtime_api.h:

zhengjilai commented 2 years ago

@laisimiao

@zhengjilai my pytorch version is 1.8, but I have this error on 3090 with cuda11.1

/anaconda3/envs/xxx/lib/python3.7/site-packages/torch/include/ATen/cuda/CUDAContext.h:5:10: fatal error: cuda_runtime_api.h:

I checked my 4*3090 machine just now, with conda list. The command returns the following row.

pytorch       1.7.1     py3.7_cuda11.0.221_cudnn8.0.5_0    pytorch

However, the CUDA environment I installed at that time seems to be CUDA 11.1 (checked with nvcc -V). Such a strange mixture does work.

Cuda compilation tools, release 11.1, V11.1.74
Build cuda_11.1.TC455_06.29069683_0

As for pytorch 1.7/1.8, I do not think it matters for compiling the PrPool library. I once ran PrPool with pytorch 1.8 + CUDA 10.2, and it worked without any errors.
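To compare the two version strings above without reading them by eye, here is a small sketch that parses the `nvcc -V` release line and the conda build string. The helper names are hypothetical, and treating "same major version" as compatible is only an assumption based on the 11.0 build + 11.1 toolkit combination reported here, not a documented guarantee.

```python
import re


def nvcc_release(nvcc_output):
    """Extract (major, minor) from `nvcc -V` output, or None if absent."""
    m = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    return (int(m.group(1)), int(m.group(2))) if m else None


def conda_build_cuda(build_string):
    """Extract (major, minor) from a conda build string such as
    'py3.7_cuda11.0.221_cudnn8.0.5_0', or None if no CUDA tag is present."""
    m = re.search(r"cuda(\d+)\.(\d+)", build_string)
    return (int(m.group(1)), int(m.group(2))) if m else None


def same_major(a, b):
    """Heuristic (assumption): versions are compatible if the majors match."""
    return a is not None and b is not None and a[0] == b[0]


if __name__ == "__main__":
    toolkit = nvcc_release("Cuda compilation tools, release 11.1, V11.1.74")
    build = conda_build_cuda("py3.7_cuda11.0.221_cudnn8.0.5_0")
    print(toolkit, build, same_major(toolkit, build))
    # prints: (11, 1) (11, 0) True
```

A build string without a CUDA tag (a CPU-only build, for instance) yields `None`, and `same_major` then returns `False`.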

zhengjilai commented 2 years ago

@laisimiao This issue is very closely related to your exact situation. Please check that issue carefully. The author of that issue actually succeeded in compiling PrPool with CUDA-11.2.

laisimiao commented 2 years ago

https://github.com/VISION-SJTU/USOT/issues/2#issuecomment-1106614847 Yes, it's so confusing.

DUT-LiMing commented 2 years ago

Did you solve this error, @laisimiao?