open-mmlab / OpenUnReID

PyTorch open-source toolbox for unsupervised or domain adaptive object re-ID.
Apache License 2.0

Ask for help for solving faiss assertion error #54

Open cd70zyx opened 2 years ago

cd70zyx commented 2 years ago

Dear author, I seem to have run into an environment dependency problem. My machine has two RTX 3090 GPUs and runs CUDA 11.4 on Ubuntu 20.04.3 LTS. Whenever I run "main.py" in the directory "tools/MMT", the error below occurs as soon as execution reaches the Jaccard distance computation:

"""
Computing jaccard distance...
bruteForceKnn is deprecated; call bfKnn instead
Faiss assertion 'err__ == cudaSuccess' failed in void faiss::gpu::runL2Norm(faiss::gpu::Tensor<T, 2, true, IndexType>&, bool, faiss::gpu::Tensor<float, 1, true, IndexType>&, bool, cudaStream_t) [with T = float; TVec = float4; IndexType = int; cudaStream_t = CUstream_st*] at gpu/impl/L2Norm.cu:292; details: CUDA error 8 invalid device function

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)
"""

I have not been able to fix this error myself, since I have little experience with the "faiss-gpu" module and my searches have turned up hardly any relevant solutions online. What could be causing it, and how can I solve it? I look forward to your advice. Best regards, and thanks!

Emersonzc commented 2 years ago

I fixed it by following https://github.com/yxgeee/SpCL/issues/36#issue-844375384
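
For anyone else hitting the same assertion: "CUDA error 8 invalid device function" usually means the GPU code in the installed binary was not compiled for the device's compute capability. An RTX 3090 is compute capability 8.6, which CUDA toolkits can only target from release 11.1 onward, so a faiss-gpu package built against an older toolkit is a likely suspect. The sketch below is not taken from the linked issue. It is a rough diagnostic that assumes PyTorch is installed for querying the GPU, and that you can find out which CUDA toolkit your faiss-gpu build used from however you installed it (for example, the package build string). The helper names `build_can_target` and `report` are hypothetical, and the lookup table covers only a few common architectures.

```python
# Minimum CUDA toolkit release able to generate native code for a given
# compute capability (only a few common architectures are listed here).
MIN_CUDA_FOR_CAPABILITY = {
    (7, 0): (9, 0),    # Volta, e.g. V100
    (7, 5): (10, 0),   # Turing, e.g. RTX 2080 Ti
    (8, 0): (11, 0),   # Ampere, e.g. A100
    (8, 6): (11, 1),   # Ampere, e.g. RTX 3090
}


def build_can_target(capability, build_cuda):
    """Return True if a toolkit of version `build_cuda` is new enough to
    compile native code for `capability`.

    Both arguments are (major, minor) tuples. A True result does not prove
    the binary actually includes that architecture; a False result means it
    cannot contain native code for it.
    """
    required = MIN_CUDA_FOR_CAPABILITY.get(tuple(capability))
    if required is None:
        raise ValueError(f"compute capability {capability} is not in the table")
    return tuple(build_cuda) >= required


def report():
    """Print the compute capability of each visible GPU (requires PyTorch)."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; cannot query the GPU.")
        return
    if not torch.cuda.is_available():
        print("No CUDA device is visible to PyTorch.")
        return
    print(f"PyTorch was built with CUDA {torch.version.cuda}")
    for i in range(torch.cuda.device_count()):
        major, minor = torch.cuda.get_device_capability(i)
        print(f"GPU {i}: {torch.cuda.get_device_name(i)}, sm_{major}{minor}")
```

Calling `report()` lists the capability of each GPU, and `build_can_target((8, 6), (10, 2))` returns `False`, which would indicate that a build made with CUDA 10.2 cannot contain native code for an RTX 3090. If that is the case, installing a faiss-gpu build made with CUDA 11.1 or later, or compiling faiss from source for your architecture, is a reasonable thing to try.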