CannyLab / tsne-cuda

GPU Accelerated t-SNE for CUDA with Python bindings
BSD 3-Clause "New" or "Revised" License

CUDA error 209 no kernel image is available for execution on the device #116

Closed JXuann closed 1 year ago

JXuann commented 1 year ago

Hi, I just installed tsnecuda via conda (on Ubuntu 20).

When running tsnecuda.test(), I encountered this error:

Initializing cuda handles... done.
KNN Computation... Faiss assertion 'err__ == cudaSuccess' failed in void faiss::gpu::runL2Norm(faiss::gpu::Tensor<T, 2, true, IndexType>&, bool, faiss::gpu::Tensor<float, 1, true, IndexType>&, bool, cudaStream_t) [with T = float; TVec = float4; IndexType = int; cudaStream_t = CUstream_st*] at /home/conda/feedstock_root/build_artifacts/faiss-split_1636459943780/work/faiss/gpu/impl/L2Norm.cu:323; details: CUDA error 209 no kernel image is available for execution on the device
Aborted (core dumped)

I looked through the resolved issues and realized this could be a compute capability mismatch. I probably need to pass --gencode=arch=compute_50,code=sm_50 (for the Quadro M series) somewhere, but I don't know where to specify it or how else to solve this.

Could you please give me some suggestions? Many thanks!
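For reference, the flag quoted above is derived mechanically from the GPU's compute capability. The sketch below shows that mapping; `gencode_flag` is a hypothetical helper name, not part of tsnecuda or FAISS, and it uses the single-dash `-gencode` spelling, which is the short form of nvcc's `--generate-code` option.

```python
def gencode_flag(compute_capability: str) -> str:
    """Turn a compute capability such as "5.0" into an nvcc -gencode flag.

    Hypothetical helper for illustration only.
    """
    major, minor = compute_capability.strip().split(".")
    arch = f"{int(major)}{int(minor)}"
    # compute_XX names the virtual (PTX) architecture, sm_XX the real GPU one.
    return f"-gencode=arch=compute_{arch},code=sm_{arch}"


if __name__ == "__main__":
    # Compute capability 5.0 is the value mentioned in this issue.
    print(gencode_flag("5.0"))
```

The error in the log means that no binary in the loaded library was built for the device's architecture, which is why the flag matters at build time rather than at run time.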

DavidMChan commented 1 year ago

This is actually an issue with the linked version of FAISS, since we already compile tsnecuda for compute capability 5.0 (see: https://github.com/CannyLab/tsne-cuda/blob/e565c6bdeae4b15d3b5c3b490e3bee569305f505/CMakeLists.txt#L66). The way around it is to first build FAISS's GPU code with the correct compute capability (which should be possible on CUDA 11), and then go back and link tsnecuda against the resulting FAISS build. You'll basically have to follow the instructions in the Dockerfile to build FAISS, but alter this line to support the correct compute architecture: https://github.com/facebookresearch/faiss/blob/79e74fe3075e494abcc909b4988ef3c3cb059f72/.circleci/Dockerfile.faiss_gpu#L25
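As a rough sketch of what that altered configure step might look like, the snippet below assembles a FAISS CMake command line for a given list of compute capabilities. It assumes the FAISS revision being built takes its GPU targets from CMake's `CMAKE_CUDA_ARCHITECTURES` variable, as FAISS's install notes describe; the function name, `source_dir`, and `build_dir` are placeholders, and the exact options in the linked Dockerfile may differ.

```python
import shlex


def faiss_configure_command(compute_capabilities, source_dir=".", build_dir="build"):
    """Build a cmake configure command for a GPU-enabled FAISS.

    Hypothetical helper: it only assembles the arguments, it does not run cmake.
    """
    # CMake expects architectures without the dot, separated by semicolons,
    # e.g. ["5.0", "7.5"] -> "50;75".
    archs = ";".join(cc.strip().replace(".", "") for cc in compute_capabilities)
    return [
        "cmake", "-B", build_dir, source_dir,
        "-DFAISS_ENABLE_GPU=ON",       # build the GPU code paths
        "-DFAISS_ENABLE_PYTHON=OFF",   # tsnecuda links the C++ library only
        "-DBUILD_TESTING=OFF",
        "-DBUILD_SHARED_LIBS=ON",      # produce a shared FAISS library
        f"-DCMAKE_CUDA_ARCHITECTURES={archs}",
    ]


if __name__ == "__main__":
    # 5.0 is the compute capability discussed in this issue.
    print(shlex.join(faiss_configure_command(["5.0"])))
```

Printing the command instead of running it makes it easy to compare against the Dockerfile line before starting a long build.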

DavidMChan commented 1 year ago

I'm going to close this issue, since it's a problem upstream, but I'm happy to answer questions in the replies.

JXuann commented 1 year ago

Hi @DavidMChan,

Thanks for your quick reply! Much appreciated.

How do I link tsnecuda against the generated FAISS code?

DavidMChan commented 1 year ago

This should happen automatically when building tsnecuda from scratch with FAISS installed, since this code will pick up the location of the FAISS shared library: https://github.com/CannyLab/tsne-cuda/blob/e565c6bdeae4b15d3b5c3b490e3bee569305f505/CMakeLists.txt#L154

If it's not found, you can specify the location of the FAISS libraries by setting the CMake variables used in this script: https://github.com/CannyLab/tsne-cuda/blob/main/cmake/Modules/FindFAISS.cmake
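Before setting any variables, it can help to confirm where the freshly built FAISS actually landed. The sketch below is a hypothetical checker, not part of tsnecuda: it assumes a conventional install layout with `include/faiss/Index.h` and a shared `libfaiss.so` under `lib` or `lib64`, and the candidate prefixes in the example are placeholders.

```python
from pathlib import Path


def find_faiss(prefixes):
    """Return (include_dir, library_path) for the first prefix that holds
    both the FAISS headers and the shared library, or None if none does."""
    for prefix in map(Path, prefixes):
        header = prefix / "include" / "faiss" / "Index.h"
        for libdir in ("lib", "lib64"):
            library = prefix / libdir / "libfaiss.so"
            if header.is_file() and library.is_file():
                # The include dir is the parent of the "faiss" header folder.
                return header.parent.parent, library
    return None


if __name__ == "__main__":
    # Candidate prefixes are assumptions; adjust them to your install location.
    print(find_faiss(["/usr/local", Path.home() / "faiss-install"]))
```

The two paths it reports are the kind of values the find script needs; the variable names to assign them to should be taken from FindFAISS.cmake itself.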

JXuann commented 1 year ago

Ok, thanks. Just to confirm: I need to completely remove the existing tsnecuda (previously installed via conda), build FAISS-gpu and then build tsnecuda from source (as listed here: https://github.com/CannyLab/tsne-cuda/blob/master/INSTALL.md)?

DavidMChan commented 1 year ago

Yep.
