RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.
Describe the bug
The IVF-PQ build of the wiki_all_1M dataset fails on Grace/H200 with the following error:
`CUDA Exception: Warp Illegal Address
Thread 1 "RAFT_IVF_PQ_ANN" received signal CUDA_EXCEPTION_14, Warp Illegal Address.
[Switching focus to CUDA kernel 0, grid 39271, block (821,0,0), thread (160,0,0), device 0, sm 0, warp 15, lane 0]
0x00004002dba2b8b0 in raft::neighbors::ivf_pq::detail::process_and_fill_codes_kernel<256u, 8u, long><<<(8192,1,1),(256,1,1)>>> ()
at /home/scratch.mfoerster_gpu/raft_ws/raft/cpp/include/raft/neighbors/detail/ivf_pq_build.cuh:1164 in _ZN4raft9neighbors6ivf_pq6detail14encode_vectorsILj32ElEclElj inlined from ivf_pq_codepacking.cuh:166
1164 auto t = in_vectors(i, j, k) - pq_centers(partition_ix, k, l);
`
Steps/Code to reproduce bug
The IVF-PQ index build fails both standalone and when run as part of a CAGRA build. To reproduce the standalone failure, run:
RAFT_IVF_PQ_ANN_BENCH --build --force --data_prefix=<datasets> --benchmark_filter=raft_ivf_pq.d64-nlist16K wiki_all_1M.json
Expected behavior
The benchmark should finish without crashing.
Environment details: