rapidsai / raft

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.
https://docs.rapids.ai/api/raft/stable/
Apache License 2.0

ANN_BENCH: AnnGPU::uses_stream() for optional algo GPU sync #2314

Closed achirkin closed 1 month ago

achirkin commented 1 month ago

Introduce a new virtual member uses_stream() for the AnnGPU class. Overriding it allows an algorithm to inform the benchmark whether stream synchronization is needed between benchmark iterations.

This is relevant for a potential persistent kernel, where the CPU threads use an independent mechanism to synchronize with the GPU and retrieve the results. This differs from simply not implementing AnnGPU for an algorithm: the algorithm can decide at runtime, depending on its input parameters, whether the synchronization is needed, while still providing the get_sync_stream() functionality.
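To illustrate the idea, here is a minimal sketch of how such a hook could look. It is not the actual RAFT header. `FakeStream` stands in for `cudaStream_t` so the example compiles without CUDA, and `PersistentKernelAlgo`, `run_iterations`, and the default return value of `true` are assumptions made for illustration only.

```cpp
#include <cassert>

// Stand-in for cudaStream_t so the sketch builds without the CUDA toolkit.
// (Assumption: the real interface works with a CUDA stream.)
struct FakeStream {
  int sync_calls = 0;  // counts how often the benchmark "synchronized"
};

// Simplified interface modeled on the PR description, not the real class.
class AnnGPU {
 public:
  virtual ~AnnGPU() = default;

  // Stream the benchmark may synchronize on between iterations.
  virtual FakeStream* get_sync_stream() const noexcept = 0;

  // Whether the benchmark needs to synchronize that stream.
  // Returning true by default is an assumption of this sketch.
  virtual bool uses_stream() const noexcept { return true; }
};

// Hypothetical algorithm that decides at runtime, from a constructor
// parameter, whether it relies on the stream.
class PersistentKernelAlgo : public AnnGPU {
 public:
  PersistentKernelAlgo(FakeStream* stream, bool persistent)
    : stream_(stream), persistent_(persistent) {}

  FakeStream* get_sync_stream() const noexcept override { return stream_; }

  // In persistent mode, results arrive through an independent mechanism,
  // so no stream synchronization is requested.
  bool uses_stream() const noexcept override { return !persistent_; }

 private:
  FakeStream* stream_;
  bool persistent_;
};

// Hypothetical benchmark loop: synchronizes only when the algorithm asks.
// Returns the total number of synchronizations recorded on the stream.
inline int run_iterations(const AnnGPU& algo, int iterations) {
  assert(iterations >= 0);
  for (int i = 0; i < iterations; ++i) {
    // ... the search call would go here ...
    if (algo.uses_stream()) {
      // Real code would synchronize the CUDA stream at this point.
      algo.get_sync_stream()->sync_calls++;
    }
  }
  return algo.get_sync_stream()->sync_calls;
}
```

In this sketch, the same class skips synchronization when constructed with `persistent = true` and requests it otherwise, while `get_sync_stream()` remains available in both cases.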

achirkin commented 1 month ago

/merge