rapidsai / raft

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.
https://docs.rapids.ai/api/raft/stable/
Apache License 2.0
747 stars 189 forks source link

Remove the shared state and the mutex from NVTX internals #2310

Closed achirkin closed 4 months ago

achirkin commented 4 months ago

Until now, raft has stored a map of NVTX colors (annotation -> color) to avoid using the same color for different annotations and keep using the same color for the same annotations. This map is a shared state. During an extensive ANN_BENCH throughput testing it has turned out that the mutex guarding the map can sometimes become a bottleneck when the number of concurrent threads is really large (>~ 256). This PR replaces the unordered map and the mutex guarding it with a deterministic hash value of the annotation instead (which is stateless).

Pros:

Cons:

achirkin commented 4 months ago

/merge