Open achirkin opened 2 years ago
This code in the RBC algorithm uses existing RAFT primitives to sample a small number of items from an input matrix without replacement. It should be useful for implementing number 5 above (constructing a k-means training set by sampling rows from X without replacement).
This issue has been labeled `inactive-30d` due to no recent activity in the past 30 days. Please close this issue if no further response or action is needed. Otherwise, please respond with a comment indicating any updates or changes to the original issue and/or confirm this issue still needs to be addressed. This issue will be labeled `inactive-90d` if there is no activity in the next 60 days.
You can mark some items as done:
A few issues and potential points for improvement emerged while integrating ivf-flat approximate kNN (https://github.com/rapidsai/raft/pull/652):
[ ] 1. Padding of the data dimensions
We copy the data in ivf_flat while building the index, thus we can pad the data dimensionality `dim` to any vector size. This would improve the search performance compared to the current approach of adapting the vector length `veclen` to the `dim`: https://github.com/rapidsai/raft/blob/e9c0d49943a8c010d19e78a87bb70b1dadfc85ff/cpp/include/raft/spatial/knn/detail/ann_ivf_flat.cuh#L130-L132
_Originally posted by @tfeher in https://github.com/rapidsai/raft/pull/652#discussion_r890970098_
[ ] 2. Consider refactoring away managed allocations in balanced k-means
At the moment, `predict` accesses the pointers on the device only, but `adjust_centers` accesses the pointers on the host only. It would make sense to change the latter to work on the device as well, or switch to explicitly copying the data.
[x] 3. Consider improving `raft::linalg::rowNorm`
At the moment, RAFT's version is slower than the helpers in the PR: https://github.com/rapidsai/raft/blob/e9c0d49943a8c010d19e78a87bb70b1dadfc85ff/cpp/include/raft/spatial/knn/detail/ann_utils.cuh#L255-L262
In progress: https://github.com/rapidsai/raft/pull/1011
[x] 4. Make more flexible versions of matrix primitives
At the moment, some of the helper functions in ann_utils.cuh cannot be replaced with the matching counterparts in raft, because they require different input and output types.
[ ] 5. Use a proper sampling in build_optimized_kmeans
At the moment, we use a simple `cudaMemcpy2DAsync`; proper sampling may be a more robust solution.
[ ] 6. Python wrapper
The current version needs to be updated. Shall we move it from cuml to raft along the way?
[ ] 7. MetricProcessor
MetricProcessor seems to aim at two things: (1) improve performance (speed and quality?) by normalizing the data for some metrics and (2) extend support of metrics without modifying the main kernels (e.g. cosine and dot product similarity are the same for normalized data). However, it modifies input data in place, which may sometimes be avoided. I think we should investigate this: (a) try to avoid modifying data, (b) check where it is really needed for performance (1).
[ ] 8. Processing `NaN`/missing entries
At the moment, we don't do anything special about `NaN` values. Some potential downstream projects (e.g. faiss), as well as end users, may need this. During search, we could impute missing entries in the data using the center vectors of the corresponding clusters. For building, we'd need something more complicated to correctly calculate cluster centers.
[x] 9. Investigate possible rare issues with recall values when `n_probes == n_lists`
The recall should always be 1.0 in this case; currently worked around in https://github.com/rapidsai/raft/pull/766
[ ] 10. Reduce the overheads of the launch configuration logic for IVF-PQ
The logic of selecting the launch parameters in IVF-PQ has become rather complicated and may incur a measurable overhead for a small enough work size. In particular, repeated calls to `cudaGetDeviceProperties` are taxing. Consider re-organizing the batching logic to perform the configuration at most once and caching `cudaDeviceProp` in the raft handle.
_Source: https://github.com/rapidsai/raft/pull/926#discussion_r1019580406_
[x] 11. Migrate to mdspan-based API
Use `mdspan` instead of raw pointers to handle input/output data.
_Source: https://github.com/rapidsai/raft/pull/926#discussion_r1023049567_
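As an illustration of item 1 (padding of the data dimensions), here is a host-only sketch of rounding `dim` up to a multiple of `veclen` and zero-padding each row while copying. This is not the ivf_flat implementation; `padded_dim` and `pad_rows` are hypothetical names, and the row-major `float` layout is an assumption made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: round dim up to the next multiple of veclen.
// Assumes veclen > 0.
constexpr std::size_t padded_dim(std::size_t dim, std::size_t veclen)
{
  return (dim + veclen - 1) / veclen * veclen;
}

// Hypothetical helper: copy a row-major matrix [n_rows, dim] into a
// zero-initialized buffer [n_rows, padded_dim(dim, veclen)].
std::vector<float> pad_rows(const std::vector<float>& data,
                            std::size_t n_rows,
                            std::size_t dim,
                            std::size_t veclen)
{
  const std::size_t pdim = padded_dim(dim, veclen);
  std::vector<float> out(n_rows * pdim, 0.0f);
  for (std::size_t r = 0; r < n_rows; ++r) {
    // The trailing pdim - dim entries of each row stay zero.
    std::copy_n(data.begin() + r * dim, dim, out.begin() + r * pdim);
  }
  return out;
}
```

Zero padding leaves L2 distances and inner products unchanged as long as the queries are padded with zeros in the same way, so the search kernels could then always use the widest vectorized loads at the cost of some extra memory.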