Open choltz95 opened 3 years ago
Hi @choltz95 Thanks for pointing out the scalability issue. The nnk_demo.py code was written as a proof of concept for visualizing the graphs obtained with NNK vs KNN.
For large-scale experiments, I would suggest using the nnk function API in faiss_nnk_neighbors.py
or if optimality is not crucial, the neighborhood definition from approximate_nnk
folder.
Note that, these functions, however, require installing faiss package.
Thank you very much for releasing the code for your paper!
Not serious, but I encounter some scalability issues at pairwise distance computation for > 10k points due to
create_distance_matrix
graph_utils.py L8.Using pdist resolved my issue. Could also do numpy vectorize/broadcasting/einsum to avoid sp dependency. Another "nice-to-have" would be preservation of kernel sparsity (i.e. no dense nxn matrices in memory).