STAC-USC / PyNNK_graph_construction

Graph construction using Non Negative Kernel regression
MIT License
9 stars 3 forks source link

Nested loops not scalable #1

Open choltz95 opened 3 years ago

choltz95 commented 3 years ago

Thank you very much for releasing the code for your paper!

Not serious, but I encounter some scalability issues at pairwise distance computation for > 10k points due to create_distance_matrix graph_utils.py L8.

Using pdist resolved my issue. Could also do numpy vectorize/broadcasting/einsum to avoid sp dependency. Another "nice-to-have" would be preservation of kernel sparsity (i.e. no dense nxn matrices in memory).

shekkizh commented 3 years ago

Hi @choltz95 Thanks for pointing out the scalability issue. The nnk_demo.py code was written as a proof of concept for visualizing the graphs obtained with NNK vs KNN.

For large-scale experiments, I would suggest using the nnk function API in faiss_nnk_neighbors.py or if optimality is not crucial, the neighborhood definition from approximate_nnk folder. Note that, these functions, however, require installing faiss package.