Open RichieHakim opened 2 years ago
I'll look into adding this (though, TBH, I can't promise anything), but I'm also happy to accept a PR to address this.
For future reference (and for anyone who wants to give it a shot), the idea would be to shortcut the logic for nearest neighbors here: https://github.com/CannyLab/tsne-cuda/blob/b740a7d46a07ca9415f072001839fb66a582a3fa/src/fit_tsne.cu#L118
It's not that hard to do, since the rest of the TSNE algorithm only requires a float distance array of size (N x # neighbors) and a similarly shaped array of the nearest neighbor indices.
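To make the shape of those two arrays concrete, here is a rough NumPy sketch that turns a sparse kNN graph in CSR layout into a dense (N x k) distance array and a matching index array. The function name `knn_arrays_from_csr`, the int64 index dtype, and the requirement that every row holds at least k stored neighbors are assumptions made for illustration; none of this is existing tsne-cuda code, and the dtypes the CUDA side actually expects would need to be checked against the source.

```python
import numpy as np

def knn_arrays_from_csr(indptr, indices, data, k):
    """Convert a CSR-layout kNN graph into dense (N, k) distance/index arrays.

    Hypothetical helper: assumes each row stores at least k neighbors.
    Rows with more than k entries keep only the k nearest.
    """
    indptr = np.asarray(indptr)
    indices = np.asarray(indices)
    data = np.asarray(data, dtype=np.float32)
    n = len(indptr) - 1

    knn_dist = np.empty((n, k), dtype=np.float32)
    knn_idx = np.empty((n, k), dtype=np.int64)  # index dtype is an assumption
    for i in range(n):
        start, stop = indptr[i], indptr[i + 1]
        if stop - start < k:
            raise ValueError(f"row {i} has only {stop - start} neighbors, need {k}")
        # Sort this row's stored neighbors by distance, nearest first.
        order = np.argsort(data[start:stop], kind="stable")[:k]
        knn_dist[i] = data[start:stop][order]
        knn_idx[i] = indices[start:stop][order]
    return knn_dist, knn_idx

# Tiny 3-point graph with 2 neighbors per row, written directly in CSR layout.
indptr = [0, 2, 4, 6]
indices = [1, 2, 0, 2, 0, 1]
data = [1.0, 2.0, 1.0, 1.5, 2.0, 1.5]
knn_dist, knn_idx = knn_arrays_from_csr(indptr, indices, data, k=2)
```

With a `scipy.sparse.csr_matrix` named `g`, the three inputs would be `g.indptr`, `g.indices`, and `g.data`.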
The logic for passing arrays is already in place, since we handle pre-initialized T-SNE: see how preinit_data is handled in https://github.com/CannyLab/tsne-cuda/blob/b740a7d46a07ca9415f072001839fb66a582a3fa/src/python/tsnecuda/TSNE.py and how it's parsed into the actual function call in https://github.com/CannyLab/tsne-cuda/blob/b740a7d46a07ca9415f072001839fb66a582a3fa/src/ext/pymodule_ext.cu
All that would have to be done is to create a new option in the options file (just like the pre-init data), https://github.com/CannyLab/tsne-cuda/blob/b740a7d46a07ca9415f072001839fb66a582a3fa/src/include/options.h, and reference it during the main tsne call.
This is still dearly hoped for.
I have the same request.
FEATURE REQUEST:
In https://github.com/CannyLab/tsne-cuda/issues/8, the possibility of using a custom NN matrix is discussed and noted to be 'easy' to implement. DavidMChan: "It would be easy to add the ability to pass in a sparse nearest neighbors matrix, however it becomes more complicated if you want to extract the nearest neighbors from a pre-computed distance matrix."
It would be a significant improvement that would open up a lot of use cases if this were implemented. Specifically: allowing a user to input a custom distance matrix (i.e., a sparse knn_graph) would be amazing. That would be enough for users who already rely on this feature in sklearn's TSNE to port their workflow directly to tsne-cuda.
https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html
metric: str or callable, default=’euclidean’: ... If metric is “precomputed”, X is assumed to be a distance matrix. ...
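For the dense "precomputed" case, the extra step is pulling the k nearest neighbors out of the full (N x N) distance matrix. Below is a minimal NumPy sketch of that step. The function name `knn_from_distance_matrix` is made up for illustration, and excluding each point from its own neighbor list is an assumption; whether tsne-cuda expects the self-match to be included would need to be checked. It also needs the whole matrix in memory, which is likely part of why this path is harder to support at scale.

```python
import numpy as np

def knn_from_distance_matrix(dist, k):
    """Extract (N, k) nearest-neighbor distances and indices from a dense
    (N, N) distance matrix. Hypothetical helper, not part of tsne-cuda."""
    dist = np.asarray(dist, dtype=np.float32)
    if dist.ndim != 2 or dist.shape[0] != dist.shape[1]:
        raise ValueError("dist must be a square (N, N) matrix")
    n = dist.shape[0]
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < N")

    # Assumption: a point is not its own neighbor, so mask the diagonal.
    masked = dist.copy()
    np.fill_diagonal(masked, np.inf)

    # argpartition selects the k smallest entries per row (in no particular
    # order) without sorting the full row.
    part = np.argpartition(masked, k - 1, axis=1)[:, :k]
    part_dist = np.take_along_axis(masked, part, axis=1)

    # Sort just those k candidates so neighbors come out nearest first.
    order = np.argsort(part_dist, axis=1, kind="stable")
    knn_idx = np.take_along_axis(part, order, axis=1)
    knn_dist = np.take_along_axis(part_dist, order, axis=1)
    return knn_dist, knn_idx

# Four points on a line at 0, 1, 3, 7; distances are absolute differences.
x = np.array([0.0, 1.0, 3.0, 7.0])
dist = np.abs(x[:, None] - x[None, :])
knn_dist, knn_idx = knn_from_distance_matrix(dist, k=2)
```

The output has the same (N x k) layout as a sparse kNN graph would give, so both input styles could feed the same downstream arrays.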
Thanks!