elixir-nx / scholar

Traditional machine learning on top of Nx
Apache License 2.0

Add manifold version of LargeVis #272

Open · josevalim opened this issue 3 months ago

josevalim commented 3 months ago

We already have the building blocks for ANN. @krstopro, how much work do you believe it is to also have it for manifold/dimensionality reduction?

krstopro commented 3 months ago

Probably not much (I still need to check), though I am worried about the memory cost of the current implementations. LargeVis dimensionality reduction uses a large num_neighbors (150 in the paper), and RandomProjectionForest.predict/2 takes $O(QTK)$ memory for number of queries $Q$, number of trees $T$, and number of neighbors $K$. In any case, leave it to me.
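For a rough sense of scale (illustrative numbers, not measurements): with the paper's $K = 150$, a hypothetical $T = 50$ trees, and $Q = 10^6$ query points, that buffer alone is $Q \cdot T \cdot K = 7.5 \times 10^9$ entries, on the order of 60 GB at 8 bytes per index.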

By the way, thanks for the shout-out on the blog. :)

krstopro commented 2 months ago

Apparently, this is much harder than I thought. One of the issues is sampling the negative edges of the k-NN graph, i.e. edges that are not in the graph. To do this, the authors allocate and iterate over a rather large table. I am not sure about this approach and I don't think I can come up with something efficiently implementable using Nx.
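For illustration, one table-free alternative that seems expressible in Nx is rejection-style sampling: draw uniform candidate nodes and mask out any that are already neighbors. This is only a sketch, not Scholar code; the module, function, and parameter names (`NegSample`, `sample/3`, `neighbor_idx`, `m`) are made up, and uniform sampling simplifies away the degree-based noise distribution the paper uses.

```elixir
defmodule NegSample do
  import Nx.Defn

  # Sketch: draw `m` uniform candidate targets per node and mark any that
  # collide with a true neighbor as -1, instead of precomputing a large
  # sampling table. Assumes Nx.Random.randint treats the upper bound as
  # exclusive (as in JAX), so ids fall in 0..n-1.
  defn sample(neighbor_idx, key, opts \\ []) do
    opts = keyword!(opts, m: 5)
    n = Nx.axis_size(neighbor_idx, 0)

    {cand, _key} = Nx.Random.randint(key, 0, n, shape: {n, opts[:m]})

    # hit[i][s] is 1 iff candidate s of node i equals one of its k neighbors.
    hit =
      cand
      |> Nx.new_axis(2)
      |> Nx.equal(Nx.new_axis(neighbor_idx, 1))
      |> Nx.any(axes: [2])

    # Collisions become -1; callers can resample or drop those entries.
    Nx.select(hit, -1, cand)
  end
end
```

Memory stays at $O(nm)$ for a handful of negatives per node, at the cost of occasionally discarded samples; whether that trade-off is acceptable here is exactly the open question.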

josevalim commented 2 months ago

Why do you think it would be hard to implement? The part that seems complex is the head traversal, but if we can assume a max depth (not sure if that's the case), it should be parallelizable.

krstopro commented 2 months ago

It's not really about the head traversal; it's more about allocating a table of size 1e8. I'm still checking if this can be avoided.
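(For scale: a $10^8$-entry table is roughly 800 MB assuming 64-bit entries, or 400 MB at 32 bits, before any of the actual graph data.)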

krstopro commented 2 months ago

The problem above might be avoidable, but there is another one that might not be. Namely, the method involves symmetrising the k-NN graph, and this requires computing the reverse k-NN graph, which I don't think I can implement efficiently within Nx.

[Screenshot attached: 2024-06-14 19:03:13]
josevalim commented 1 month ago

The symmetrization seems to be doable with a while loop over $i, j$?

krstopro commented 1 month ago

Even if it is possible, it would be expensive, as we would need to loop over $n \times k$ edges. A bigger issue is how to store the weights. $p_{j|i}$ is an $n \times k$ matrix (or a sparse $n \times n$ matrix), since we take the $k$ nearest neighbors of each of the $n$ points; every node in the $k$-NN graph has out-degree $k$. $p_{i|j}$ requires reversing the $k$-NN graph, where the degree of a node can vary from $0$ (the point is not among the $k$ nearest neighbors of any other point) to $n - 1$ (the point is among the $k$ nearest neighbors of every other point). I cannot think of a way to represent these weights with a tensor.
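For what it's worth, here is a minimal sketch of the dense fallback (hypothetical names: `Symmetrize`, `dense/2`, `neighbor_idx`, `p_cond`), viable only while an $n \times n$ tensor fits in memory. In that regime the reverse graph never needs a ragged representation: scattering $p_{j|i}$ into a dense matrix and adding the transpose yields $p_{i|j}$ implicitly.

```elixir
defmodule Symmetrize do
  import Nx.Defn

  # Sketch: build a dense {n, n} weight matrix P with P[i][j] = p_{j|i}
  # for the k-NN edges, then symmetrise as in t-SNE/LargeVis:
  # p_ij = (p_{j|i} + p_{i|j}) / 2n.
  defn dense(neighbor_idx, p_cond) do
    n = Nx.axis_size(neighbor_idx, 0)
    k = Nx.axis_size(neighbor_idx, 1)

    # One (i, j) index pair per directed edge i -> j.
    rows = Nx.iota({n, k}, axis: 0) |> Nx.flatten()
    cols = Nx.flatten(neighbor_idx)
    idx = Nx.stack([rows, cols], axis: 1)

    # Scatter the conditional weights into the dense matrix.
    p = Nx.indexed_add(Nx.broadcast(0.0, {n, n}), idx, Nx.flatten(p_cond))

    # Adding the transpose contributes p_{i|j} without reversing the graph.
    p |> Nx.add(Nx.transpose(p)) |> Nx.divide(2 * n)
  end
end
```

Of course, at the scales LargeVis targets a dense $n \times n$ tensor is precisely what has to be avoided, so this only side-steps the variable-degree problem for small $n$.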