tkonopka / umap

Uniform Manifold Approximation and Projection - R package

Sparse Matrix support #15

Closed altrabio closed 2 years ago

altrabio commented 3 years ago

First of all, thanks for your excellent work.

I was wondering whether this version supports sparseMatrix objects as input when using method "umap-learn" and, if it does, how to use it.

Thanks

tkonopka commented 3 years ago

That would be a neat feature. At present, it is not possible.

It would be great to incorporate something like that. I will take a look. Also, please feel free to tinker with the code if you're familiar with this topic. It would be a useful addition.

altrabio commented 3 years ago

It looks quite straightforward: a slight modification of umap.prep.input makes it possible. You just need to replace this part of the code:

else { umap.error("input must be a matrix or matrix-compatible\n") }

with something like

else if (!is(d, "Matrix")) { umap.error("input must be a matrix or matrix-compatible\n") }

This works at least with "dgCMatrix" objects from the Matrix package and the euclidean distance.

This also needs some more extensive tests
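As a starting point for such tests, here is a minimal sketch of the kind of check described above. The function prep_input_sketch is a hypothetical stand-in, not the package's real umap.prep.input; the matrix and data.frame branches are assumptions about what the surrounding code does, and only the is(d, "Matrix") test comes from the suggestion in this comment.

```r
library(Matrix)   # provides the sparse classes such as dgCMatrix
library(methods)  # provides is()

# Hypothetical stand-in for the package's input check (assumption:
# the real function has similar matrix / data.frame branches).
prep_input_sketch <- function(d) {
  if (is.matrix(d)) {
    d
  } else if (is.data.frame(d)) {
    as.matrix(d)
  } else if (is(d, "Matrix")) {
    d  # let sparse or dense Matrix objects pass through unchanged
  } else {
    stop("input must be a matrix or matrix-compatible\n")
  }
}

# A small 3 x 2 sparse matrix, stored as a dgCMatrix
sparse_input <- Matrix(c(0, 1, 0, 0, 2, 0), nrow = 3, sparse = TRUE)
checked <- prep_input_sketch(sparse_input)
class(checked)
```

Inputs that are neither matrices, data frames, nor Matrix objects (a character vector, for example) should still hit the error branch.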

Once again, thanks for your work.

tkonopka commented 2 years ago

Got around to some maintenance on the package.

The umap and predict functions should now accept dgCMatrix and dgTMatrix objects.

In the native R implementation (method "naive"), though, the package converts sparse data into ordinary dense matrix objects. The conversion can increase the memory footprint of the data considerably; the sparser the data is to begin with, the larger the relative increase.
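To get a feel for the cost of that conversion, the following sketch compares the size of a sparse matrix with its dense equivalent. The dimensions and the 1% density are illustrative assumptions, and as.matrix only approximates whatever conversion the package performs internally.

```r
library(Matrix)

set.seed(1)
# Illustrative data: a 1000 x 200 matrix with roughly 1% non-zero entries
sparse_data <- rsparsematrix(nrow = 1000, ncol = 200, density = 0.01)

# Dense copy, similar in spirit to the conversion described above
dense_data <- as.matrix(sparse_data)

sparse_bytes <- as.numeric(object.size(sparse_data))
dense_bytes <- as.numeric(object.size(dense_data))

# Ratio of dense to sparse memory use; it grows as the data gets sparser
dense_bytes / sparse_bytes
```

For data of this shape the dense copy is many times larger than the sparse original, so it may be worth checking available memory before passing a very large sparse matrix to the native implementation.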

tkonopka commented 2 years ago

Now addressed in release "CRAN v0.2.8.0"