mlr-org / mlr3pipelines

Dataflow Programming for Machine Learning in R
https://mlr3pipelines.mlr-org.com/
GNU Lesser General Public License v3.0

Support for tSNE: t-Distributed Stochastic Neighbour Embeddings #756

Open m-muecke opened 6 months ago

m-muecke commented 6 months ago

An implementation is available in the Rtsne package: https://github.com/jkrijthe/Rtsne. However, it has no transform/predict method, so a fitted embedding cannot be applied to new data. See the discussion at https://github.com/jkrijthe/Rtsne/issues/6 and this excerpt from the FAQ:

"Once I have a t-SNE map, how can I embed incoming test points in that map? t-SNE learns a non-parametric mapping, which means that it does not learn an explicit function that maps data from the input space to the map. Therefore, it is not possible to embed test points in an existing map (although you could re-run t-SNE on the full dataset). A potential approach to deal with this would be to train a multivariate regressor to predict the map location from the input data. Alternatively, you could also make such a regressor minimize the t-SNE loss directly, which is what I did in this paper.