danaugrs / go-tsne

t-Distributed Stochastic Neighbor Embedding (t-SNE) in Go
BSD 3-Clause "New" or "Revised" License

go-tsne

A Go implementation of t-Distributed Stochastic Neighbor Embedding (t-SNE), a prize-winning technique for dimensionality reduction particularly well suited for visualizing high-dimensional datasets.

Usage

Import this library:

import "github.com/danaugrs/go-tsne/tsne"

Create the TSNE object:

t := tsne.NewTSNE(2, 300, 100, 300, true)

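The parameters are, in order: the number of output dimensions, the perplexity, the learning rate, the maximum number of iterations, and whether to print verbose output.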

There are two ways to start the t-SNE embedding optimization. The regular way is to provide an n by d matrix where each row is a datapoint and each column is a dimension:

Y := t.EmbedData(X, nil)
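If the data starts out as a slice of rows, it has to be packed into a matrix first. The sketch below assumes X is built with gonum's mat.NewDense(n, d, data), which takes a row-major backing slice; flattenRows is a hypothetical helper, not part of go-tsne, and it uses only the standard library so it can be run on its own:

```go
package main

import (
	"errors"
	"fmt"
)

// flattenRows (hypothetical helper) packs equal-length rows into one
// row-major slice, the layout mat.NewDense(n, d, data) expects.
func flattenRows(rows [][]float64) (n, d int, data []float64, err error) {
	if len(rows) == 0 || len(rows[0]) == 0 {
		return 0, 0, nil, errors.New("need at least one row and one column")
	}
	n, d = len(rows), len(rows[0])
	data = make([]float64, 0, n*d)
	for i, row := range rows {
		if len(row) != d {
			return 0, 0, nil, fmt.Errorf("row %d has %d values, want %d", i, len(row), d)
		}
		data = append(data, row...)
	}
	return n, d, data, nil
}

func main() {
	// Two datapoints (rows) with three dimensions (columns) each.
	rows := [][]float64{
		{0.0, 1.0, 2.0},
		{3.0, 4.0, 5.0},
	}
	n, d, data, err := flattenRows(rows)
	if err != nil {
		panic(err)
	}
	fmt.Println(n, d, data)
	// With gonum imported, the matrix would then be: X := mat.NewDense(n, d, data)
}
```

Rejecting ragged input up front avoids silently shifting values into the wrong column.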

The alternative is to provide a distance matrix directly:

Y := t.EmbedDistances(D, nil)
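As a rough illustration of what a distance matrix looks like, the sketch below computes pairwise squared Euclidean distances between rows. Whether EmbedDistances expects squared or plain distances is not stated here, so the squaring is an assumption to check against the library source; pairwiseSquaredDistances is a hypothetical helper, and the result would still need to be packed into a matrix before being passed as D:

```go
package main

import "fmt"

// pairwiseSquaredDistances (hypothetical helper) returns a symmetric n by n
// table where entry [i][j] is the squared Euclidean distance between
// points[i] and points[j]. All rows are assumed to have the same length.
func pairwiseSquaredDistances(points [][]float64) [][]float64 {
	n := len(points)
	D := make([][]float64, n)
	for i := range D {
		D[i] = make([]float64, n)
	}
	for i := 0; i < n; i++ {
		for j := i + 1; j < n; j++ {
			var sum float64
			for k := range points[i] {
				diff := points[i][k] - points[j][k]
				sum += diff * diff
			}
			// The matrix is symmetric, so fill both halves at once.
			D[i][j] = sum
			D[j][i] = sum
		}
	}
	return D
}

func main() {
	points := [][]float64{{0, 0}, {3, 4}, {6, 8}}
	for _, row := range pairwiseSquaredDistances(points) {
		fmt.Println(row)
	}
}
```

Supplying distances directly is mainly useful when the data has no natural coordinate representation, or when a metric other than Euclidean distance is wanted.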

In either case, the returned matrix Y will contain the final embedding.

For more fine-grained control, a step function can be provided in either case:

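// Requires the "fmt" and "gonum.org/v1/gonum/mat" imports.
Y := t.EmbedData(X, func(iter int, divergence float64, embedding mat.Matrix) bool {
  // Log progress on every iteration.
  fmt.Printf("Iteration %d: divergence is %v\n", iter, divergence)
  // Returning false lets the optimization continue.
  return false
})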

The step function has access to the iteration, the current divergence, and the embedding optimized so far. You can return true to halt the optimization.
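One possible use of the return value is stopping early once the divergence stops improving. The sketch below is only an illustration of that idea: newPlateauStopper, its tolerance, and its patience count are all hypothetical and not part of go-tsne, and the divergence values in main are made up to exercise the logic:

```go
package main

import (
	"fmt"
	"math"
)

// newPlateauStopper (hypothetical helper) returns a function that reports
// true once the divergence has failed to improve on its best value by more
// than tol for `patience` consecutive calls.
func newPlateauStopper(tol float64, patience int) func(iter int, divergence float64) bool {
	best := math.Inf(1)
	stale := 0
	return func(iter int, divergence float64) bool {
		if best-divergence > tol {
			// Meaningful improvement: record it and reset the counter.
			best = divergence
			stale = 0
			return false
		}
		stale++
		return stale >= patience
	}
}

func main() {
	stop := newPlateauStopper(1e-3, 2)
	// Made-up divergence values standing in for successive iterations.
	divergences := []float64{5.0, 4.0, 3.5, 3.4999, 3.4998, 3.4997}
	for iter, div := range divergences {
		if stop(iter, div) {
			fmt.Printf("halting at iteration %d\n", iter)
			break
		}
	}
}
```

Inside a step function passed to EmbedData or EmbedDistances, the equivalent would be to end with `return stop(iter, divergence)`. Suitable values for the tolerance and patience depend on the dataset and would need tuning.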

Examples

Two examples are provided: mnist2d and mnist3d. Both use the same data, a subset of MNIST with 2500 handwritten digits. mnist2d generates plots throughout the optimization process, and mnist3d shows the optimization happening in real time, in 3D. mnist3d depends on G3N. To run an example, cd to the example's directory, build it, and execute it, e.g.:

cd examples/mnist2d
go build
./mnist2d

Support

I hope you enjoy using and learning from go-tsne as much as I enjoyed writing it.

If you come across any issues, please report them.