lmweber / FlowSOM-Rtsne-example

Worked example showing how to cluster and visualize flow/mass cytometry data using FlowSOM and Rtsne
MIT License

FlowSOM-Rtsne example

This repository contains a worked example showing how to cluster and visualize a mass cytometry (CyTOF) data set, using FlowSOM for clustering and Rtsne for visualization.

FlowSOM

FlowSOM is an R/Bioconductor package for clustering flow cytometry and mass cytometry (CyTOF) data (see paper and Bioconductor package). The clustering algorithm is based on self-organizing maps and hierarchical consensus meta-clustering.
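
The two stages described above (a self-organizing map followed by consensus meta-clustering) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: the data are synthetic stand-ins for transformed marker expression, the names `x`, `ff`, `fsom`, `som_labels`, `meta`, and `labels` are hypothetical, and the choice of `k = 3` and a 10 x 10 grid is an assumption made for the toy data. It assumes the `flowCore` and `FlowSOM` packages are installed; function behavior may differ between FlowSOM versions.

```r
library(flowCore)
library(FlowSOM)

# Synthetic stand-in for transformed marker expression (illustrative only):
# 3000 "cells" x 10 "markers", drawn from three shifted Gaussian populations.
set.seed(1)
n_per_group <- 1000
n_markers <- 10
x <- rbind(
  matrix(rnorm(n_per_group * n_markers, mean = 0), ncol = n_markers),
  matrix(rnorm(n_per_group * n_markers, mean = 3), ncol = n_markers),
  matrix(rnorm(n_per_group * n_markers, mean = 6), ncol = n_markers)
)
colnames(x) <- paste0("marker", seq_len(n_markers))

# FlowSOM reads its input from a flowFrame object
ff <- flowCore::flowFrame(x)

# Stage 1: self-organizing map. Each cell is assigned to one of
# xdim * ydim = 100 grid nodes.
set.seed(1234)
fsom <- FlowSOM::ReadInput(ff, transform = FALSE, scale = FALSE)
fsom <- FlowSOM::BuildSOM(fsom, colsToUse = seq_len(ncol(x)),
                          xdim = 10, ydim = 10)
som_labels <- fsom$map$mapping[, 1]

# Stage 2: consensus meta-clustering merges the SOM nodes into k clusters
k <- 3  # assumed number of populations for this toy data
meta <- FlowSOM::metaClustering_consensus(fsom$map$codes, k = k, seed = 1234)

# Map each cell's SOM node to its meta-cluster label
labels <- meta[som_labels]
table(labels)
```

For real data, `k` and the grid size would need to be chosen to suit the number of expected cell populations; the values here are only convenient for the toy example.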

We previously showed that FlowSOM performs well for clustering high-dimensional CyTOF data and, in particular, has very fast runtimes (see paper published in Cytometry Part A and code repository on GitHub).

Rtsne and t-SNE

Rtsne is an R implementation of the popular t-SNE algorithm (see t-SNE algorithm page, Rtsne development page, and Rtsne package on CRAN).

The t-SNE algorithm projects high-dimensional data to 2 or 3 dimensions for visualization. This is conceptually similar to principal component analysis (PCA). However, t-SNE is non-linear, whereas PCA is linear, so t-SNE can capture structure that a linear projection misses. This makes it better suited to many types of biological data.

On a t-SNE plot of flow or mass cytometry data, points near each other can be interpreted as belonging to the same or similar cell populations. However, the precise distances in the plot are not meaningful, so care should be taken not to over-interpret the plot. The algorithm also starts from a random initialization, so unless a random seed is set (as in this example), each run will look slightly different.
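
To illustrate the point about the random seed, the following sketch runs `Rtsne` on synthetic data with a fixed seed and checks that repeating the run with the same seed gives the same 2-dimensional layout. The data, the helper `run_tsne`, and the seed and perplexity values are assumptions made for illustration; they are not taken from the repository's script. It assumes the `Rtsne` package is installed and run with its default single-threaded settings.

```r
library(Rtsne)

# Synthetic stand-in for transformed marker expression (illustrative only):
# 300 "cells" x 10 "markers", drawn from three shifted Gaussian populations.
set.seed(1)
n_per_group <- 100
n_markers <- 10
x <- rbind(
  matrix(rnorm(n_per_group * n_markers, mean = 0), ncol = n_markers),
  matrix(rnorm(n_per_group * n_markers, mean = 3), ncol = n_markers),
  matrix(rnorm(n_per_group * n_markers, mean = 6), ncol = n_markers)
)
group <- rep(1:3, each = n_per_group)

# Hypothetical helper: set the seed, then project to 2 dimensions.
# Rtsne() returns a list; the embedding is the matrix in element Y.
run_tsne <- function(x, seed) {
  set.seed(seed)
  Rtsne(x, dims = 2, perplexity = 30, check_duplicates = FALSE)$Y
}

y1 <- run_tsne(x, seed = 123)
y2 <- run_tsne(x, seed = 123)  # same seed as y1
y3 <- run_tsne(x, seed = 456)  # different seed; layout is expected to differ

dim(y1)                    # one row per cell, two columns
isTRUE(all.equal(y1, y2))  # compare the two same-seed runs
isTRUE(all.equal(y1, y3))  # compare runs with different seeds

# Plot the embedding, colored by the known synthetic population
plot(y1, col = group, pch = 16, xlab = "t-SNE 1", ylab = "t-SNE 2")
```

When comparing runs with different seeds, the overall grouping of points is usually similar, but the positions and orientation of the groups on the plot can change, which is one reason not to read meaning into exact distances.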

Contents

The repository contains the following files.