satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

FindClusters algorithm #280

Closed igordot closed 6 years ago

igordot commented 6 years ago

The primary Seurat functions tend to have a good explanation either in the documentation or in the various vignettes. It seems like the FindClusters() algorithm parameter is important, but I could not find much info on the different options.

According to the docs:

Algorithm for modularity optimization (1 = original Louvain algorithm; 2 = Louvain algorithm with multilevel refinement; 3 = SLM algorithm).

For a full description of the algorithms, see Waltman and van Eck (2013) The European Physical Journal B.
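
As context for the three options: each one is a heuristic search for a partition with high modularity, so they differ in search strategy rather than in the quantity being scored. The sketch below is only an illustration of that score on a toy unweighted graph; it is not Seurat's or ModularityOptimizer's code, and the function name `modularity`, its `resolution` argument, and the example graph are assumptions made up for this example.

```python
from collections import defaultdict


def modularity(edges, membership, resolution=1.0):
    """Modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    membership: dict mapping node -> community label.
    resolution: multiplier on the expected-edges term (hypothetical name).
    """
    m = len(edges)
    intra = defaultdict(int)       # edges with both ends in the same community
    degree_sum = defaultdict(int)  # total degree of the nodes in each community
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            intra[membership[u]] += 1
    # Per community: observed fraction of edges minus the expected fraction.
    return sum(
        intra[c] / m - resolution * (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Toy graph: two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_cluster = {node: "a" for node in range(6)}

q_split = modularity(edges, two_clusters)
q_merged = modularity(edges, one_cluster)
print(q_split, q_merged)
```

Splitting the graph into its two triangles scores higher than lumping all nodes together, and that is the kind of improvement Louvain, Louvain with multilevel refinement, and SLM each search for by moving nodes between communities.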

The referenced paper is the one that introduces SLM, so its conclusion may be somewhat biased:

Based on an analysis involving 13 small and medium sized networks and six large and very large networks (with up to 40 million nodes and up to 800 million edges), we conclude that our SLM algorithm consistently outperforms the original Louvain algorithm and the Louvain algorithm with multilevel refinement.

There are also some previous discussions here (https://github.com/satijalab/seurat/issues/75 and https://github.com/satijalab/seurat/issues/214) that suggest that SLM may be better, but Louvain is still the default in v2.1. Is that primarily to reduce computing time or are there other drawbacks to SLM?

satijalab commented 6 years ago

The choice is up to the user. We default to the original Louvain algorithm because that is the default in ModularityOptimizer.