hemberg-lab / scRNA.seq.course

Analysis of single cell RNA-seq data course
https://www.singlecellcourse.org
GNU General Public License v3.0

Clustering and Marker Gene Identification: SC3 vs Seurat #151

Closed: igordot closed this issue 2 years ago

igordot commented 5 years ago

In section 10.4, you have the following notes:

Clustering and Marker Gene Identification

  • ≤ 5000 cells : SC3
  • > 5000 cells : Seurat

I haven't seen that elsewhere. Is there a specific reason for the difference?

mhemberg commented 5 years ago

SC3 scales poorly with the number of cells in your sample: beyond roughly 5k cells it is quite slow and requires a significant amount of memory. Seurat is significantly faster, but according to our benchmarks it is less accurate for small datasets. If you have a fast machine or little patience, you may want to adjust the recommended threshold.
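As a rough illustration of the rule of thumb above, here is a minimal Python sketch that picks a tool from the cell count. The function name `recommend_clustering_tool` and its `threshold` parameter are hypothetical and not part of SC3, Seurat, or the course material. The default of 5000 is just the course's suggested cutoff, which you could raise on a fast machine or lower if run time matters more.

```python
def recommend_clustering_tool(n_cells: int, threshold: int = 5000) -> str:
    """Return "SC3" at or below the threshold and "Seurat" above it.

    Hypothetical helper: the threshold is a rule of thumb, not a hard limit.
    """
    if n_cells < 0:
        raise ValueError("n_cells must be non-negative")
    # SC3 is recommended for small datasets; Seurat for larger ones.
    return "SC3" if n_cells <= threshold else "Seurat"


if __name__ == "__main__":
    for n in (800, 5000, 20000):
        print(n, recommend_clustering_tool(n))
```

Passing a different `threshold`, for example `recommend_clustering_tool(8000, threshold=10000)`, mirrors the suggestion to adjust the cutoff to your hardware and patience.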

igordot commented 5 years ago

Thank you for clarifying. I also noticed poor clustering with low cell numbers, but it could be just due to the data itself. Are your benchmarks internal or is it possible to see the differences between SC3 and Seurat clustering somewhere?

tallulandrews commented 5 years ago

Some of the benchmarks are available in the SC3 paper, see: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5410170/ Seurat has been updated a few times since then, but our piecemeal checks of the above on newer versions suggest the updates haven't radically improved its performance on small datasets.