dpeerlab / SEACells

SEACells algorithm for Inference of transcriptional and epigenomic cellular states from single-cell genomics data
GNU General Public License v2.0

Handling metacell creation for rare cell types (<< 75 cells) #61

Open uqzqiao opened 6 months ago

uqzqiao commented 6 months ago

Thanks for developing this incredibly inspiring tool!

I am currently analyzing a large scRNA-seq dataset consisting of hundreds of samples, each with ~3,000 cells. Following the methodology outlined in your paper, I am creating SEACell metacells per sample before integrating all samples. With the rule of 75 cells per metacell, that comes to roughly 40 metacells per sample. After annotating cell types using Azimuth (L2), scPred, or CellTypist, which identified more than 30 distinct cell types, I've noticed that some rare cell types consist of fewer than 10 cells per sample. As a result, metacells often capture more than one cell type per individual, which reduces purity.
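To make the problem concrete, here is a minimal sketch of the arithmetic and of a per-metacell purity check. It uses plain pandas rather than any SEACells function. The helper names `n_metacells` and `metacell_purity`, the column names `SEACell` and `celltype`, and the toy data are all assumptions for illustration only.

```python
import pandas as pd


def n_metacells(n_cells: int, cells_per_metacell: int = 75) -> int:
    """Number of metacells implied by a target of `cells_per_metacell` cells each."""
    return max(1, round(n_cells / cells_per_metacell))


def metacell_purity(obs: pd.DataFrame,
                    metacell_col: str = "SEACell",
                    label_col: str = "celltype") -> pd.DataFrame:
    """Per metacell: size, most frequent label, and the fraction of cells carrying it."""
    grouped = obs.groupby(metacell_col)[label_col]
    return pd.DataFrame({
        "n_cells": grouped.size(),
        "dominant_type": grouped.agg(lambda s: s.value_counts().idxmax()),
        "purity": grouped.agg(lambda s: s.value_counts(normalize=True).max()),
    })


# Toy data (hypothetical): one metacell that absorbs 5 rare cells, one pure metacell.
obs = pd.DataFrame({
    "SEACell": ["SEACell-0"] * 75 + ["SEACell-1"] * 75,
    "celltype": ["T cell"] * 70 + ["pDC"] * 5 + ["B cell"] * 75,
})

# Metacell count for a ~3,000-cell sample at 75 cells per metacell.
print(n_metacells(3000))

purity = metacell_purity(obs)
print(purity)
```

In this toy case the rare type never becomes the dominant label of any metacell, so it would disappear entirely if metacells were labeled by majority vote.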

In this scenario, how should I approach metacell creation to balance metacell purity against keeping rare cell types represented? Should I limit the creation of metacells to cell types with a larger number of cells? Additionally, did the COVID dataset discussed in your paper contain more cells per individual, which might mitigate this issue?

Thank you in advance for your guidance!