tilschaef / scRNA-seq

From fastq to preprocessed count table (for the in-house CELSeq2 method), with the Kallisto | Bustools workflow.

Doublet selection #14

Open Rebecza opened 3 years ago

Rebecza commented 3 years ago

Considering: droplet-based scRNA-seq methods in particular carry a relatively high chance of producing doublets in the dataset.

Doublets: two (or more?) cells encapsulated in the same droplet, where one would expect only a single cell.

CellRanger has a built-in method to estimate the number of doublets, based on the expected doublet rate for a given number of cells loaded in the sample? (Not sure how this works exactly.)
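To make the "chance given the number of cells loaded" idea concrete, here is a minimal sketch. It is not how CellRanger computes anything (that is still an open question above). It assumes the commonly quoted rule of thumb for 10X Chromium of roughly 0.8% doublets per 1,000 cells recovered, scaling linearly; the function names and the default rate are hypothetical and should be checked against the user guide for the chemistry actually used.

```python
def expected_doublet_rate(n_cells_recovered: int, rate_per_1000: float = 0.008) -> float:
    """Rough expected fraction of doublets in a droplet-based run.

    ASSUMPTION: the rate grows linearly with the number of cells recovered,
    at about 0.8% per 1,000 cells (a rule of thumb, not a value from this repo).
    """
    if n_cells_recovered < 0:
        raise ValueError("n_cells_recovered must be non-negative")
    # Cap at 1.0 so that absurdly large inputs still give a valid fraction.
    return min(1.0, rate_per_1000 * n_cells_recovered / 1000)


def expected_doublet_count(n_cells_recovered: int, rate_per_1000: float = 0.008) -> int:
    """Expected number of doublet barcodes, rounded to the nearest integer."""
    rate = expected_doublet_rate(n_cells_recovered, rate_per_1000)
    return round(rate * n_cells_recovered)


if __name__ == "__main__":
    for n in (1000, 5000, 10000):
        print(n, expected_doublet_rate(n), expected_doublet_count(n))
```

A number like this only says how many doublets to expect in a sample, not which barcodes they are; it could serve as a sanity check on whatever per-cell method ends up in the pipeline.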

There are also several separate methods that try to estimate which droplet/cell entries might have contained doublets (Gert Jan has used one, which we could have a look at). I used to check only the nCounts/nFeature distributions over the UMAP, to get an idea of whether certain clusters are being formed on the basis of these differences. (Doublets can still be a problem with FACS-based methods too, although you select against them in the sorting procedure.)
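As a numeric complement to eyeballing nCounts over the UMAP, a crude first pass could flag cells whose total counts are unusually high. This is only a sketch: it uses a median + MAD threshold on log counts, which is a generic outlier heuristic and not one of the dedicated doublet-detection methods mentioned above. The function name and the `n_mads` default are hypothetical, and high counts alone do not prove a doublet (large or very active cells look the same).

```python
import math
import statistics


def flag_high_count_cells(total_counts, n_mads=3.0):
    """Return one boolean per cell: True if its total count is suspiciously high.

    A cell is flagged when log1p(count) exceeds median + n_mads * MAD,
    where MAD is the median absolute deviation of the log counts.
    """
    log_counts = [math.log1p(c) for c in total_counts]
    med = statistics.median(log_counts)
    mad = statistics.median([abs(x - med) for x in log_counts])
    threshold = med + n_mads * mad
    return [x > threshold for x in log_counts]


if __name__ == "__main__":
    # Toy example: six similar cells and one with ~2.5x the counts.
    counts = [1000, 1100, 950, 1050, 990, 1020, 2500]
    print(flag_high_count_cells(counts))
```

The same function could be run on nFeature values; flagged cells can then be coloured on the UMAP to see whether they pile up in specific clusters.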

However, there is apparently no real consensus in the field yet on how to identify doublets properly.

I do think it is important to include doublet detection in the pipeline (in a later release of s2s), since one encounters this problem especially in 10X experiments.