MarioniLab / scran

Clone of the Bioconductor repository for the scran package.
https://bioconductor.org/packages/devel/bioc/html/scran.html

Add 'known doublet' mode to doubletCells() #56

Closed PeteHaitch closed 4 years ago

PeteHaitch commented 4 years ago

Following up from our Slack convo:

Any thoughts on a way to tell scran::doubletCells() 'I know these droplets are doublets, treat them accordingly'? Or another way of approaching a dataset where I know some droplets are doublets (based on HTOs and genotypes) and want to use 'guilt by association', as suggested in https://osca.bioconductor.org/doublet-detection.html#doublet-detection-in-multiplexed-experiments?

LTLA commented 4 years ago

It is done: 14849bf106f840b8cc9df766a54211cbf3d7454d. Note that it literally just skips the simulation step, replacing the simulated doublets with the known doublets and continuing on with the same score calculation.
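
As a rough sketch of the idea only (not the code in that commit, and not scran's actual score): if the known doublets stand in for the simulated ones, each cell can be scored by how many known doublets sit in its neighbourhood relative to all cells there. The function name `known_doublet_scores`, the fixed `radius`, and the plain ratio are all assumptions made for illustration.

```python
import numpy as np

def known_doublet_scores(coords, is_known_doublet, radius):
    """Hypothetical helper: fraction of cells within `radius` that are known doublets.

    coords           -- (n_cells, n_dims) array, e.g. PC coordinates (assumed input)
    is_known_doublet -- boolean vector of length n_cells
    radius           -- neighbourhood radius (illustrative; no default is implied)
    """
    coords = np.asarray(coords, dtype=float)
    is_known_doublet = np.asarray(is_known_doublet, dtype=bool)

    # Pairwise Euclidean distances between all cells (fine for small inputs).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Each cell counts as its own neighbour, so the denominator is never zero.
    within = dist <= radius
    n_all = within.sum(axis=1)
    n_doublet = (within & is_known_doublet[None, :]).sum(axis=1)
    return n_doublet / n_all
```

A cell that is not labelled as a doublet but sits next to one gets a non-zero score, while cells far from any known doublet score zero.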

I toyed with the idea of allowing users to pass in a separate matrix of counts for doublets. This might still be possible but it needs careful consideration about the interaction with size.factors.norm. I suspect it will not be safe to have both a separate matrix and size.factors.norm supplied.

LTLA commented 4 years ago

I ended up taking out known.doublets in favor of an explicit doubletRecovery() function, to avoid the assumption of random doublet formation used in the doubletCells() score calculation. Enjoy.
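
For context, one common way to express 'guilt by association' without assuming random doublet formation is to take, for each cell, the proportion of its k nearest neighbours that are known doublets. The sketch below illustrates that general idea under stated assumptions; it does not describe the internals of doubletRecovery(), and the name `knn_doublet_proportion` and its arguments are hypothetical.

```python
import numpy as np

def knn_doublet_proportion(coords, is_known_doublet, k):
    """Hypothetical helper: share of each cell's k nearest neighbours that are known doublets."""
    coords = np.asarray(coords, dtype=float)
    is_known_doublet = np.asarray(is_known_doublet, dtype=bool)

    # Brute-force pairwise distances; a real analysis would use a proper kNN index.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a cell is not its own neighbour

    # Indices of the k closest other cells for every cell.
    nearest = np.argsort(dist, axis=1)[:, :k]
    return is_known_doublet[nearest].mean(axis=1)
```

Unlabelled cells with a high proportion are candidates for doublets that the HTO or genotype calls missed; where to draw the threshold is left to the analyst.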

PeteHaitch commented 4 years ago

Cheers, I'll take it for a spin.