Usage between swappedDrops and chimericDrops ...

MarioniLab / DropletUtils

Clone of the Bioconductor repository for the DropletUtils package.

https://bioconductor.org/packages/devel/bioc/html/DropletUtils.html

Usage between swappedDrops and chimericDrops ... #86

Open drkoryjohns opened 2 years ago

drkoryjohns commented 2 years ago

Hello,

Thank you for making the DropletUtils tool suite available. I have a question. I would like to use your tools for pre-processing in the following order: swappedDrops -> chimericDrops -> emptyDrops -> maximumAmbience -> downsampleReads. However, swappedDrops and chimericDrops both take the molecule information (mol.info) file as input and both output a cleaned count matrix. How can I correct for swapped drops and then for chimeric drops when both tools require the same starting input? The same seems to be true of removeAmbience. emptyDrops, meanwhile, requires the ambient estimate to identify and remove empty drops, which is why I am looking at using maximumAmbience after emptyDrops. What I need is clarity and guidance on using swappedDrops and chimericDrops together: how can I apply both, in the correct order, so that pre-processing corrects for both of the events you describe as important?

Thanks in advance for your time and feedback.

Best,

Kory Johnson

drkoryjohns commented 2 years ago

I saw this reply in an earlier issue:

For barcode swapping: the swappedDrops just loads the data from the molinfo file and then calls removeSwappedDrops. The latter function doesn't care where the data comes from, so you can just call it directly with your data. The same idea applies with chimericDrops, which just calls removeChimericDrops.

I will explore doing this.
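The direct call described in that reply might look roughly like the sketch below, shown for the chimeric case. It assumes DropletUtils is installed; `mol_info_path` and the wrapper name `remove_chimeras_directly` are hypothetical, and the argument names follow the documented `read10xMolInfo` and `removeChimericDrops` interfaces as I understand them, so check them against the installed version.

```r
# Sketch only: `mol_info_path` is a hypothetical path to a 10x
# molecule information HDF5 file produced by CellRanger.
remove_chimeras_directly <- function(mol_info_path, min.frac = 0.8) {
    # Load the per-molecule data, which is the step chimericDrops()
    # performs before handing over to removeChimericDrops().
    mol <- DropletUtils::read10xMolInfo(mol_info_path)
    d <- mol$data

    # Call the lower-level function on the molecule-level vectors.
    # Any vectors of matching length could be supplied here instead.
    DropletUtils::removeChimericDrops(
        cells = d$cell,
        umis = d$umi,
        genes = d$gene,
        nreads = d$reads,
        ref.genes = mol$genes,
        min.frac = min.frac
    )
}
```

Because the vectors are passed explicitly, they can be subset or otherwise modified before the call, which is what makes the direct route more flexible than the file-based wrapper.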

LTLA commented 2 years ago

I don't think it's possible to run both of them. I don't use chimericDrops other than for testing, so it was never a consideration.

If you really, really need to do this, I suppose you could call the internal DropletUtils:::find_swapped or DropletUtils:::find_chimeric functions. The first output should be a vector specifying the molecules that are not swapped/chimeric; you can use that to filter the molecule information vectors for input into the next function.

If that is useful, I may consider adding an option to return these logical vectors. Otherwise, the :::'d functions should be considered internal and subject to change, so use at your own risk.
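The filtering step itself can be sketched in base R. The example below does not call the internal functions, whose signatures are not given in this thread; `keep` is a hypothetical stand-in for the logical vector described above, and the molecule values are made up for illustration.

```r
# Toy molecule-level vectors for one sample (made-up values).
cells  <- c("AAAC", "AAAC", "AAAG", "AAAT", "AAAT")
umis   <- c(11L, 12L, 11L, 13L, 14L)
genes  <- c(1L, 2L, 2L, 3L, 1L)
nreads <- c(5L, 1L, 7L, 2L, 9L)

# Hypothetical stand-in for the first output of the internal function:
# TRUE marks a molecule that is not swapped/chimeric and is retained.
keep <- c(TRUE, FALSE, TRUE, TRUE, TRUE)

# All vectors must describe the same molecules in the same order.
stopifnot(length(keep) == length(cells))

# Apply the same filter to every vector so they stay aligned.
filtered <- list(
    cells  = cells[keep],
    umis   = umis[keep],
    genes  = genes[keep],
    nreads = nreads[keep]
)
```

The filtered vectors would then serve as the input to the next removal function in place of the unfiltered molecule information.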

drkoryjohns commented 2 years ago

Thanks for getting back to me so quickly. I will proceed with the following workflow:

swappedDrops -> emptyDrops -> ambient RNA removal with SoupX (using the post-swappedDrops matrix as "raw" and the post-emptyDrops matrix as "filtered") -> downsampleMatrix -> Seurat SCT reference integration workflow -> clustering over a range of resolutions -> clustree to pick the subjectively best resolution -> doublet cluster detection and doublet curation -> repeat clustering with MultiK over a range for non-subjective resolution selection, then final clustering, marker selection, and characterization.

I want to use DropletUtils up front, of course, to prevent garbage in, garbage out. I saw and understand the reply you gave to another post regarding downsampling versus scaling, and I will see how much the choice between the two affects the results. Any additional comments are most welcome.

Thanks again.

Best, Kory
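The first two DropletUtils steps of this plan could be sketched as follows. This is an illustration under assumptions rather than a recommended implementation: `mol_info_paths` and the function name are hypothetical, the FDR threshold of 0.001 is an assumed default that should be tuned, and the SoupX, downsampling, and Seurat steps are left out.

```r
# Sketch only: `mol_info_paths` is a hypothetical character vector of
# molecule information HDF5 files, one per sample in a multiplexed pool.
call_cells_after_swapping <- function(mol_info_paths, fdr = 0.001, seed = 100) {
    # Step 1: remove swapped molecules across the multiplexed samples.
    swapped <- DropletUtils::swappedDrops(mol_info_paths)

    # emptyDrops() relies on Monte Carlo simulation, so fix the seed.
    set.seed(seed)

    lapply(swapped$cleaned, function(raw) {
        # Step 2: test each barcode against the ambient profile.
        e.out <- DropletUtils::emptyDrops(raw)

        # which() skips barcodes whose FDR is NA (those at or below
        # the ambient count threshold are not tested).
        is.cell <- which(e.out$FDR <= fdr)

        # "raw" and "filtered" are the two matrices the plan above
        # intends to pass on to the ambient RNA removal step.
        list(raw = raw, filtered = raw[, is.cell, drop = FALSE])
    })
}
```

The function returns one `raw`/`filtered` pair per sample, so each sample can be passed separately to the downstream ambient correction.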