Add genetic demultiplexing comparison

jashapiro commented 2 years ago

This PR adds a notebook for some quick comparisons between cellhash multiplexing and genetic multiplexing.

The notebook is designed to be run from the 15a-render-demux.R script, which loops through a set of samples for which we have both cellhash data and genetic demux results from cellsnp/vireo.

The comments at the moment are minimal, partly because the nature of running the same notebook on multiple samples makes individual commentary difficult.

That said, the overall result is that genetic demux seems to work for many samples, and the results based on comparison to UMAP clusters seem pretty consistent.

Analyses that may still be missing include exploring the potential relationship between sequencing depth and call quality, and examining different thresholds for making calls.
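As a rough illustration of the threshold comparison mentioned above, the sketch below tabulates how often cellhash and genetic calls agree as a minimum call probability is raised. It is written in Python only for compactness and is not code from the notebook. The records, sample names, probabilities, and function names (`agreement_at`, `confusion`) are all hypothetical.

```python
from collections import Counter

# Hypothetical per-cell records, NOT real results:
# (cellhash call, genetic call, probability assigned to the genetic call)
cells = [
    ("sampleA", "sampleA", 0.99),
    ("sampleA", "sampleA", 0.95),
    ("sampleB", "sampleB", 0.97),
    ("sampleB", "sampleA", 0.62),
    ("sampleA", "sampleB", 0.55),
    ("sampleB", "sampleB", 0.81),
]


def agreement_at(threshold, records):
    """Return (n_called, fraction_agreeing) for genetic calls with prob >= threshold.

    The fraction is None when no cell passes the threshold.
    """
    kept = [(hash_call, gen_call) for hash_call, gen_call, prob in records
            if prob >= threshold]
    if not kept:
        return 0, None
    n_agree = sum(hash_call == gen_call for hash_call, gen_call in kept)
    return len(kept), n_agree / len(kept)


def confusion(records):
    """Count cells for each (cellhash call, genetic call) pair."""
    return Counter((hash_call, gen_call) for hash_call, gen_call, _ in records)


if __name__ == "__main__":
    # Illustrative thresholds only; sensible values depend on the data.
    for threshold in (0.5, 0.75, 0.9):
        n_called, frac = agreement_at(threshold, cells)
        frac_text = "n/a" if frac is None else f"{frac:.2f}"
        print(f"threshold={threshold}: {n_called} cells called, agreement={frac_text}")
```

Raising the threshold trades the number of cells that receive a call against agreement between the two methods, which is the trade-off such an analysis would need to examine on real data.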

One more technical note: Seurat::HTODemux() fails a lot (not in the example sample, but in others), so we might want to take some time to explore why that is, and whether changing thresholds for DropletUtils::hashedDrops() helps at all. (I have little confidence in Seurat::MULTIseqDemux() and might eliminate it from this comparison.)

jashapiro commented 2 years ago

I've updated this a bunch, mostly to simplify it quite a bit, but also to add some commentary about the comparisons that were done. I expect it still will need a bit more polishing, but it would be nice to have another look.

jashapiro commented 2 years ago

I think I addressed all of the questions raised. I also added a note in the comments about the Rscript; I am not including its results here.

AlexsLemonade / alsf-scpca

Add genetic demultiplexing comparison #165