statgen / popscle

A suite of population-scale analysis tools for single-cell genomics data, including implementations of the Demuxlet / Freemuxlet methods and auxiliary tools
https://github.com/statgen/popscle/wiki
Apache License 2.0

Posterior values always zero or one #66

Open ccrobertson opened 1 year ago

ccrobertson commented 1 year ago

When I run demuxlet, I get the following:

SNG.POSTERIOR is always equal to 1 (even with DROPLET.TYPE==DBL)

BEST.POSTERIOR is always a negative number

From reading other GitHub issues, it looks like the negative values for BEST.POSTERIOR are a bug, and the posterior probability of the best guess is actually PP = e^BEST.POSTERIOR. But even when I apply this transformation, I get:

PP=1 for all barcodes with DROPLET.TYPE==DBL

PP=0 for all barcodes with DROPLET.TYPE==SNG

Has anyone else seen this?
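
For anyone who wants to check their own output, the transformation described above can be sketched as follows. This is only an illustration: the rows are made up (not real demuxlet output), and the tab-separated layout and the BARCODE column name are assumptions about the .best file. Only DROPLET.TYPE and BEST.POSTERIOR are taken from this thread.

```python
import csv
import io
import math
from collections import defaultdict

# Made-up rows for illustration only; these are NOT real demuxlet results.
# Assumption: the .best file is tab-separated and has a BARCODE column.
EXAMPLE_POSTERIOR_ROWS = (
    "BARCODE\tDROPLET.TYPE\tBEST.POSTERIOR\n"
    "AAAC-1\tSNG\t-45.2\n"
    "AAAG-1\tDBL\t0\n"
    "AACT-1\tSNG\t-120.7\n"
)


def summarize_pp(handle):
    """Return {DROPLET.TYPE: (min PP, max PP)}, where PP = exp(BEST.POSTERIOR)."""
    pp_by_type = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        pp = math.exp(float(row["BEST.POSTERIOR"]))
        pp_by_type[row["DROPLET.TYPE"]].append(pp)
    return {dtype: (min(vals), max(vals)) for dtype, vals in pp_by_type.items()}


pp_summary = summarize_pp(io.StringIO(EXAMPLE_POSTERIOR_ROWS))
for droplet_type, (pp_min, pp_max) in sorted(pp_summary.items()):
    print(droplet_type, pp_min, pp_max)
```

To run it on a real file, pass `open("your_output.best", newline="")` instead of the in-memory example (the file name here is hypothetical). If the minimum and maximum per droplet type collapse to 0 or 1, you are seeing the same pattern as described in this issue.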

hyunminkang commented 1 year ago

There is a bug in the current calculation of BEST.POSTERIOR. The bug affects doublets and singlets differently, so it is probably not useful to rely on this value until the bug is fixed. I usually prefer using DIFF.LLK.BEST.NEXT to see how strong the evidence for the current inference is.
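
A minimal sketch of how one might use DIFF.LLK.BEST.NEXT in place of the posterior, assuming that a larger value means the best guess is better separated from the next-best guess. The rows are made up for illustration, the BARCODE column name and tab-separated layout are assumptions, and the cutoff is a hypothetical placeholder; no threshold is recommended in this thread.

```python
import csv
import io

# Made-up rows for illustration only; these are NOT real demuxlet results.
EXAMPLE_DIFF_ROWS = (
    "BARCODE\tDROPLET.TYPE\tDIFF.LLK.BEST.NEXT\n"
    "AAAC-1\tSNG\t35.8\n"
    "AAAG-1\tDBL\t1.2\n"
    "AACT-1\tSNG\t0.4\n"
)

# Hypothetical cutoff; choose one by inspecting your own distribution.
MIN_DIFF_LLK = 2.0


def split_by_evidence(handle, min_diff=MIN_DIFF_LLK):
    """Split barcodes into (confident, weak) lists of (barcode, diff) tuples."""
    confident, weak = [], []
    for row in csv.DictReader(handle, delimiter="\t"):
        diff = float(row["DIFF.LLK.BEST.NEXT"])
        target = confident if diff >= min_diff else weak
        target.append((row["BARCODE"], diff))
    return confident, weak


confident_calls, weak_calls = split_by_evidence(io.StringIO(EXAMPLE_DIFF_ROWS))
print("confident:", confident_calls)
print("weak:", weak_calls)
```

Looking at the distribution of DIFF.LLK.BEST.NEXT separately for each DROPLET.TYPE before settling on a cutoff is probably safer than applying a single fixed number.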

ccrobertson commented 1 year ago

Thanks! That is really helpful.

By chance, do you know if this bug is relevant to the original demuxlet repository (https://github.com/statgen/demuxlet)? I know there are folks in our group using both versions.

tfguinan commented 1 year ago

Hi, we've come across this too (albeit while troubleshooting high doublet rates, and in Apptainer). Has anyone identified a previous commit in either repo that does not have the calculation bug?