statgen / demuxlet

Genetic multiplexing of barcoded single cell RNA-seq
Apache License 2.0

--alpha 0.3 for 3 samples pooled data? #29

Open x811zou opened 6 years ago

x811zou commented 6 years ago

Hi! Your webpage says that for data pooled from 2 samples we should use "--alpha 0 --alpha 0.5" to better estimate doublets. Should we use "--alpha 0 --alpha 0.3" for data pooled from 3 samples?

LuckyMD commented 5 years ago

I am wondering the same for a mixture of 12 samples. I seem to be getting a lot of doublets (only 5% singlets) using default values. Could this have to do with the parameter settings?

hyunminkang commented 5 years ago

I believe the default parameter setting has already been changed. To make sure, use --alpha 0 --alpha 0.5, though I am not sure that will fix the problem.

If you see a lot of doublets, check the following. (1) If you are also seeing a lot of ambiguous (AMB) calls, there might not be a sufficient number of SNP-overlapping reads, for various reasons. (2) If you are seeing too many doublets but not many AMB calls, that is a stranger case, and we need to see whether it is due to parameter settings or to experimental reasons. I can provide more detailed help if you email me (hmkang@umich.edu) with more detailed information.
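To tell the two cases above apart, it can help to tally how many droplets fall into each call type. Below is a minimal Python sketch, assuming the demuxlet output is a tab-separated table whose BEST column starts with SNG-, DBL- or AMB-. The column name, the helper names `tally_calls` and `call_fractions`, and the example rows are illustrative assumptions, so please check them against your own output file.

```python
import csv
import io
from collections import Counter


def tally_calls(handle, column="BEST"):
    """Count call-type prefixes (e.g. SNG, DBL, AMB) in a tab-separated table.

    Assumes each value in `column` looks like "<TYPE>-<details>".
    """
    reader = csv.DictReader(handle, delimiter="\t")
    counts = Counter()
    for row in reader:
        prefix = row[column].split("-", 1)[0]
        counts[prefix] += 1
    return counts


def call_fractions(counts):
    """Convert raw counts into fractions of all droplets."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Made-up rows for illustration only; not real demuxlet output.
EXAMPLE = (
    "BARCODE\tN.SNP\tBEST\n"
    "AAAC-1\t12\tDBL-s1-s2-0.500\n"
    "AAAG-1\t15\tDBL-s2-s3-0.500\n"
    "AACT-1\t3\tAMB-s1-s2-s1/s2\n"
    "AAGT-1\t40\tSNG-s1\n"
)

if __name__ == "__main__":
    counts = tally_calls(io.StringIO(EXAMPLE))
    print(call_fractions(counts))
```

For a real run, pass an open file handle instead of the in-memory example, for instance `open("out.best")`, where the file name is hypothetical. A high AMB fraction would point towards case (1), and a high DBL fraction with few AMB calls towards case (2).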

Thanks, Hyun.

castaway1990 commented 5 years ago

Hi, I am seeing the same behaviour with 6 mixed samples (~95% DBLs with few AMBs), and I still observe a very high doublet percentage (> 85%) in synthetic BAM files created from multiple single-sample sequencing runs. I tried tweaking the "alpha" parameter without success. Any suggestions? Thanks

LuckyMD commented 5 years ago

In my case it was a misinterpretation of the --min-reads or --min-umi parameter, which I had set to 500 (far too high). The threshold applies to the number of reads overlapping SNPs, not to the total number of reads. That is why I had been selecting for doublets.
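A toy Python sketch of why this happens is below. It assumes that doublets carry roughly twice as many reads as singlets and that only a small share of reads overlap SNPs. All numbers, the `Droplet` type and the `doublet_fraction` helper are made up for illustration and are not part of demuxlet.

```python
from typing import NamedTuple


class Droplet(NamedTuple):
    barcode: str
    total_reads: int   # all reads assigned to the barcode
    snp_reads: int     # reads overlapping at least one SNP
    is_doublet: bool   # ground truth, known only in this toy example


# Hypothetical droplets: doublets have about twice the reads of singlets.
DROPLETS = [
    Droplet("bc1", 3000, 150, False),
    Droplet("bc2", 4000, 210, False),
    Droplet("bc3", 5200, 260, False),
    Droplet("bc4", 9000, 520, True),
    Droplet("bc5", 11000, 640, True),
    Droplet("bc6", 700, 30, False),
]


def doublet_fraction(droplets, threshold, field):
    """Return (number kept, doublet fraction) after filtering on `field`."""
    kept = [d for d in droplets if getattr(d, field) >= threshold]
    if not kept:
        return 0, None
    return len(kept), sum(d.is_doublet for d in kept) / len(kept)


if __name__ == "__main__":
    # A cutoff of 500 on total reads keeps every droplet here...
    print(doublet_fraction(DROPLETS, 500, "total_reads"))
    # ...but the same cutoff on SNP-overlapping reads keeps only the doublets.
    print(doublet_fraction(DROPLETS, 500, "snp_reads"))
```

In this made-up data, a cutoff of 500 on SNP-overlapping reads keeps only the two doublets, whereas the same cutoff on total reads keeps all six droplets.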