statgen / popscle

A suite of population-scale analysis tools for single-cell genomics data, including implementations of the Demuxlet / Freemuxlet methods and auxiliary tools
https://github.com/statgen/popscle/wiki
Apache License 2.0

freemuxlet assigns all barcodes to one sample #53

Open jjenny opened 2 years ago

jjenny commented 2 years ago

Hi,

When I try using freemuxlet to demultiplex a mixture of 2 samples, it assigns all the barcodes to one sample. What's the best way to figure out whether this is due to a lack of SNPs, low sequencing depth, or possibly a bug in the way I ran freemuxlet?

Here's an example from the freemuxlet output:

INT_ID BARCODE NUM.SNPS NUM.READS DROPLET.TYPE BEST.GUESS BEST.LLK NEXT.GUESS NEXT.LLK DIFF.LLK.BEST.NEXT BEST.POSTERIOR SNG.POSTERIOR SNG.BEST.GUESS SNG.BEST.LLK SNG.NEXT.GUESS SNG.NEXT.LLK SNG.ONLY.POSTERIOR DBL.BEST.GUESS DBL.BEST.LLK DIFF.LLK.SNG.DBL
0 ACGTAGTCAAAGACTA-1 9845 11351 SNG 0,0 -15584.50 1,0 -18314.79 2730.28 0.00000 1 0 -15584.50 1 -24376.39 1.00000 1,0 -18314.79 2730.28
1 TAACACGCACATTGTG-1 6067 13857 SNG 0,0 -9586.95 1,0 -11263.37 1676.43 0.00000 1 0 -9586.95 1 -14987.84 1.00000 1,0 -11263.37 1676.43
2 ATCCACCAGTCTTGGT-1 6799 7530 SNG 0,0 -10880.45 1,0 -12692.26 1811.82 0.00000 1 0 -10880.45 1 -16931.36 1.00000 1,0 -12692.26 1811.82
3 TCGTCCATCACTTGTT-1 9165 10194 SNG 0,0 -14673.21 1,0 -17100.02 2426.81 0.00000 1 0 -14673.21 1 -22682.44 1.00000 1,0 -17100.02 2426.81
4 AGAGAGCAGATCACTC-1 2250 2538 SNG 0,0 -3629.62 1,0 -4211.94 582.33 0.00000 1 0 -3629.62 1 -5554.07 1.00000 1,0 -4211.94 582.33
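
As a first diagnostic, it may help to tally the assignments rather than read rows by eye. The sketch below parses the five rows quoted above and counts BEST.GUESS and DROPLET.TYPE values, then divides DIFF.LLK.BEST.NEXT by NUM.SNPS to get a rough per-SNP margin. The names `SAMPLE`, `parse`, and `summarize` are illustrative and not part of popscle, and the per-SNP margin is only an ad hoc summary, not a statistic that freemuxlet itself reports.

```python
from collections import Counter

# Header and rows copied from the freemuxlet output quoted in this issue.
SAMPLE = """\
INT_ID BARCODE NUM.SNPS NUM.READS DROPLET.TYPE BEST.GUESS BEST.LLK NEXT.GUESS NEXT.LLK DIFF.LLK.BEST.NEXT BEST.POSTERIOR SNG.POSTERIOR SNG.BEST.GUESS SNG.BEST.LLK SNG.NEXT.GUESS SNG.NEXT.LLK SNG.ONLY.POSTERIOR DBL.BEST.GUESS DBL.BEST.LLK DIFF.LLK.SNG.DBL
0 ACGTAGTCAAAGACTA-1 9845 11351 SNG 0,0 -15584.50 1,0 -18314.79 2730.28 0.00000 1 0 -15584.50 1 -24376.39 1.00000 1,0 -18314.79 2730.28
1 TAACACGCACATTGTG-1 6067 13857 SNG 0,0 -9586.95 1,0 -11263.37 1676.43 0.00000 1 0 -9586.95 1 -14987.84 1.00000 1,0 -11263.37 1676.43
2 ATCCACCAGTCTTGGT-1 6799 7530 SNG 0,0 -10880.45 1,0 -12692.26 1811.82 0.00000 1 0 -10880.45 1 -16931.36 1.00000 1,0 -12692.26 1811.82
3 TCGTCCATCACTTGTT-1 9165 10194 SNG 0,0 -14673.21 1,0 -17100.02 2426.81 0.00000 1 0 -14673.21 1 -22682.44 1.00000 1,0 -17100.02 2426.81
4 AGAGAGCAGATCACTC-1 2250 2538 SNG 0,0 -3629.62 1,0 -4211.94 582.33 0.00000 1 0 -3629.62 1 -5554.07 1.00000 1,0 -4211.94 582.33
"""


def parse(lines):
    """Turn whitespace-delimited lines (header first) into a list of dicts."""
    rows = [line.split() for line in lines if line.strip()]
    header, data = rows[0], rows[1:]
    return [dict(zip(header, fields)) for fields in data]


def summarize(records):
    """Count assignments and compute the best-vs-next margin per SNP."""
    guesses = Counter(r["BEST.GUESS"] for r in records)
    types = Counter(r["DROPLET.TYPE"] for r in records)
    margins = [
        float(r["DIFF.LLK.BEST.NEXT"]) / int(r["NUM.SNPS"]) for r in records
    ]
    return guesses, types, margins


records = parse(SAMPLE.splitlines())
guesses, types, margins = summarize(records)

print("BEST.GUESS counts:", dict(guesses))
print("DROPLET.TYPE counts:", dict(types))
print("margin per SNP:", [round(m, 3) for m in margins])
```

Because `parse` accepts any iterable of lines, the same functions should work on the full output file opened in text mode (for example via `gzip.open(path, "rt")` if the file is gzipped). Every quoted barcode is called a singlet of cluster 0, so a tally over the full file would show whether any barcode at all lands in another cluster.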

hyunminkang commented 2 years ago

It is probably having a hard time finding an anchor sample. What happens if you increase the number of samples to 3, 4, or more?

Hyun Min Kang, Ph.D. Professor of Biostatistics University of Michigan, Ann Arbor Email : @.***
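
One way to try the suggestion above is to rerun freemuxlet at several sample counts and compare the assignments. The sketch below only builds and prints the command lines; it does not run them. It assumes the `--plp`, `--out`, and `--nsample` options as they appear in the popscle README, so check them against `popscle freemuxlet --help` for your build. The pileup and output prefixes are hypothetical placeholders.

```python
import shlex


def freemuxlet_command(plp_prefix, out_prefix, nsample):
    """Build an argument list for one freemuxlet run (not executed here)."""
    return [
        "popscle", "freemuxlet",
        "--plp", plp_prefix,                   # prefix of the dsc-pileup output
        "--out", f"{out_prefix}.n{nsample}",   # separate output per sample count
        "--nsample", str(nsample),
    ]


# Hypothetical prefixes; replace with the paths used in your own run.
commands = [
    freemuxlet_command("pileup/pooled", "freemuxlet/pooled", n)
    for n in (2, 3, 4)
]

for cmd in commands:
    print(shlex.join(cmd))
```

Writing each run to its own output prefix keeps the results side by side, so the per-cluster barcode counts can be compared across sample counts.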

jjenny commented 2 years ago

Unfortunately, I get the same output (everything is assigned to the same cluster).

Also, what is the --init-cluster option for? I can't find an explanation for it in the documentation.

hyunminkang commented 2 years ago

I am not sure if I can provide more assistance with the limited information.

(1) Is it possible that there is only one sample in the pool?
(2) Are you providing the proper allele frequencies matching the population?
(3) Is it possible that the data has a lot of ambient RNAs?
(4) Do you have genotypes by any chance (then you can run demuxlet)?

I think one needs to look at the actual data and try to figure it out.

Thanks, Hyun.
