statgen / demuxlet

Genetic multiplexing of barcoded single cell RNA-seq
Apache License 2.0

turn off doublet detection #11

Closed acwarren closed 6 years ago

acwarren commented 6 years ago

Is there a way to infer sample identity only and turn off doublet detection, to save computational cost in experiments with larger numbers of samples?

yimmieg commented 6 years ago

Hi, is demuxlet slow for you even if you use the --group-list flag?

acwarren commented 6 years ago

I am using the --group-list flag, which is helpful. However, I expect each single cell to come from one of 25 samples, and I would like to test against ~1000 possible samples, which would mean a lot of possible doublet combinations. So the computational cost comes from the larger number of possible samples, not from the number of cells.
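To put rough numbers on this, here is a minimal sketch. It assumes that doublet detection considers every unordered pair of distinct candidate samples; that is an assumption about the size of the search space, not a description of how demuxlet is actually implemented. The function name `hypothesis_counts` is made up for illustration.

```python
from math import comb

def hypothesis_counts(n_samples: int) -> tuple[int, int]:
    """Return (singlet, pairwise-doublet) hypothesis counts for n_samples.

    Assumption: a doublet hypothesis is one unordered pair of two
    distinct candidate samples, so there are C(n, 2) of them.
    """
    singlets = n_samples
    doublets = comb(n_samples, 2)
    return singlets, doublets

for n in (25, 1000):
    singlets, doublets = hypothesis_counts(n)
    print(f"{n} samples: {singlets} singlet, {doublets} pairwise doublet hypotheses")
```

Under that assumption, 25 candidate samples give 300 pairs per cell barcode, while 1000 give 499,500, so the pairwise term grows roughly quadratically while the singlet term grows only linearly.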

yimmieg commented 6 years ago

Got it. We currently don't have a way to turn off doublet detection. Could you provide some motivation for why you want to test against a VCF of 1000 samples? If it's to check the accuracy of demuxlet, you can include the 25 samples in the pool and perhaps an additional set of 25 samples not in the pool.

~jimmie
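One way to assemble such a test panel is sketched below: the pooled samples plus a fixed-size random set of decoys drawn from the remaining samples. The helper name `build_test_panel` and the sample IDs are hypothetical, and how the resulting ID list is then used to subset the VCF is left out on purpose.

```python
import random

def build_test_panel(pool_ids, all_ids, n_decoys=25, seed=0):
    """Return the pooled sample IDs followed by n_decoys non-pooled IDs.

    Decoys are drawn without replacement from samples that are not in
    the pool; a fixed seed keeps the selection reproducible.
    """
    pool = list(pool_ids)
    pool_set = set(pool)
    # Sort the candidates so the draw does not depend on set ordering.
    candidates = sorted(s for s in set(all_ids) if s not in pool_set)
    if n_decoys > len(candidates):
        raise ValueError("not enough non-pooled samples to draw decoys from")
    rng = random.Random(seed)
    decoys = rng.sample(candidates, n_decoys)
    return pool + sorted(decoys)

# Hypothetical IDs: 1000 genotyped samples, the first 25 of which were pooled.
all_ids = [f"sample_{i:04d}" for i in range(1000)]
pool_ids = all_ids[:25]
panel = build_test_panel(pool_ids, all_ids)
print(len(panel))
```

Any cell assigned to a decoy is a known false assignment, which gives a direct if rough check on accuracy while keeping the candidate set at 50 samples.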

acwarren commented 6 years ago

I was interested in doing it for evaluation of accuracy and because there is a possibility that a sample is mislabeled, so cells may actually be from one of the other 1000 samples.

yimmieg commented 6 years ago

Ah, OK. We will look into supporting an option to turn off doublet detection in the future. For now, we have a few higher-priority items (tuning for V2 chemistry) that we must look into first.
