andyrimmer / Platypus

Platypus Variant Caller
GNU General Public License v3.0

No genotype likelihood for multi-allelic loci #6

Open SiyangLiu opened 10 years ago

SiyangLiu commented 10 years ago

Hi Andy, Thanks for developing Platypus. I noticed that no genotype likelihoods are reported for multi-allelic loci, which causes problems when I use Beagle to phase the variants. Is the absence of genotype likelihoods at multi-allelic loci a bug, or a deliberate design choice in Platypus?

Best, Siyang

andyrimmer commented 10 years ago

Hi Siyang,

Thanks for reporting this as an issue on GitHub. It is not a bug; it was a design choice. However, it is something that I want to change.

Kind regards, Andy

SiyangLiu commented 10 years ago

Thank you very much! I am considering removing the multi-allelic loci before running Beagle and moving on, but that would discard some of the alleles. May I ask approximately when you expect to change it? A rough idea would help me plan my work :)

Best, Siyang

andyrimmer commented 10 years ago

Hi Siyang,

It's a bit hard to estimate how long it will take, as I'm quite busy right now. Maybe a couple of weeks, if I have time to work on it. For now I would advise simply filtering those sites, as they should only be a small fraction of calls.

Kind regards, Andy
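
As a rough illustration of the filtering workaround suggested above, the sketch below drops multi-allelic records from VCF text by checking whether the ALT column lists more than one allele. The function names `is_multiallelic` and `drop_multiallelic` are hypothetical helpers, not part of Platypus or Beagle, and the example records are made up for demonstration.

```python
def is_multiallelic(vcf_line):
    """Return True when the ALT column (5th field) lists more than one allele."""
    fields = vcf_line.rstrip("\n").split("\t")
    return len(fields) > 4 and "," in fields[4]


def drop_multiallelic(lines):
    """Yield header lines and biallelic records; skip multi-allelic records."""
    for line in lines:
        if line.startswith("#") or not is_multiallelic(line):
            yield line


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tC\t50\tPASS\t.",
    "1\t200\t.\tAT\tA,ATT\t60\tPASS\t.",
]
kept = list(drop_multiallelic(example))
```

Here `kept` retains the two header lines and the record at position 100, while the record at position 200 (ALT `A,ATT`) is removed. Splitting multi-allelic records into biallelic ones would keep more alleles, but that is a different operation and is not shown here.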

SiyangLiu commented 10 years ago

I see. Thank you very much!

Best, Siyang

SiyangLiu commented 10 years ago

Dear Andy, Have you fixed this problem? I notice that there are a lot of multi-allelic loci in the VCF: about 700K PASS variants. Do you have any idea what the main culprit is?

Best, Siyang

andyrimmer commented 10 years ago

Hi Siyang,

I haven't done this yet. I'm hoping to have time either this week or next week. I'll keep you informed.

The most likely culprits for multi-allelic sites are either indels in repetitive regions (homopolymers or short tandem repeats), or multi-SNP events with several SNPs close together that give different haplotypes in different samples.

Kind regards, Andy
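
To check which of the two causes named above dominates in a given call set, one could tally multi-allelic records by allele length. This is only a coarse heuristic sketch: `classify_multiallelic` is a hypothetical helper, it looks at REF and ALT strings alone, and it cannot tell whether an indel lies in a homopolymer or short tandem repeat (that would need the reference sequence).

```python
from typing import Sequence


def classify_multiallelic(ref: str, alts: Sequence[str]) -> str:
    """Coarsely label a site from its REF and ALT allele strings."""
    if len(alts) < 2:
        return "biallelic"
    # Any ALT whose length differs from REF implies an insertion or deletion.
    if any(len(alt) != len(ref) for alt in alts):
        return "indel"
    # All alleles have the same length: single-base or multi-base substitution.
    if len(ref) == 1:
        return "snp"
    return "multi-snp"


labels = [
    classify_multiallelic("A", ["AT", "ATT"]),  # length change
    classify_multiallelic("AC", ["GC", "AT"]),  # same-length, multi-base
    classify_multiallelic("A", ["C", "G"]),     # same-length, single base
]
```

Counting these labels over all multi-allelic records gives a first impression of whether indels or clustered SNPs account for most of them.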

SiyangLiu commented 10 years ago

Dear Andy, Could you please let me know when you have fixed this? In addition, you mentioned previously that NR and NV in the FORMAT field reflect the exact allele in the read alignment. I think this is why we observe records like this one:

GT:GL:GOF:GQ:NR:NV 1/1:-1.43,-0.42,0.0:0:5:1:0

Although NV is 0, the genotype is 1/1.

Could you please output the number of reads that are realigned to reference allele and the alternative allele in the NR and NV?

Thank you very much in advance!

Best, Siyang
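
The kind of record described above can be found mechanically by pairing the FORMAT keys with the sample values and flagging calls whose genotype carries an alternative allele while NV is zero. This is a minimal sketch; `parse_sample` and `alt_call_without_support` are hypothetical helper names, and treating a comma-separated NV as one count per ALT allele is an assumption.

```python
def parse_sample(format_field, sample_field):
    """Pair colon-separated FORMAT keys with the sample's values."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def alt_call_without_support(call):
    """True when GT contains a non-reference allele but every NV count is 0."""
    alleles = call["GT"].replace("|", "/").split("/")
    carries_alt = any(allele not in ("0", ".") for allele in alleles)
    # Assumption: NV may hold one count per ALT allele, separated by commas.
    no_variant_reads = all(count == "0" for count in call["NV"].split(","))
    return carries_alt and no_variant_reads


call = parse_sample("GT:GL:GOF:GQ:NR:NV", "1/1:-1.43,-0.42,0.0:0:5:1:0")
flagged = alt_call_without_support(call)
```

For the record quoted in the comment, `call["NR"]` is `"1"`, `call["NV"]` is `"0"`, and `flagged` is `True`.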

andyrimmer commented 10 years ago

Hi Siyang,

Sorry for the slow reply. I'm sorry but I'm not going to have time to do this anytime soon. I'm restricting work on Platypus to bugfixes and simple changes at the moment, as I am too busy for anything else.

Kind regards, Andy
