atks / vt

A tool set for short variant discovery in genetic sequence data.
http://genome.sph.umich.edu/wiki/vt
MIT License

vt partition reporting no overlap #15

Closed by cboustred 9 years ago

cboustred commented 9 years ago

Hi,

I'd like to use vt partition to identify the similarities and differences between variants called by different variant callers on the same set of samples.

However, I'm having some issues:

partition v0.5

Options: input VCF file a   platypus_chr22_normal_filt.vcf.gz
         input VCF file b   freebayes_chr22_normal_filt.vcf.gz

A: 663 variants
B: 609 variants

            ts/tv   ins/del
A-B   663   [3.04]  [1.20]
A&B     0   [-nan]  [-nan]
B-A   609   [3.12]  [1.50]

of A   0.0%
of B   0.0%

Time elapsed: 0.02s

That is, vt partition reports zero overlapping variants between the two files.

However, using vcftools I can see that about 565 variants are shared between the two files:

vcf-isec -f platypus_chr22_normal_filt.vcf.gz freebayes_chr22_normal_filt.vcf.gz | grep -v ^# | wc -l
565

Any thoughts as to why vt partition is not working here? It appears to be an issue with the Freebayes VCF, because when I compare a VCF generated with VarScan2 against the Platypus VCF, vt partition gives the expected results:

vt partition varscan_chr22_normal.vcf platypus_chr22_normal_filt.vcf.gz

partition v0.5

Options: input VCF file a   varscan_chr22_normal.vcf
         input VCF file b   platypus_chr22_normal_filt.vcf.gz

A: 673 variants
B: 663 variants

            ts/tv   ins/del
A-B    97   [2.36]  [1.00]
A&B   576   [2.93]  [1.50]
B-A    87   [3.73]  [0.75]

of A   85.6%
of B   86.9%

Time elapsed: 0.02s
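For readers unfamiliar with the report layout, the counts above are related by plain set arithmetic: A-B plus A&B gives the size of A, B-A plus A&B gives the size of B, and "of A" / "of B" are the shared fraction of each file. The following is a minimal Python sketch of that arithmetic only; it is not vt's implementation, and the function name `partition_counts` and the integer placeholder keys are assumptions made for illustration.

```python
def partition_counts(a, b):
    """Split two collections of variant keys into A-B, A&B and B-A.

    Keys are hypothetical placeholders; a real comparison would key on
    something like (contig, position, ref, alt).
    """
    set_a, set_b = set(a), set(b)
    shared = set_a & set_b

    def percent(part, whole):
        # Mirror the "-nan"-style behaviour for an empty input.
        return 100.0 * len(part) / len(whole) if whole else float("nan")

    return {
        "A-B": len(set_a - set_b),
        "A&B": len(shared),
        "B-A": len(set_b - set_a),
        "of A": percent(shared, set_a),
        "of B": percent(shared, set_b),
    }


# Placeholder keys chosen so the set sizes match the report above:
# 673 keys in A, 663 keys in B, 576 of them shared.
report = partition_counts(range(673), range(97, 760))
for name, value in report.items():
    print(name, round(value, 1))
```

With these placeholder inputs the sketch gives 97, 576 and 87 for the three partitions, and 85.6% and 86.9% for the shared fractions, matching the VarScan2 versus Platypus report.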

Running vt partition on the VarScan2 VCF and the Freebayes VCF also returns 0 overlap, which is why I think the issue lies with the Freebayes VCF.

I have also run vt partition comparing the Freebayes VCF to itself, and that returns the expected results:

vt partition freebayes_chr22_normal_filt.vcf.gz freebayes_chr22_normal_filt.vcf.gz

partition v0.5

Options: input VCF file a   freebayes_chr22_normal_filt.vcf.gz
         input VCF file b   freebayes_chr22_normal_filt.vcf.gz

A: 609 variants
B: 609 variants

            ts/tv   ins/del
A-B     0   [-nan]  [-nan]
A&B   609   [3.12]  [1.50]
B-A     0   [-nan]  [-nan]

of A   100.0%
of B   100.0%

Time elapsed: 0.03s

Commands for generating the Freebayes VCF were this:

Any help with this would be much appreciated.

BW

Chris

cboustred commented 9 years ago

bcftools isec platypus_chr22_normal_filt.vcf.gz freebayes_chr22_normal_filt.vcf.gz -p dir

This also works as expected with the Freebayes VCF, reporting ~561 shared variants.

atks commented 9 years ago

The contigs in the headers of the vcf files have to be the same for vt partition to work. Are the headers the same?
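One way to check this before running vt partition is to compare the `##contig` lines of the two headers directly. The sketch below is a hedged illustration in Python, not part of vt: the helper names `read_contigs` and `compare_contigs` are hypothetical, and it assumes "the same" means the same contig IDs in the same order, which the thread does not spell out.

```python
import gzip
import re

# Matches a VCF header line such as ##contig=<ID=22,length=51304566>
CONTIG_ID = re.compile(r"##contig=<(?:.*?,)?ID=([^,>]+)")


def read_contigs(path):
    """Return the contig IDs declared in a VCF header, in file order."""
    # bgzip-compressed VCFs are valid gzip, so gzip.open can read them.
    opener = gzip.open if path.endswith(".gz") else open
    contigs = []
    with opener(path, "rt") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # first record reached, header is finished
            match = CONTIG_ID.match(line)
            if match:
                contigs.append(match.group(1))
    return contigs


def compare_contigs(path_a, path_b):
    """Summarise how the contig declarations of two VCF files differ."""
    a, b = read_contigs(path_a), read_contigs(path_b)
    return {
        "identical": a == b,  # assumption: same IDs in the same order
        "only_in_a": [c for c in a if c not in b],
        "only_in_b": [c for c in b if c not in a],
    }
```

Calling `compare_contigs("platypus_chr22_normal_filt.vcf.gz", "freebayes_chr22_normal_filt.vcf.gz")` on the files from this issue would show whether the two headers declare different contigs; if they do, the header of one file needs to be brought in line with the other before partitioning.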

cboustred commented 9 years ago

Bingo, that did the trick! Thanks very much! Chris

atks commented 9 years ago

Thanks for reporting this, I'll add a check for that later.

atks commented 9 years ago

fixed.