brentp / vcfanno

annotate a VCF with other VCFs/BEDs/tabixed files
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0973-5
MIT License

Cannot annotate on gnomad 2.1 but working with gnomad 2.0 #102

Closed cvlvxi closed 5 years ago

cvlvxi commented 5 years ago

I was just curious whether you have tried annotating against gnomad 2.1 with success?

I've checked that the positions of the variants in my VCF exist in both gnomad versions. Does vcfanno match on chr + position only, or does it take the ID field into account? (The ID field is what I can see differs between the gnomad 2.0 and 2.1 VCFs.)

I tried replacing the ID field with `.` wherever it held an rsid in gnomad 2.1 to see if that made any difference, but my VCF still was not annotated.

See below the difference in results I get when using gnomad 2.0 vs 2.1.

vcfanno.config.old

[[annotation]]
file="gnomad.genomes.r2.0.1.sites.full.vcf.gz"
fields = ["AF_POPMAX", "AC_raw"]
names = ["AF_POPMAX", "AC_raw"]
ops = ["first", "first"]

./vcfanno -p 2 vcfanno.config.old fewvariants.vcf

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  180716_K00164_0158_ML182323_GM12878_MAN-20180711_NEXTERAWGS
chr1    10146   .   AC  A   272.7   .   AC=1;AF=0.500;AN=2;DP=35;FS=4.465;MQ=32.92;MQRankSum=1.200;QD=7.79;ReadPosRankSum=1.083;SOR=0.177;FractionInformativeReads=0.686    GT:AD:DP:GQ:PL:SB   0/1:10,14:24:99:310,0,142:9,1,10,4
chr1    10354   .   C   A   45.9    .   AC=1;AF=0.500;AN=2;DP=17;FS=0.000;MQ=31.68;MQRankSum=0.727;QD=2.70;ReadPosRankSum=0.727;SOR=0.446;FractionInformativeReads=0.235;AF_POPMAX=0.1932;AC_raw=410    GT:AD:DP:GQ:PL:SB   0/1:1,3:4:16:74,0,16:0,1,1,2
chr1    10492   .   C   T   66.8    .   AC=1;AF=0.500;AN=2;DP=34;FS=5.184;MQ=54.48;MQRankSum=-0.905;QD=1.96;ReadPosRankSum=-0.780;SOR=0.116;FractionInformativeReads=0.794;AF_POPMAX=0.264;AC_raw=3568  GT:AD:DP:GQ:PL:SB   0/1:22,5:27:95:95,0,780:4,18,2,3
chr1    10616   .   CCGCCGTTGCAAAGGCGCGCCG  C   369.7   .   AC=2;AF=1.000;AN=2;DP=12;FS=0.000;MQ=42.18;QD=30.81;SOR=4.615;FractionInformativeReads=0.750;AF_POPMAX=0.9597;AC_raw=17596  GT:AD:DP:GQ:PL:SB   1/1:0,9:9:28:407,28,0:0,0,0,9
chr1    13896   .   C   A   50.8    .   AC=1;AF=0.500;AN=2;DP=17;FS=0.000;MQ=23.15;MQRankSum=-0.264;QD=2.99;ReadPosRankSum=0.791;SOR=0.693;FractionInformativeReads=1.000;AF_POPMAX=0.245;AC_raw=5993   GT:AD:DP:GQ:PL:SB   0/1:12,5:17:79:79,0,262:3,9,1,4
chr1    14677   .   G   A   56.0    .   AC=2;AF=1.000;AN=2;DP=4;FS=0.000;MQ=26.54;QD=14.01;SOR=0.693;FractionInformativeReads=1.000;AF_POPMAX=0.0574;AC_raw=1920    GT:AD:DP:GQ:PL:SB   1/1:0,4:4:12:84,12,0:0,0,2,2
chr1    14907   .   A   G   2280.8  .   AC=2;AF=1.000;AN=2;DP=75;FS=0.000;MQ=32.65;MQRankSum=0.702;QD=30.41;ReadPosRankSum=1.124;SOR=2.190;FractionInformativeReads=0.987;AF_POPMAX=0.5176;AC_raw=15324 GT:AD:DP:GQ:PL:SB   1/1:1,73:74:99:2309,193,0:0,1,55,18
chr1    14930   .   A   G   2465.8  .   AC=2;AF=1.000;AN=2;DP=82;FS=0.000;MQ=33.02;QD=30.07;SOR=1.824;FractionInformativeReads=0.976;AF_POPMAX=0.5272;AC_raw=15274  GT:AD:DP:GQ:PL:SB   1/1:0,80:80:99:2494,240,0:0,0,56,24
chr1    15118   .   A   G   751.8   .   AC=1;AF=0.500;AN=2;DP=39;FS=0.000;MQ=28.49;MQRankSum=-1.318;QD=19.28;ReadPosRankSum=0.452;SOR=0.653;FractionInformativeReads=0.974;AF_POPMAX=0.4873;AC_raw=13595    GT:AD:DP:GQ:PL:SB   0/1:7,31:38:99:780,0,103:3,4,14,17
chr1    15211   .   T   G   2304.8  .   AC=2;AF=1.000;AN=2;DP=75;FS=0.000;MQ=28.14;QD=30.73;SOR=0.774;FractionInformativeReads=1.000;AF_POPMAX=0.7133,0;AC_raw=19044,1  GT:AD:DP:GQ:PL:SB   1/1:0,75:75:99:2333,225,0:0,0,39,36
chr1    15274   .   A   T   1054.8  .   AC=2;AF=1.000;AN=2;DP=40;FS=0.000;MQ=25.77;QD=26.37;SOR=0.693;FractionInformativeReads=1.000;AF_POPMAX=0.7184,0.5234,0.00010764;AC_raw=18483,12003,1    GT:AD:DP:GQ:PL:SB   1/1:0,40:40:99:1083,119,0:0,0,20,20
chr1    16068   .   T   C   85.0    .   AC=2;AF=1.000;AN=2;DP=4;FS=0.000;MQ=23.55;QD=21.26;SOR=3.258;FractionInformativeReads=1.000;AF_POPMAX=0.5654;AC_raw=14153   GT:AD:DP:GQ:PL:SB   1/1:0,4:4:12:113,12,0:0,0,0,4

Versus running it with the later version of gnomad (2.1):

vcfanno.config

[[annotation]]
file="../assets/gnomad.genomes.r2.1.sites.vcf.bgz"
fields = ["AF_popmax", "AC_raw"]
names = ["AF_popmax", "AC_raw"]
ops = ["first", "first"]

./vcfanno -p 2 vcfanno.config fewvariants.vcf

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  180716_K00164_0158_ML182323_GM12878_MAN-20180711_NEXTERAWGS
chr1    10146   .   AC  A   272.7   .   AC=1;AF=0.500;AN=2;DP=35;FS=4.465;MQ=32.92;MQRankSum=1.200;QD=7.79;ReadPosRankSum=1.083;SOR=0.177;FractionInformativeReads=0.686    GT:AD:DP:GQ:PL:SB   0/1:10,14:24:99:310,0,142:9,1,10,4
chr1    10354   .   C   A   45.9    .   AC=1;AF=0.500;AN=2;DP=17;FS=0.000;MQ=31.68;MQRankSum=0.727;QD=2.70;ReadPosRankSum=0.727;SOR=0.446;FractionInformativeReads=0.235    GT:AD:DP:GQ:PL:SB   0/1:1,3:4:16:74,0,16:0,1,1,2
chr1    10492   .   C   T   66.8    .   AC=1;AF=0.500;AN=2;DP=34;FS=5.184;MQ=54.48;MQRankSum=-0.905;QD=1.96;ReadPosRankSum=-0.780;SOR=0.116;FractionInformativeReads=0.794  GT:AD:DP:GQ:PL:SB   0/1:22,5:27:95:95,0,780:4,18,2,3
chr1    10616   .   CCGCCGTTGCAAAGGCGCGCCG  C   369.7   .   AC=2;AF=1.000;AN=2;DP=12;FS=0.000;MQ=42.18;QD=30.81;SOR=4.615;FractionInformativeReads=0.750    GT:AD:DP:GQ:PL:SB   1/1:0,9:9:28:407,28,0:0,0,0,9
chr1    13896   .   C   A   50.8    .   AC=1;AF=0.500;AN=2;DP=17;FS=0.000;MQ=23.15;MQRankSum=-0.264;QD=2.99;ReadPosRankSum=0.791;SOR=0.693;FractionInformativeReads=1.000   GT:AD:DP:GQ:PL:SB   0/1:12,5:17:79:79,0,262:3,9,1,4
chr1    14677   .   G   A   56.0    .   AC=2;AF=1.000;AN=2;DP=4;FS=0.000;MQ=26.54;QD=14.01;SOR=0.693;FractionInformativeReads=1.000 GT:AD:DP:GQ:PL:SB   1/1:0,4:4:12:84,12,0:0,0,2,2
chr1    14907   .   A   G   2280.8  .   AC=2;AF=1.000;AN=2;DP=75;FS=0.000;MQ=32.65;MQRankSum=0.702;QD=30.41;ReadPosRankSum=1.124;SOR=2.190;FractionInformativeReads=0.987   GT:AD:DP:GQ:PL:SB   1/1:1,73:74:99:2309,193,0:0,1,55,18
chr1    14930   .   A   G   2465.8  .   AC=2;AF=1.000;AN=2;DP=82;FS=0.000;MQ=33.02;QD=30.07;SOR=1.824;FractionInformativeReads=0.976    GT:AD:DP:GQ:PL:SB   1/1:0,80:80:99:2494,240,0:0,0,56,24
chr1    15118   .   A   G   751.8   .   AC=1;AF=0.500;AN=2;DP=39;FS=0.000;MQ=28.49;MQRankSum=-1.318;QD=19.28;ReadPosRankSum=0.452;SOR=0.653;FractionInformativeReads=0.974  GT:AD:DP:GQ:PL:SB   0/1:7,31:38:99:780,0,103:3,4,14,17
chr1    15211   .   T   G   2304.8  .   AC=2;AF=1.000;AN=2;DP=75;FS=0.000;MQ=28.14;QD=30.73;SOR=0.774;FractionInformativeReads=1.000    GT:AD:DP:GQ:PL:SB   1/1:0,75:75:99:2333,225,0:0,0,39,36
chr1    15274   .   A   T   1054.8  .   AC=2;AF=1.000;AN=2;DP=40;FS=0.000;MQ=25.77;QD=26.37;SOR=0.693;FractionInformativeReads=1.000    GT:AD:DP:GQ:PL:SB   1/1:0,40:40:99:1083,119,0:0,0,20,20
chr1    16068   .   T   C   85.0    .   AC=2;AF=1.000;AN=2;DP=4;FS=0.000;MQ=23.55;QD=21.26;SOR=3.258;FractionInformativeReads=1.000 GT:AD:DP:GQ:PL:SB   1/1:0,4:4:12:113,12,0:0,0,0,4

brentp commented 5 years ago

If you change the extension from .bgz to .gz, it should work. I should add a check for this, but haven't done so yet.
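
A minimal sketch of that workaround, for anyone who would rather not rename or copy the downloaded file: symlink it under a `.gz` name instead. `link_bgz_as_gz` is a hypothetical helper, not part of vcfanno, and it assumes the tabix (`.tbi`) or CSI (`.csi`) index sits next to the `.bgz` file so it can be linked under the matching name.

```shell
#!/bin/sh
# Hypothetical helper (not part of vcfanno): expose a .bgz file under a .gz
# name via symlinks, linking any adjacent .tbi/.csi index alongside it.
link_bgz_as_gz() {
    src=$1
    case "$src" in
        *.bgz) ;;
        *) echo "expected a .bgz file: $src" >&2; return 1 ;;
    esac
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }

    # Same directory and base name, with .bgz swapped for .gz.
    dest="${src%.bgz}.gz"
    # Use an absolute target so the symlink resolves from any working directory.
    abs="$(cd "$(dirname "$src")" && pwd)/$(basename "$src")"

    ln -sf "$abs" "$dest"
    for ext in tbi csi; do
        if [ -f "$src.$ext" ]; then
            ln -sf "$abs.$ext" "$dest.$ext"
        fi
    done

    # Print the new path so it can be pasted into the vcfanno config.
    printf '%s\n' "$dest"
}
```

Running `link_bgz_as_gz ../assets/gnomad.genomes.r2.1.sites.vcf.bgz` prints the `.gz` path, which would then go in the `file=` line of vcfanno.config.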

cvlvxi commented 5 years ago

Yep, that worked, thanks.

brentp commented 5 years ago

I have made a fix for this that will be out in the next release.