WGLab / doc-ANNOVAR

Documentation for the ANNOVAR software
http://annovar.openbioinformatics.org

Annotated VCF output botched for ChrM and ChrY #10

Closed: sxv closed this issue 8 years ago

sxv commented 8 years ago

Hi,

I've been using ANNOVAR with great results for a long time now; thank you for the great work. I recently decided to switch from the multianno.txt output format to the more cross-compatible multianno.vcf. However, I've run into a formatting issue in the output for lines beginning with "chrM" and "chrY".

See example below:

Input line: chrM 73 . G A 185.36 PASS AB=0;ABP=0;AC=1;AF=1;AN=1;AO=7;CIGAR=1X;DP=7;DPB=7;DPRA=0;EFF=INTERGENIC(MODIFIER||||||||||A);EPP=3.32051;EPPR=0;FS=0;GC=57;GTI=0;HRun=0;HaplotypeScore=0;LEN=1;MEANALT=1;MQ=60;MQ0=0;MQM=60;MQMR=0;NS=1;NUMALT=1;ODDS=42.6808;PAIRED=1;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=229;QD=26.48;QR=0;RO=0;RPL=6;RPP=10.7656;RPPR=0;RPR=1;RUN=1;SAF=4;SAP=3.32051;SAR=3;SRF=0;SRP=0;SRR=0;TYPE=snp;technology.illumina=1 GT:AO:DP:GQ:PL:QA:QR:RO 1:7:7:99:209,0:229:0:0

Incorrectly formatted output *multianno.vcf line: 185.36 PASS AB=0;ABP=0;AC=1;AF=1;AN=1;AO=7;CIGAR=1X;DP=7;DPB=7;DPRA=0;EFF=INTERGENIC(MODIFIER||||||||||A);EPP=3.32051;EPPR=0;FS=0;GC=57;GTI=0;HRun=0;HaplotypeScore=0;LEN=1;MEANALT=1;MQ=60;MQ0=0;MQM=60;MQMR=0;NS=1;NUMALT=1;ODDS=42.6808;PAIRED=1;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=229;QD=26.48;QR=0;RO=0;RPL=6;RPP=10.7656;RPPR=0;RPR=1;RUN=1;SAF=4;SAP=3.32051;SAR=3;SRF=0;SRP=0;SRR=0;TYPE=snp;technology.illumina=1 GT:AO:DP:GQ:PL:QA:QR:RO 1:7:7:99:209,0:229:0:0 ;ANNOVAR_DATE=2014-07-22;Func.refGene=;Gene.refGene=;GeneDetail.refGene=;ExonicFunc.refGene=;AAChange.refGene=;Func.ensGene=;Gene.ensGene=;GeneDetail.ensGene=;ExonicFunc.ensGene=;AAChange.ensGene=;clinvar_20150330=;PopFreqMax=;1000G_ALL=;1000G_AFR=;1000G_AMR=;1000G_EAS=;1000G_EUR=;1000G_SAS=;ExAC_ALL=;ExAC_AFR=;ExAC_AMR=;ExAC_EAS=;ExAC_FIN=;ExAC_NFE=;ExAC_OTH=;ExAC_SAS=;ESP6500siv2_ALL=;ESP6500siv2_AA=;ESP6500siv2_EA=;CG46=;cosmic70=;snp129=;snp132=;snp138=;avsift=;ALLELE_END

Note: the *multianno.txt file is also malformed but in a different way: chrM . 185.36 chrM 73 . G A 185.36 PASS AB=0;ABP=0;AC=1;AF=1;AN=1;AO=7;CIGAR=1X;DP=7;DPB=7;DPRA=0;EFF=INTERGENIC(MODIFIER||||||||||A);EPP=3.32051;EPPR=0;FS=0;GC=57;GTI=0;HRun=0;HaplotypeScore=0;LEN=1;MEANALT=1;MQ=60;MQ0=0;MQM=60;MQMR=0;NS=1;NUMALT=1;ODDS=42.6808;PAIRED=1;PAIREDR=0;PAO=0;PQA=0;PQR=0;PRO=0;QA=229;QD=26.48;QR=0;RO=0;RPL=6;RPP=10.7656;RPPR=0;RPR=1;RUN=1;SAF=4;SAP=3.32051;SAR=3;SRF=0;SRP=0;SRR=0;TYPE=snp;technology.illumina=1 GT:AO:DP:GQ:PL:QA:QR:RO 1:7:7:99:209,0:229:0:0

I am manually skipping these lines for now, but it would be helpful to figure out the root of this problem. Let me know if you have any ideas, and I'll keep troubleshooting as well.

Thanks.

kaichop commented 8 years ago

Can you please provide the exact details, such as the command lines that you used? We need to reproduce the results using your command line first.

sxv commented 8 years ago

Cannot reproduce -- closing since this could have been a problem with the input file.
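
Since the issue was closed on the suspicion that the input file was at fault, a quick sanity check of the input VCF before annotation may help anyone who hits similar symptoms. The following is a minimal sketch in Python, not part of ANNOVAR: the function name `find_malformed_vcf_lines` is hypothetical, the checks are only the basic layout rules of a VCF data line (at least 8 tab-separated columns, no empty columns, an integer POS, no stray carriage returns), and the sample lines are shortened or invented for illustration. It does not claim to identify the actual cause of the problem reported above.

```python
import io

MIN_COLUMNS = 8  # CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO


def find_malformed_vcf_lines(handle):
    """Yield (line_number, reason) for data lines that break the basic VCF layout.

    Hypothetical helper for illustration; it is not an ANNOVAR function.
    """
    for number, raw in enumerate(handle, start=1):
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        if "\r" in line:
            yield number, "contains a carriage return"
            line = line.replace("\r", "")
        fields = line.split("\t")
        if len(fields) < MIN_COLUMNS:
            # Typical when columns are separated by spaces instead of tabs.
            yield number, f"only {len(fields)} tab-separated columns"
            continue
        if "" in fields:
            yield number, "empty column (consecutive tabs)"
        if not fields[1].isdecimal():
            yield number, f"POS is not an integer: {fields[1]!r}"


# Illustrative input: the chrM line is shortened from the report above,
# and the space-separated chrY line is invented to trigger a warning.
sample = io.StringIO(
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "chrM\t73\t.\tG\tA\t185.36\tPASS\tDP=7\n"
    "chrY 100 . C T 50 PASS DP=3\n"
)
problems = list(find_malformed_vcf_lines(sample))
for number, reason in problems:
    print(f"line {number}: {reason}")
```

To run the same check on a real file, open it with `open(path, newline="")` so that Python does not silently translate Windows line endings before the carriage-return check sees them.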