zhengxwen / SeqArray

Data management of large-scale whole-genome sequence variant calls (Development version only)
http://www.bioconductor.org/packages/SeqArray
error while loading vcf file #4

Closed: chisqr closed this issue 8 years ago

chisqr commented 9 years ago

Hello,

I used GATK to analyze a gene panel for several subjects. I used snpEff for annotation and then GATK's VariantAnnotator to make the final VCF file. Now, when I try to make a GDS file out of it, I get the following error:

seqVCF2GDS(vcf.fname, gds.fname, verbose = T);
The Variant Call Format (VCF) header:
file format: VCFv4.1
the number of sets of chromosomes (ploidy): 2
Parsing "gatk_snpeff.vcf" ...
Error: FILE: gatk_snpeff.vcf LINE: 8376, COLUMN: 8, AC=1;AF=2.451e-03;AN=408;BaseQRankSum=-6.976;ClippingRankSum=0.108;DP=18196;FS=4.116;InbreedingCoeff=-0.0025;MLEAC=1;MLEAF=2.451e-03;MQ=70.05;MQ0=0;MQRankSum=-0.208;QD=11.48;ReadPosRankSum=-1.759;SNPEFF_AMINO_ACID_CHANGE=p.Asp389Asp/c.1167C>T;SNPEFF_CODON_CHANGE=gaC/gaT;SNPEFF_EFFECT=SYNONYMOUS_CODING;SNPEFF_EXON_ID=12;SNPEFF_FUNCTIONAL_CLASS=SILENT;SNPEFF_GENE_BIOTYPE=protein_coding;SNPEFF_GENE_NAME=DIAPH1;SNPEFF_IMPACT=LOW;SNPEFF_TRANSCRIPT_ID=ENST00000253811;VQSLOD=0.987;culprit=FS;set=variant359
FILE: gatk_snpeff.vcf LINE: 8376, COLUMN: 8, AC=1;AF=2.451e-03;AN=408;BaseQRankSum=-6.976;ClippingRankSum=0.108;DP=18196;FS=4.116;InbreedingCoeff=-0.0025;MLEAC=1;MLEAF=2.451e-03;MQ=70.05;MQ0=0;MQRankSum=-0.208;QD=11.48;ReadPosRankSum=-1.759;SNPEFF_A

Can someone please point out what I am doing wrong here? Thanks!

Regards, Rahul

zhengxwen commented 9 years ago

Could you please show me the actual error message, such as "Invalid INFO Type." or "Unknown INFO ID: %s, should be defined ahead"?

It is hard to identify the problem from your post. If you don't mind, please send me the VCF file or a part of it.

andykwok commented 8 years ago

Same error here, and there is no error message like 'Invalid INFO Type' ...

zhengxwen commented 8 years ago

Please send me a part of your VCF file if possible. Otherwise, please use the latest version.

andykwok commented 8 years ago

I already sent an email to your Gmail address several days ago. Can you check it?

zhengxwen commented 8 years ago

I did not get your email. Please resend it to zhengx@uw.edu

zhengxwen commented 8 years ago

Thanks for your test VCF file. There are duplicated INFO IDs on line 405 of your file:

Warning messages:
1: LINE: 405, ignore duplicated INFO ID (ANNOVAR_DATE).
2: LINE: 405, ignore duplicated INFO ID (Func.refGene).
3: LINE: 405, ignore duplicated INFO ID (Gene.refGene).
4: LINE: 405, ignore duplicated INFO ID (GeneDetail.refGene).
5: LINE: 405, ignore duplicated INFO ID (ExonicFunc.refGene).
6: LINE: 405, ignore duplicated INFO ID (AAChange.refGene).

SeqArray_1.9.17 should solve this problem.
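For anyone who wants to check a VCF file for this condition before converting it, the duplicated-INFO-ID diagnosis above can be sketched as a small standalone script. This is only an illustration, not part of SeqArray: the function names `duplicated_info_ids` and `scan_vcf_lines` are hypothetical, and the sketch assumes an uncompressed, tab-delimited VCF in which INFO is column 8 and its entries are separated by `;`.

```python
from collections import Counter
from typing import Dict, Iterable, List


def duplicated_info_ids(info: str) -> List[str]:
    """Return INFO keys that occur more than once in one INFO column.

    Keys are reported once each, in the order they first appear.
    """
    if info in ("", "."):  # "." marks an empty INFO column in VCF
        return []
    # Each entry is either KEY=VALUE or a bare flag such as DB.
    keys = [field.split("=", 1)[0] for field in info.split(";") if field]
    counts = Counter(keys)
    duplicated: List[str] = []
    for key in keys:
        if counts[key] > 1 and key not in duplicated:
            duplicated.append(key)
    return duplicated


def scan_vcf_lines(lines: Iterable[str]) -> Dict[int, List[str]]:
    """Map 1-based line numbers to the duplicated INFO keys on that line."""
    report: Dict[int, List[str]] = {}
    for lineno, line in enumerate(lines, start=1):
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:  # not a complete variant record
            continue
        dups = duplicated_info_ids(cols[7])  # INFO is the 8th column
        if dups:
            report[lineno] = dups
    return report
```

Passing an open file handle, for example `scan_vcf_lines(open("gatk_snpeff.vcf"))`, returns a dictionary keyed by line number, so an entry for line 405 listing `Gene.refGene` would correspond to the warnings shown above. Compressed (.gz) input is not handled in this sketch.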

andykwok commented 8 years ago

thanks