compbiocore / VariantVisualization.jl

Julia package powering VIVA, our tool for visualization of genomic variation data. Manual:
https://compbiocore.github.io/VariantVisualization.jl/stable/

LoadError: GeneticVariation.VCF.Reader file format #98

Open nmhawkey opened 3 years ago

nmhawkey commented 3 years ago

I ran the following command:

viva -f result2.vcf -o output/directory/

Then I receive this error:

Welcome to VIVA.

Loading dependency packages:

┌ Warning: ORCA.jl has been deprecated and all savefig functionality
│ has been implemented directly in PlotlyBase itself.
│
│ By implementing in PlotlyBase.jl, the savefig routines are automatically
│ available to PlotlyJS.jl also.
└ @ ORCA ~/.julia/packages/ORCA/U5XaN/src/ORCA.jl:8
...

Finished loading packages!

Reading result2.vcf ...

No filters applied. Large vcf files will take a long time to process and heatmap visualizations will lose resolution at this scale unless viewed in interactive html for zooming.

Loading VCF file into memory for visualization
ERROR: LoadError: GeneticVariation.VCF.Reader file format error on line 51 ~>"++_NR=.;"
Stacktrace:
 [1] error(::Type{T} where T, ::String, ::Int64, ::String, ::String) at ./error.jl:42
 [2] _read!(::GeneticVariation.VCF.Reader, ::BioCore.Ragel.State{BufferedStreams.BufferedInputStream{IOStream}}, ::GeneticVariation.VCF.Record) at /Users/nathan/.julia/packages/BioCore/YBJvb/src/ReaderHelper.jl:164
 [3] read!(::GeneticVariation.VCF.Reader, ::GeneticVariation.VCF.Record) at /Users/nathan/.julia/packages/BioCore/YBJvb/src/ReaderHelper.jl:134
 [4] tryread!(::GeneticVariation.VCF.Reader, ::GeneticVariation.VCF.Record) at /Users/nathan/.julia/packages/BioCore/YBJvb/src/IO.jl:73
 [5] iterate at /Users/nathan/.julia/packages/BioCore/YBJvb/src/IO.jl:84 [inlined] (repeats 2 times)
 [6] top-level scope at /usr/local/bin/viva:237
 [7] include(::Function, ::Module, ::String) at ./Base.jl:380
 [8] include(::Module, ::String) at ./Base.jl:368
 [9] exec_options(::Base.JLOptions) at ./client.jl:296
 [10] _start() at ./client.jl:506
in expression starting at /usr/local/bin/viva:229

Line 51: chr1 874570 . C A 209 Pass AF1=0.1339286;ALLELE_ORIGIN=@;AN=2;CLINICAL_SIGNIFICANCE=@;Category=2;DP=210;DP4=38,59,4,9;EFF=INTRON(MODIFIER||||SAMD11|mRNA|CODING|NM_152486|);GLOBAL_MAF=@;GTS=A/C,A/C;Gene_Description=879961;HGVS=(NC_000001.10:g.874570C>A,NM_152486.2:c.520+61C>A,);MQ=40;Observation=AMBIGUOUS;PUBMED_CITATIONS=0;SEL_PRIMARY_EFF=0;STDP4=38,59,4,9;Zygosity=LowAF;dbnsfp1000Gp1_AF=.;dbnsfp1000Gp1_AFR_AF=.;dbnsfp1000Gp1_AMR_AF=.;dbnsfp1000Gp1_ASN_AF=.;dbnsfp1000Gp1_EUR_AF=.;dbnsfp29way_logOdds=.;dbnsfpESP5400_AA_AF=.;dbnsfpEnsembl_transcriptid=.;dbnsfpGERP++_NR=.;dbnsfpGERP++_RS=.;dbnsfpInterpro_domain=.;dbnsfpSIFT_score=.;dbnsfpUniprot_acc=.;pValue=1.0E-209 GT:GQ:PL 1/1:10:0,255,255 1/1:10:0,255,255

gtollefson commented 3 years ago

Hi @nmhawkey,

Thanks for bringing this issue up. It looks like that line contains formatting the reader does not expect, most likely the '+' characters in the INFO key flagged in the error message ("++_NR=.;"). The fastest workaround is to delete the offending line and write a new "cleaned" file to visualize, using a sed command like the one below:

sed '51d' result2.vcf > result2.vcf_cleaned.vcf

This looks like an issue with the GeneticVariation.jl package, which VIVA depends on for reading the VCF file. If VIVA still fails on the cleaned file with the offending line removed, I would open an issue with them so the reader can be corrected.
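
If other records carry the same annotation keys, deleting line 51 alone will probably just move the error to the next such record. As an alternative, here is a minimal Python sketch that rewrites INFO keys instead of dropping lines. It assumes the parser is rejecting the '+' in keys such as dbnsfpGERP++_NR (the VCF 4.3 specification restricts INFO keys to letters, digits, underscores, and periods) and that the file is tab-delimited. The function names (sanitize_key, sanitize_info, sanitize_line, clean_vcf) and the choice of '_' as the replacement character are my own, not part of VIVA or GeneticVariation.jl.

```python
import re

# Assumption: only characters outside [0-9A-Za-z_.] in INFO keys upset the reader.
_INVALID_KEY_CHARS = re.compile(r"[^0-9A-Za-z_.]")
# Matches the ID of a "##INFO=<ID=...," header line so it can be renamed too.
_HEADER_ID = re.compile(r"^(##INFO=<ID=)([^,>]+)")


def sanitize_key(key: str) -> str:
    """Replace every disallowed character in an INFO key with '_'."""
    return _INVALID_KEY_CHARS.sub("_", key)


def sanitize_info(info: str) -> str:
    """Sanitize the keys of a ';'-separated INFO column, leaving values untouched."""
    cleaned = []
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")  # flags have no '=' and no value
        cleaned.append(sanitize_key(key) + sep + value)
    return ";".join(cleaned)


def sanitize_line(line: str) -> str:
    """Sanitize one VCF line: INFO header IDs and the INFO column of records."""
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    if body.startswith("##INFO=<ID="):
        body = _HEADER_ID.sub(
            lambda m: m.group(1) + sanitize_key(m.group(2)), body, count=1
        )
    elif not body.startswith("#"):
        fields = body.split("\t")
        if len(fields) >= 8:  # INFO is the 8th tab-separated column
            fields[7] = sanitize_info(fields[7])
            body = "\t".join(fields)
    return body + ending


def clean_vcf(src: str, dst: str) -> None:
    """Copy src to dst, sanitizing INFO keys line by line."""
    with open(src, encoding="utf-8") as fin, open(dst, "w", encoding="utf-8") as fout:
        for line in fin:
            fout.write(sanitize_line(line))
```

Calling clean_vcf("result2.vcf", "result2_cleaned.vcf") and then pointing viva -f at the new file should show whether the '+' characters were the only problem. The renamed keys no longer match the original annotation names, so keep the original file around.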

Let me know how it goes and if you have any more issues!

George