compbiocore / VariantVisualization.jl

Julia package powering VIVA, our tool for visualization of genomic variation data. Manual:
https://compbiocore.github.io/VariantVisualization.jl/stable/

ERROR: LoadError: GeneticVariation.VCF.Reader #103

Open ionicbond2005 opened 3 years ago

ionicbond2005 commented 3 years ago

Dear Sir/Madam:

I was trying to run viva to visualize the read depth of a set of variants whose read depth is over 1000, to see whether those variants are clustered. However, when I ran the program with the command line

viva -f <> -m read_depth -o <> -t Read-depth-heat-map-for-bump --save_remotely

I got the error message below:

[ Info: This will take a few moments...
ERROR: LoadError: GeneticVariation.VCF.Reader file format error on line 37
Stacktrace:
 [1] error(::String, ::Int64) at ./error.jl:42
 [2] _readheader!(::GeneticVariation.VCF.Reader, ::BioCore.Ragel.State{BufferedStreams.BufferedInputStream{IOStream}}) at /sysapps/cluster/software/Julia/1.5.3-linux-x86_64/local/share/julia/packages/BioCore/YBJvb/src/ReaderHelper.jl:106
 [3] readheader!(::GeneticVariation.VCF.Reader) at /sysapps/cluster/software/Julia/1.5.3-linux-x86_64/local/share/julia/packages/BioCore/YBJvb/src/ReaderHelper.jl:80
 [4] Reader at /sysapps/cluster/software/Julia/1.5.3-linux-x86_64/local/share/julia/packages/GeneticVariation/r8DAL/src/vcf/reader.jl:15 [inlined]
 [5] GeneticVariation.VCF.Reader(::IOStream) at /sysapps/cluster/software/Julia/1.5.3-linux-x86_64/local/share/julia/packages/GeneticVariation/r8DAL/src/vcf/reader.jl:28
 [6] top-level scope at /sysapps/cluster/software/Julia/1.5.3-linux-x86_64/local/share/julia/packages/VariantVisualization/1yoNl/viva:130
 [7] include(::Function, ::Module, ::String) at ./Base.jl:380
 [8] include(::Module, ::String) at ./Base.jl:368
 [9] exec_options(::Base.JLOptions) at ./client.jl:296
 [10] _start() at ./client.jl:506
in expression starting at /sysapps/cluster/software/Julia/1.5.3-linux-x86_64/local/share/julia/packages/VariantVisualization/1yoNl/viva:130

Welcome to VIVA.

Loading dependency packages:

...

Finished loading packages!

Reading /hpcdata/pid/list/MS_Mike_Lenardo/2019-05-30_Novogene/VariantCallingGATK/test-selected.vcf ...

===============================================================

Since it reports a file format error on line 37, I pulled out lines 36-37 of the VCF:

head -37 /hpcdata/pid/list/MS_Mike_Lenardo/2019-05-30_Novogene/VariantCallingGATK/test-selected.vcf |tail -2

##GATKCommandLine=<ID=VariantFiltration,CommandLine="VariantFiltration --output /hpcdata/pid/list/MS_Mike_Lenardo/2019-05-30_Novogene/VariantCallingGATK/test-variant-filtered.vcf.gz --filter-expression DP < 5000 --filter-expression DP > 20000 --filter-name Not_highest_DP --filter-name Not_lowest_DP --variant /hpcdata/pid/list/MS_Mike_Lenardo/2019-05-30_Novogene/VariantCallingGATK/Novogene/Novogene.vqsr.seqr.vcf.gz --reference /hpcdata/pid/andrew/data/human_g1k_v37.fasta --cluster-size 3 --cluster-window-size 0 --mask-extension 0 --mask-name Mask --filter-not-in-mask false --missing-values-evaluate-as-failing false --invalidate-previous-filters false --invert-filter-expression false --invert-genotype-filter-expression false --set-filtered-genotype-to-no-call false --apply-allele-specific-filters false --interval-set-rule UNION --interval-padding 0 --interval-exclusion-padding 0 --interval-merging-rule ALL --read-validation-stringency SILENT --seconds-between-progress-updates 10.0 --disable-sequence-dictionary-validation false --create-output-bam-index true --create-output-bam-md5 false --create-output-variant-index true --create-output-variant-md5 false --lenient false --add-output-sam-program-record true --add-output-vcf-command-line true --cloud-prefetch-buffer 40 --cloud-index-prefetch-buffer -1 --disable-bam-index-caching false --sites-only-vcf-output false --help false --version false --showHidden false --verbosity INFO --QUIET false --use-jdk-deflater false --use-jdk-inflater false --gcs-max-retries 20 --gcs-project-for-requester-pays --disable-tool-default-read-filters false",Version="4.2.0.0",Date="April 29, 2021 4:15:29 PM EDT">

##GVCFBlock0-1=minGQ=0(inclusive),maxGQ=1(exclusive)
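When matching a reader error to a header line, it can help to print the lines together with their line numbers, so it is unambiguous which one the reader counts as line 37. A minimal sketch, assuming a POSIX shell and awk are available; `show_lines` is a hypothetical helper name and `input.vcf` is a placeholder path:

```shell
# show_lines FILE START END
# Print lines START..END of FILE, each prefixed with its line number and a tab.
# (show_lines is an illustrative helper, not part of VIVA.)
show_lines() {
    awk -v s="$2" -v e="$3" '
        NR >= s && NR <= e { printf "%d\t%s\n", NR, $0 }
        NR > e             { exit }   # stop reading once past the range
    ' "$1"
}

# Example usage (placeholder path):
# show_lines input.vcf 36 37
```

Because the line number is printed next to the content, this avoids off-by-one confusion when combining `head` and `tail`.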

===========================================

Any idea why the error occurs?

Thank you very much for your help

Samuel Li

gtollefson commented 3 years ago

@ionicbond2005 Hi Samuel, thanks for bringing this issue up. It looks like there is unexpected formatting (probably an unexpected special symbol) on that line. The fastest way to get around this is to remove the line and write a new "cleaned" file to visualize, with a sed command like the one below. Note that it does not use -i: that option edits the input file in place and sends nothing to the redirected output file.

sed '37d' /hpcdata/pid/list/MS_Mike_Lenardo/2019-05-30_Novogene/VariantCallingGATK/test-selected.vcf > /hpcdata/pid/list/MS_Mike_Lenardo/2019-05-30_Novogene/VariantCallingGATK/test-selected_cleaned.vcf

This looks like an issue with the GeneticVariation.jl package, which VIVA depends on to read the VCF file. If the cleaned file (with the offending line removed) still does not run, I would open an issue with them so it can be corrected.
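An alternative to deleting by line number is to filter on the line's prefix, which keeps working if the header shifts or if several similar lines are present. A minimal sketch, assuming the offending header lines all begin with `##GVCFBlock`; the helper name `drop_header_lines` and the short file names are illustrative, and the original file is left untouched:

```shell
# drop_header_lines IN OUT PREFIX
# Copy IN to OUT, omitting every line that starts with the literal string PREFIX.
# (drop_header_lines is an illustrative helper, not part of VIVA.)
drop_header_lines() {
    # index($0, p) returns 1 only when the line begins with p
    awk -v p="$3" 'index($0, p) != 1' "$1" > "$2"
}

# Example usage (placeholder file names):
# drop_header_lines test-selected.vcf test-selected_cleaned.vcf '##GVCFBlock'
```

Whether the reader accepts the remaining header lines is not something this sketch can guarantee; it only removes the lines matching the given prefix.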

Let me know how it goes and if you have any more issues!

George

ionicbond2005 commented 3 years ago

Dear George:

After removing the ##GVCFBlock line, I reran the program and another problem occurred:

LoadError: BoundsError: attempt to access 1-element Array{SubString{String},1} at index [2]

Any idea on how to fix this problem? Thank you very much

Samuel Li
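The thread does not identify the cause of this BoundsError. One possibility, stated here only as an assumption, is that some line was split into a single piece where at least two were expected, for example a record or column-header line that is not tab-delimited. A minimal diagnostic sketch under that assumption, using awk; `check_columns` is a hypothetical helper name:

```shell
# check_columns FILE
# Report records whose tab-separated field count differs from the #CHROM line.
# Returns non-zero if any problem is found.
# (check_columns is an illustrative helper, not part of VIVA.)
check_columns() {
    awk -F'\t' '
        /^##/ { next }                       # skip meta-information lines
        /^#CHROM/ {
            expected = NF                    # column header defines the expected width
            if (NF < 8) {                    # VCF has 8 fixed columns
                printf "line %d: column header has only %d tab-separated fields\n", NR, NF
                bad = 1
            }
            next
        }
        expected && NF != expected {
            printf "line %d: %d fields, expected %d\n", NR, NF, expected
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$1"
}

# Example usage (placeholder path):
# check_columns test-selected_cleaned.vcf
```

If this reports nothing, the delimiter hypothesis is probably wrong and the full stack trace would be needed to narrow the problem down.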