brentp / echtvar

using all the bits for echt rapid variant annotation and filtering
https://academic.oup.com/nar/advance-article/doi/10.1093/nar/gkac931/6775383
MIT License

Missing Number= with gnomad.v3.1.2.echtvar.v2.zip #43

Closed fellen31 closed 4 days ago

fellen31 commented 2 months ago

Hi,

Thanks for providing gnomad.v3.1.2.echtvar.v2.zip. However, I'm having trouble with some downstream tools that complain about the header of the annotated VCF: the Number= field in the INFO lines added from this file is empty.

Command run:

echtvar anno -e gnomad.v3.1.2.echtvar.v2.zip -e CoLoRSdb.GRCh38.v1.0.0.deepvariant.glnexus.zip 7.bed.bcf 7.bed_echtvar_anno.bcf.gz

Missing Number=number

##INFO=<ID=gnomad_ac,Number=,Type=Integer,Description="added by echtvar from gnomad.v3.1.2.echtvar.v2.zip">
##INFO=<ID=gnomad_an,Number=,Type=Integer,Description="added by echtvar from gnomad.v3.1.2.echtvar.v2.zip">
##INFO=<ID=gnomad_nhomalt,Number=,Type=Integer,Description="added by echtvar from gnomad.v3.1.2.echtvar.v2.zip">
##INFO=<ID=gnomad_af,Number=,Type=Float,Description="added by echtvar from gnomad.v3.1.2.echtvar.v2.zip">
##INFO=<ID=gnomad_popmax_ac,Number=,Type=Integer,Description="added by echtvar from gnomad.v3.1.2.echtvar.v2.zip">
##INFO=<ID=gnomad_popmax_an,Number=,Type=Integer,Description="added by echtvar from gnomad.v3.1.2.echtvar.v2.zip">
##INFO=<ID=gnomad_popmax_nhomalt,Number=,Type=Integer,Description="added by echtvar from gnomad.v3.1.2.echtvar.v2.zip">
##INFO=<ID=gnomad_popmax_af,Number=,Type=Float,Description="added by echtvar from gnomad.v3.1.2.echtvar.v2.zip">
##INFO=<ID=gnomad_controls_and_biobanks_af,Number=,Type=Float,Description="added by echtvar from gnomad.v3.1.2.echtvar.v2.zip">
##INFO=<ID=gnomad_controls_and_biobanks_nhomalt,Number=,Type=Integer,Description="added by echtvar from gnomad.v3.1.2.echtvar.v2.zip">
##INFO=<ID=gnomad_filter,Number=,Type=String,Description="added by echtvar from gnomad.v3.1.2.echtvar.v2.zip">

This is not the case for a database I have encoded myself:

##INFO=<ID=colorsdb_af,Number=1,Type=Float,Description="added by echtvar Allele Frequency estimate for each alternate allele">
##INFO=<ID=colorsdb_ac,Number=1,Type=Integer,Description="added by echtvar Allele count in genotypes">

Any idea what is up with gnomad.v3.1.2.echtvar.v2.zip?
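
As a stopgap until the archive is rebuilt, the empty fields can be found and filled in the header text. The following is a minimal Python sketch, not part of echtvar: the function names are hypothetical, and filling in Number=1 is an assumption based on the colorsdb fields above, which all carry Number=1. Check that each field really holds a single value before relying on it.

```python
import re

# Matches an INFO header line whose Number field is empty ("Number=,").
# Group 1 is everything up to and including "Number=", group 2 is the ID.
EMPTY_NUMBER = re.compile(r"^(##INFO=<ID=([^,]+),Number=),")


def find_empty_number_ids(header_lines):
    """Return the IDs of INFO lines whose Number= field is empty."""
    ids = []
    for line in header_lines:
        match = EMPTY_NUMBER.match(line)
        if match:
            ids.append(match.group(2))
    return ids


def patch_info_line(line, default_number="1"):
    """Fill an empty Number= field; every other line is returned unchanged.

    default_number="1" is an assumption, not something echtvar guarantees.
    """
    return EMPTY_NUMBER.sub(lambda m: f"{m.group(1)}{default_number},", line)


def patch_header(header_lines, default_number="1"):
    """Apply patch_info_line to every header line."""
    return [patch_info_line(line, default_number) for line in header_lines]
```

One possible way to use this, assuming bcftools is available, is to dump the header with bcftools view -h, run the lines through patch_header, and write the result back with bcftools reheader -h.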

PS: a None seems to have sneaked into the recorded command line as well:

##echtvar_annoCommand=anno -i None 20.bed.bcf 20.bed_echtvar_anno.bcf.gz -e "gnomad.v3.1.2.echtvar.v2.zip -e CoLoRSdb.GRCh38.v1.0.0.deepvariant.glnexus.zip"

brentp commented 2 weeks ago

hi, sorry for the delay, can you show the json file that you used to create the problematic zip file?

fellen31 commented 2 weeks ago

Hi, no worries. The problematic zip file seems to be the pre-made file for hg38 provided with the releases.

I could remake the file using your json (https://github.com/brentp/echtvar/blob/main/examples/gnomad.v3.1.2.json) and use the resulting file without problems. Is the file provided under releases perhaps using an older version of echtvar?

brentp commented 1 week ago

hi, my reply seems to have been lost... so you're saying it works if you remake with a recent version of echtvar? If so, I'll see if I can get a new version of the zips uploaded somewhere.

fellen31 commented 1 week ago

Yes! It seems like the later version of echtvar produces a valid VCF. Would be great!

sitems commented 1 week ago

Same problem here. echtvar 0.2.0 with the gnomad.v3.1.2.echtvar.v2.zip that you provided creates some problematic lines in the VCF header, which causes downstream applications to crash.

INFO=

fellen31 commented 3 days ago

Thanks! 🙌