hartwigmedical / hmftools

Various algorithms for analysing genomics data
GNU General Public License v3.0

gnomAD annotation - PAVE (1.4.2) #421

Closed: anaduba closed this issue 1 year ago

anaduba commented 1 year ago

Hello,

I am running PAVE (1.4.2) to annotate a VCF file of SNVs, all of them on chrX. I am also trying to annotate population frequencies using gnomAD, but it's not working. When I use the argument -gnomad_freq_file with the file gnomad_variants_chrX_v38.csv.gz (obtained from the HMF DNA pipeline resources, hmf_dna_pipeline_resources.38_v5.32), the following error is shown:

Pave version: 1.4.2
Exception in thread "main" java.lang.NumberFormatException: For input string: "G"

Also, if I use the argument -gnomad_load_chr_on_demand with the same gnomAD file, PAVE neither annotates gnomAD frequencies nor adds any gnomAD information to the VCF header, so I'm not sure it's reading the file.

How should I run PAVE with gnomad files? Thank you

charlesshale commented 1 year ago

Unzip that file first: gnomad_variants_chrX_v38.csv.gz

Then call it with this config: -gnomad_freq_dir /path_to_gnomad_files/ -gnomad_load_chr_on_demand

I'll update the read-me to make that clearer.
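
For anyone scripting this, the two steps above (decompress the per-chromosome files, then point PAVE at the directory) could be sketched roughly as follows. This is only an illustration, not part of PAVE: the helper names `decompress_gnomad_files` and `gnomad_args` are made up here, the glob pattern simply follows the file name quoted in this thread, and the assumption that the decompressed file should keep its name minus `.gz` is just what a plain gunzip would produce. Only the two flags come from the reply above; the other PAVE arguments are left out.

```python
import gzip
import shutil
from pathlib import Path


def decompress_gnomad_files(src_dir, dest_dir):
    """Decompress every gnomad_variants_*.csv.gz in src_dir into dest_dir.

    Hypothetical helper: the output name is the input name without ".gz",
    mirroring what gunzip would produce.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    written = []
    for gz_path in sorted(Path(src_dir).glob("gnomad_variants_*.csv.gz")):
        out_path = dest / gz_path.name[: -len(".gz")]
        # Stream the decompressed bytes to disk without loading the whole file.
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        written.append(out_path)
    return written


def gnomad_args(freq_dir):
    """Return the gnomAD-related arguments suggested in this thread.

    Hypothetical helper: the trailing slash mirrors the example path in the
    reply; whether PAVE requires it is not confirmed here.
    """
    return ["-gnomad_freq_dir", f"{Path(freq_dir)}/", "-gnomad_load_chr_on_demand"]
```

The list returned by `gnomad_args("/path_to_gnomad_files")` would then be appended to the rest of your usual PAVE command line.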

anaduba commented 1 year ago

Solved! Thank you