cgpu / vcf2maf-nf

Single-process Nextflow component for converting VCF to MAF with vcf2maf.pl

Resolve issue with hg19 #1

Open cgpu opened 5 years ago

cgpu commented 5 years ago

See here: https://github.com/mskcc/vcf2maf/issues/95

Currently, when an hg19 reference FASTA is used, VEP complains about a mismatch between the cache assembly (GRCh37) and the selected assembly (hg19):

ERROR ~ Error executing process > 'Vcf2maf (H46126_CIN3_vs_H06530_Normal.vcf)'

Caused by:
  Process `Vcf2maf (H46126_CIN3_vs_H06530_Normal.vcf)` terminated with an error exit status (2)

Command executed:

  perl /opt/vcf2maf/vcf2maf.pl     --input-vcf H46126_CIN3_vs_H06530_Normal.vcf     --output-maf maf      --tumor-id H46126     --normal-id H06530     --ref-fasta  hg19_IonTorrent.fasta     --ncbi-build hg19     --filter-vcf /vepdata/ExAC_nonTCGA.r0.3.1.sites.vep.vcf.gz     --vep-path /opt/variant_effect_predictor_89/ensembl-tools-release-89/scripts/variant_effect_predictor     --vep-data /vepdata/     --vep-forks 2     --buffer-size 200     --species homo_sapiens         --cache-version 89

Command exit status:
  2

Command output:
  (empty)

Command error:
  [fai_load] build FASTA index.
  STATUS: Running VEP and writing to: ./H46126_CIN3_vs_H06530_Normal.vep.vcf
  ERROR: Cache assembly version (GRCh37) and database or selected assembly version (hg19) do not match

  If using human GRCh37 add "--port 3337" to use the GRCh37 database, or --offline to avoid database connection entirely

  ERROR: Failed to run the VEP annotator! Command: /usr/bin/perl /opt/variant_effect_predictor_89/ensembl-tools-release-89/scripts/variant_effect_predictor/variant_effect_predictor.pl --species homo_sapiens --assembly hg19 --offline --no_progress --no_stats --buffer_size 200 --sift b --ccds --uniprot --hgvs --symbol --numbers --domains --gene_phenotype --canonical --protein --biotype --uniprot --tsl --pubmed --variant_class --shift_hgvs 1 --check_existing --total_length --allele_number --no_escape --xref_refseq --failed 1 --vcf --minimal --flag_pick_allele --pick_order canonical,tsl,biotype,rank,ccds,length --dir /vepdata/ --fasta hg19_IonTorrent.fasta --format vcf --input_file H46126_CIN3_vs_H06530_Normal.vcf --output_file ./H46126_CIN3_vs_H06530_Normal.vep.vcf --fork 2 --check_allele --cache_version 89 --polyphen b --gmaf --maf_1kg --maf_esp --regulatory
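
The failed command shows that the `--ncbi-build hg19` value reaches VEP as `--assembly hg19`, while the cache is labeled GRCh37. One possible workaround, untested in this pipeline, is to translate UCSC-style build names to Ensembl assembly names before building the vcf2maf command. The sketch below is only an illustration: the `vep_assembly` helper and the mapping table are hypothetical and not part of vcf2maf or VEP.

```python
# Hypothetical helper (not part of vcf2maf or VEP): translate UCSC-style
# build labels into the assembly names that Ensembl/VEP caches use.
UCSC_TO_ENSEMBL = {
    "hg19": "GRCh37",
    "hg38": "GRCh38",
}


def vep_assembly(build: str) -> str:
    """Return the Ensembl assembly name for a build label.

    Labels that are not in the table (for example "GRCh37" itself)
    are returned unchanged, apart from surrounding whitespace.
    """
    label = build.strip()
    return UCSC_TO_ENSEMBL.get(label.lower(), label)


if __name__ == "__main__":
    for name in ("hg19", "GRCh37", "hg38"):
        print(name, "->", vep_assembly(name))
```

The returned value could then be passed to `--ncbi-build` instead of the raw `hg19` label. This only addresses the assembly label check; it does not reconcile contig naming between the FASTA and the cache.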

For the GRCh37 -> hg19 mapping, check this VEP option: --synonyms [file]
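
As I understand the VEP documentation, `--synonyms` expects a tab-delimited file with the primary identifier in column 1 and its synonym in column 2. A minimal sketch for generating such a file for the primary human chromosomes follows. The file name `chr_synonyms.txt` and the function names are assumptions chosen for illustration, and whether the VEP 89 install used here accepts the option has not been checked.

```python
# Sketch: write a chromosome synonyms table (Ensembl name <TAB> UCSC name).
# Function names and the output file name are illustrative assumptions.

def synonym_rows():
    """Yield (ensembl_name, ucsc_name) pairs for the primary chromosomes."""
    for name in [str(i) for i in range(1, 23)] + ["X", "Y"]:
        yield name, "chr" + name
    # Naming convenience only: the mitochondrial sequences in UCSC hg19 and
    # GRCh37 are, as far as I know, not identical.
    yield "MT", "chrM"


def write_synonyms(path):
    """Write one tab-delimited pair per line to the given path."""
    with open(path, "w") as handle:
        for primary, synonym in synonym_rows():
            handle.write(f"{primary}\t{synonym}\n")


if __name__ == "__main__":
    write_synonyms("chr_synonyms.txt")
```

The resulting file could be supplied to VEP via `--synonyms chr_synonyms.txt`. vcf2maf.pl builds the VEP command itself, so getting the option through may require changes there.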

Also check this GATK article on GRCh37 vs hg19: https://software.broadinstitute.org/gatk/documentation/article?id=23390

cgpu commented 5 years ago

But look at this:

https://www.cureffi.org/2012/09/19/exome-sequencing-pipeline-using-gatk/

hg19 is the same as GRCh37. However, GRCh37 keeps receiving assembly patches from the GRC, so while the main chromosomes may not change, some of the alternative haplotypes might not always be identical.