mskcc / vcf2maf

Convert a VCF into a MAF, where each variant is annotated to only one of all possible gene isoforms

HGVSc + HGVSp + HGVSp_Short not available after vcf2maf from an annotated VCF from sarek with VEP v108 #346

Open ChristianRohde opened 1 year ago

ChristianRohde commented 1 year ago

Hi,

I am running vcf2maf installed from conda. I skip VEP because I already annotated the VCF with VEP through sarek (https://nf-co.re/sarek/3.2.1/usage), which uses VEP version 108. The problem is that the HGVSc, HGVSp, and HGVSp_Short columns are empty (NA), and I am looking for a solution. Running VEP from vcf2maf would be an option. I can install VEP from conda as well, which downloads VEP version 109, but vcf2maf then fails at the VEP step. It also seems to me that, even when VEP is launched from vcf2maf, I am effectively running a VEP that is independent of vcf2maf. I could then run vcf2maf with --inhibit-vep. Could you clarify which versions of VEP can be used with which versions of vcf2maf?

Best, Christian
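One way to narrow this down is to check whether the pre-annotated VCF declares HGVSc and HGVSp in its CSQ header at all, since VEP lists the CSQ sub-fields in the `##INFO=<ID=CSQ,...>` line after `Format:`. The sketch below is only a diagnostic under that assumption; the helper names `csq_fields` and `missing_hgvs` and the shortened example header are hypothetical and not taken from the issue.

```python
import re

def read_header(path):
    """Collect the '##' meta lines of an uncompressed VCF file."""
    header = []
    with open(path) as handle:
        for line in handle:
            if not line.startswith("##"):
                break
            header.append(line.rstrip("\n"))
    return header

def csq_fields(header_lines):
    """Return the sub-fields declared in the CSQ INFO header, or [] if absent."""
    for line in header_lines:
        if line.startswith("##INFO=<ID=CSQ,"):
            match = re.search(r'Format: ([^">]+)', line)
            if match:
                return match.group(1).strip().split("|")
    return []

def missing_hgvs(header_lines, wanted=("HGVSc", "HGVSp")):
    """List the wanted CSQ sub-fields that the header does not declare."""
    fields = csq_fields(header_lines)
    return [name for name in wanted if name not in fields]

# Illustrative, shortened header (an assumption, not the real sarek output)
example_header = [
    "##fileformat=VCFv4.2",
    '##INFO=<ID=CSQ,Number=.,Type=String,Description="Consequence annotations '
    'from Ensembl VEP. Format: Allele|Consequence|IMPACT|SYMBOL|Gene">',
]
print(missing_hgvs(example_header))
```

If both names are reported as missing, the upstream VEP run probably did not emit HGVS notation, which would be consistent with empty columns in the MAF. If they are present, the cause likely lies elsewhere.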

ChristianRohde commented 1 year ago

I understand your warning: "In standard operation, vcf2maf runs VEP with very specific parameters to make sure everyone produces comparable MAFs". I installed VEP from conda together with vcf2maf, and it is available in bash once I load the environment (which I do). Still, the script requires --vep-path and tries to execute VEP from the command line with perl. At least it uses the environment's perl, see below. The problem may be minor, but it still fails. This is the error when I try to run vcf2maf with VEP:

Unknown option: af_esp
ERROR: Failed to parse command-line flags

ERROR: Failed to run the VEP annotator! Command: /path2/conda/envs/VCF2MAF/bin/perl /path2/conda/envs/VCF2MAF/bin/vep --species homo_sapiens --assembly GRCh38 --no_progress --no_stats --buffer_size 5000 --sift b --ccds --uniprot --hgvs --symbol --numbers --domains --gene_phenotype --canonical --protein --biotype --uniprot --tsl --variant_class --shift_hgvs 1 --check_existing --total_length --allele_number --no_escape --xref_refseq --failed 1 --vcf --flag_pick_allele --pick_order canonical,tsl,biotype,rank,ccds,length --dir /path2/HOME/.vep --fasta Homo_sapiens_assembly38.fasta --format vcf --input_file ./results/variant_calling/freebayes/P1/P1.freebayes.vcf --output_file ./results/variant_calling/freebayes/P1/P1.freebayes.vep.vcf --offline --pubmed --fork 4 --polyphen b --af --af_1kg --af_esp --af_gnomad --regulatory

Best, Christian
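The log above shows that the installed VEP rejects one of the flags that vcf2maf passes to it. As a diagnostic only, the following sketch pulls the rejected option names out of the error text and removes them from a copy of the command, so the remaining command could be tried by hand. The function names are hypothetical, the command is abbreviated from the one above, and this is not how vcf2maf itself handles the situation.

```python
import re
import shlex

def unknown_options(stderr_text):
    """Extract option names from 'Unknown option: NAME' lines."""
    return re.findall(r"Unknown option: (\S+)", stderr_text)

def drop_flags(command, names):
    """Remove '--NAME' switches from a command line.

    Only safe for boolean switches such as --af_esp; a flag that takes
    a value would leave its value behind.
    """
    flags = {"--" + name for name in names}
    return [arg for arg in shlex.split(command) if arg not in flags]

stderr_text = "Unknown option: af_esp\nERROR: Failed to parse command-line flags"
# Abbreviated version of the failing command, for illustration
command = "vep --offline --fork 4 --af --af_1kg --af_esp --af_gnomad --regulatory"

rejected = unknown_options(stderr_text)
cleaned = drop_flags(command, rejected)
print(rejected)
print(" ".join(cleaned))
```

Running the cleaned command separately and then calling vcf2maf with --inhibit-vep on the result is one possible workaround, though the resulting MAF may differ from one produced with vcf2maf's standard parameters.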

tamuanand commented 1 year ago

Hi

See if these 2 links will help (I am not sure)

FriederikeHanssen commented 12 months ago

Hey @ChristianRohde! I just ran into a similar issue and found that you can retain much of this information by specifying your own keys. I used sarek with strelka, mutect2, and VEP together with dbnsfp for annotation (not sure if that is relevant, and I had no time to test other options). I then added:

--retain-info HGVSp_VEP,HGVSc_VEP --retain-fmt HGVSp_VEP,HGVSc_VEP --retain-ann HGVSp_VEP,HGVSc_VEP

to keep some of these keys. I am unsure whether I really need to specify all three, but this worked. I then imported the result into maftools to plot various things, and it worked as expected.
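Before loading the MAF into a downstream tool, it can help to confirm that the HGVS columns are actually populated. The sketch below counts non-empty values per column in tab-separated MAF text; the function name `count_filled`, the default column list, and the two example rows are assumptions for illustration, and retained keys such as HGVSp_VEP can be passed in through `columns`.

```python
import csv

def count_filled(maf_text, columns=("HGVSc", "HGVSp", "HGVSp_Short")):
    """Count rows with a real value (not empty, 'NA', or '.') per column."""
    # Skip comment lines such as the leading '#version' line
    lines = [line for line in maf_text.splitlines() if not line.startswith("#")]
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    counts = {column: 0 for column in columns}
    for row in reader:
        for column in columns:
            value = (row.get(column) or "").strip()
            if value and value not in ("NA", "."):
                counts[column] += 1
    return counts

# Illustrative two-row MAF (made-up rows, not data from this thread)
example_maf = (
    "#version 2.4\n"
    "Hugo_Symbol\tHGVSc\tHGVSp\tHGVSp_Short\n"
    "TP53\tc.524G>A\tp.Arg175His\tp.R175H\n"
    "KRAS\t\t\t\n"
)
print(count_filled(example_maf))
```

For a real file, the text can be read with `open(path).read()` and passed to the same function.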