mskcc / vcf2maf

Convert a VCF into a MAF, where each variant is annotated to only one of all possible gene isoforms

Unrecognized biotype/effect #349

Open hosseinvk opened 1 year ago

hosseinvk commented 1 year ago

Hi,

I ran vcf2maf with VEP 108. The runs complete, but they print a large number of warnings like the following:

WARNING: Unrecognized biotype "protein_coding_CDS_not_defined". Assigning lowest priority!
WARNING: Unrecognized effect "splice_polypyrimidine_tract_variant". Assigning lowest priority!

Is this a concern, given that the VCFs were already annotated with VEP? Thanks for any advice.

gianfilippo commented 1 year ago

Hi,

Same issue here. I updated VEP and its cache to 107, and I am getting lots of these warnings:

WARNING: Unrecognized biotype "protein_coding_LoF". Assigning lowest priority!
WARNING: Unrecognized effect "splice_polypyrimidine_tract_variant". Assigning lowest priority!

Can you please advise? Thanks.

Teezi commented 11 months ago

Hi guys, I'm also experiencing the same problem (please see the warning content below), but I'm happy to find that it's not a major issue: the newer version of Ensembl (v110 in my case) contains new biotypes/effects that are not yet included in vcf2maf.

To silence those warnings, my suggestion is to add the unrecognized biotypes/effects to the source code under the %biotype_priority or %effectPriority sections. For example, looking up protein_coding_CDS_not_defined shows that it replaces the "processed_transcript" transcript biotype in protein_coding genes, so we can give protein_coding_CDS_not_defined a priority similar to processed_transcript.
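To illustrate the idea, here is a rough Python sketch of a priority table with a lowest-priority fallback. vcf2maf itself is written in Perl, so this is not a patch. The table name mirrors %biotype_priority, but the numeric values, LOWEST_PRIORITY, get_priority and add_alias are all hypothetical and only show how a new biotype can inherit the priority of an existing one.

```python
# Illustrative only: the numbers below are placeholders, NOT the real
# priorities used inside vcf2maf.
biotype_priority = {
    "protein_coding": 1,
    "processed_transcript": 5,
}

# Hypothetical stand-in for "Assigning lowest priority!"
LOWEST_PRIORITY = 99


def get_priority(biotype, table):
    """Return the priority of a biotype, or the lowest one if it is unknown."""
    return table.get(biotype, LOWEST_PRIORITY)


def add_alias(table, new_name, existing_name):
    """Give new_name the same priority as an entry already in the table."""
    table[new_name] = table[existing_name]


# Before the edit, the new biotype falls through to the lowest priority.
before = get_priority("protein_coding_CDS_not_defined", biotype_priority)

# After the edit, it is ranked like processed_transcript.
add_alias(biotype_priority, "protein_coding_CDS_not_defined", "processed_transcript")
after = get_priority("protein_coding_CDS_not_defined", biotype_priority)

print(before, after)
```

The same approach would apply to effects such as splice_polypyrimidine_tract_variant, once you have decided which existing effect each one should be ranked next to.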

For the corresponding meanings of biotypes and effects, please refer here:

  1. Gene/Transcript Biotypes in GENCODE & Ensembl,
  2. Ensembl Variation - Calculated variant consequences

Hope it helps, Cheers!

---------------------- Here is the problem ------------------
I used Ensembl VEP cache version 110 and got 90+ lines of warnings, which can be summarised into the following types:

WARNING: Unrecognized biotype "protein_coding_CDS_not_defined". Assigning lowest priority!
WARNING: Unrecognized biotype "protein_coding_LoF". Assigning lowest priority!
WARNING: Unrecognized effect "splice_donor_5th_base_variant". Assigning lowest priority!
WARNING: Unrecognized effect "splice_donor_region_variant". Assigning lowest priority!
WARNING: Unrecognized effect "splice_polypyrimidine_tract_variant". Assigning lowest priority!
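
If you want to build this kind of summary from a long log yourself, a small Python sketch along these lines should work. It assumes the warnings look exactly like the ones quoted in this thread; summarise_warnings and the sample list are made up for illustration and are not part of vcf2maf.

```python
import re
from collections import Counter

# Matches the warning text quoted in this thread and captures the kind
# (biotype or effect) and the unrecognized name.
WARNING_RE = re.compile(r'WARNING: Unrecognized (biotype|effect) "([^"]+)"')


def summarise_warnings(lines):
    """Count how often each unrecognized biotype/effect appears."""
    counts = Counter()
    for line in lines:
        # findall copes with several warnings ending up on one line.
        for kind, name in WARNING_RE.findall(line):
            counts[(kind, name)] += 1
    return counts


# Hypothetical sample; in practice, read the lines from your saved log.
sample = [
    'WARNING: Unrecognized biotype "protein_coding_LoF". Assigning lowest priority!',
    'WARNING: Unrecognized effect "splice_donor_region_variant". Assigning lowest priority!',
    'WARNING: Unrecognized biotype "protein_coding_LoF". Assigning lowest priority!',
    'some unrelated line',
]

summary = summarise_warnings(sample)
for (kind, name), n in sorted(summary.items()):
    print(f"{kind}\t{name}\t{n}")
```

The resulting list of unique names is what you would then look up in the Ensembl documentation before deciding where each one belongs in the priority tables.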