openvax / varcode

Library for manipulating genomic variants and predicting their effects
Apache License 2.0

Provide mutation info with StartLosses #206

Open joaoe opened 7 years ago

joaoe commented 7 years ago

Both VEP and ANNOVAR mark start losses, but they still include the information about the changes in the coding sequence and amino acids. For instance, this is the JSON from VEP for the variant chr10 58224929 A G:

{
"allele_string": "A/G",
"assembly_name": "GRCm38",
"end": 58224929,
"id": "10_58224929_A/G",
"input": "chr10\t58224929\t.\tA\tG\t4567\t.",
"most_severe_consequence": "start_lost",
"seq_region_name": 10,
"start": 58224929,
"strand": 1,
"transcript_consequences": [{
  "amino_acids": "M/T",
  "cdna_end": 2,
  "cdna_start": 2,
  "cds_end": 2,
  "cds_start": 2,
  "codons": "aTg/aCg",
  "consequence_terms": ["start_lost"],
  "gene_id": "ENSMUSG00000058537",
  "impact": "HIGH",
  "protein_end": 1,
  "protein_start": 1,
  "source": "Ensembl",
  "strand": -1,
  "transcript_id": "ENSMUST00000204003",
  "variant_allele": "G"
}]
}
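
To make the point concrete, here is a minimal sketch of how the amino acid and codon change can be read out of one entry of `transcript_consequences`. The field names come from the VEP output quoted above; the helper names `split_pair` and `summarize_start_loss` are hypothetical and exist only for this illustration.

```python
import json

# A trimmed copy of the transcript consequence quoted above.
consequence_json = """
{
  "amino_acids": "M/T",
  "codons": "aTg/aCg",
  "consequence_terms": ["start_lost"],
  "protein_start": 1,
  "protein_end": 1,
  "transcript_id": "ENSMUST00000204003"
}
"""


def split_pair(value):
    """Split a 'ref/alt' string such as 'M/T' into a (ref, alt) tuple."""
    ref, sep, alt = value.partition("/")
    if not sep:
        raise ValueError("expected 'ref/alt', got %r" % value)
    return ref, alt


def summarize_start_loss(consequence):
    """Return the amino acid and codon change for a start_lost consequence.

    Returns None when the consequence is not a start loss.
    """
    if "start_lost" not in consequence.get("consequence_terms", []):
        return None
    aa_ref, aa_alt = split_pair(consequence["amino_acids"])
    codon_ref, codon_alt = split_pair(consequence["codons"])
    return {
        "transcript_id": consequence["transcript_id"],
        "protein_position": consequence["protein_start"],
        "aa_ref": aa_ref,
        "aa_alt": aa_alt,
        "codon_ref": codon_ref,
        "codon_alt": codon_alt,
    }


summary = summarize_start_loss(json.loads(consequence_json))
print(summary)
```

So the start loss reported by VEP still carries the M to T change at protein position 1, which is the information being asked for here.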

It would perhaps be nice to annotate these as regular annotations (Substitution, Insertion, Deletion) and then wrap them with a StartLoss, exposing the wrapped annotation in a property called alternate_effect, the same way it happens for ExonicSpliceSite. Or just include aa_ref and aa_alt (which is not that useful if the annotation has other consequences, like being a Frameshift or PrematureStop).
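
A rough sketch of what the wrapping proposal could look like is below. These are simplified stand-in classes written for this comment, not varcode's actual effect classes; the field `aa_mutation_start_offset` and the description strings are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Substitution:
    """Stand-in for a regular amino acid substitution annotation."""
    aa_mutation_start_offset: int  # 0-based position in the protein
    aa_ref: str
    aa_alt: str

    @property
    def short_description(self):
        # Report the position 1-based, e.g. p.M1T
        return "p.%s%d%s" % (
            self.aa_ref, self.aa_mutation_start_offset + 1, self.aa_alt)


@dataclass(frozen=True)
class StartLoss:
    """Stand-in start loss that wraps the underlying coding change."""
    alternate_effect: Optional[Substitution] = None

    @property
    def short_description(self):
        return "p.M1?"

    @property
    def aa_ref(self):
        # Delegate to the wrapped effect; None if nothing was wrapped.
        return getattr(self.alternate_effect, "aa_ref", None)

    @property
    def aa_alt(self):
        return getattr(self.alternate_effect, "aa_alt", None)


# The M/T change from the VEP example above, wrapped in a StartLoss.
effect = StartLoss(alternate_effect=Substitution(0, "M", "T"))
print(effect.short_description)                   # the start loss itself
print(effect.alternate_effect.short_description)  # the wrapped substitution
print(effect.aa_ref, effect.aa_alt)
```

With this shape, callers that only care about the start loss keep seeing a StartLoss, while callers that want the coding change can reach it through alternate_effect or the delegated aa_ref and aa_alt.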

ijhoskins commented 5 years ago

I concur with @joaoe's suggestion above. It'd be nice to have the aa_alt attribute for this effect class, instead of only the p.M1? annotation.