EBIvariation / CMAT

ClinVar Mapping and Annotation Toolkit
Apache License 2.0

Add transcripts to consequences #405

Closed apriltuesday closed 6 months ago

apriltuesday commented 6 months ago

When requested via a command-line flag, adds the Ensembl transcript IDs of every transcript that carries the most severe consequence in each overlapping gene (or the overall most severe consequence if the variant overlaps no genes).
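
The selection rule described above could be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function name `most_severe_transcripts`, the input tuple layout, the tiny `SEVERITY_RANK` table, and the placeholder ID `ENST_EXAMPLE_INTRONIC` are all assumptions made for the example. In particular, dropping gene-less annotations when at least one gene overlaps is only one possible reading of "if no genes overlap".

```python
from collections import defaultdict

# Illustrative subset of consequence terms, ranked from most to least severe.
# Hypothetical table for this sketch only; a real ranking covers many more terms.
SEVERITY_RANK = {
    "stop_gained": 0,
    "frameshift_variant": 1,
    "missense_variant": 2,
    "synonymous_variant": 3,
    "intron_variant": 4,
}


def most_severe_transcripts(annotations):
    """Map each gene ID to the transcripts carrying its most severe consequence.

    `annotations` is an iterable of (gene_id, transcript_id, consequence_term)
    tuples, where gene_id is None when no gene overlaps the variant.
    """
    groups = defaultdict(list)
    for gene_id, transcript_id, term in annotations:
        groups[gene_id].append((transcript_id, term))

    # Assumption: if any gene overlaps, report per gene only; otherwise the
    # single None group yields the overall most severe consequence.
    if any(gene_id is not None for gene_id in groups):
        groups.pop(None, None)

    result = {}
    for gene_id, entries in groups.items():
        best = min(SEVERITY_RANK[term] for _, term in entries)
        result[gene_id] = sorted(
            transcript_id
            for transcript_id, term in entries
            if SEVERITY_RANK[term] == best
        )
    return result


example = [
    ("ENSG00000169919", "ENST00000304895", "missense_variant"),
    ("ENSG00000169919", "ENST00000421103", "missense_variant"),
    # Placeholder transcript ID, included only to show a less severe hit being dropped.
    ("ENSG00000169919", "ENST_EXAMPLE_INTRONIC", "intron_variant"),
]
selected = most_severe_transcripts(example)
```

Here `selected` keeps both missense transcripts for the gene and discards the intronic one, which mirrors the two `AttributeSet` elements in the example output.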

Example of output XML (compare ClinVar record):

<Measure Type="single nucleotide variant" ID="15941">
  <Name>
    <ElementValue Type="Preferred">NM_000181.4(GUSB):c.1730G&gt;T (p.Arg577Leu)</ElementValue>
  </Name>
  <SequenceLocation Assembly="GRCh38" AssemblyAccessionVersion="GCF_000001405.38" AssemblyStatus="current" Chr="7" Accession="NC_000007.14" start="65964382" stop="65964382" display_start="65964382" display_stop="65964382" variantLength="1" positionVCF="65964382" referenceAlleleVCF="C" alternateAlleleVCF="A"/>
  ...
  [snip]
  ...
  <AttributeSet providedBy="CMAT">
    <Attribute Type="MolecularConsequence">missense variant</Attribute>
    <XRef ID="SO:0001583" DB="Sequence Ontology"/>
    <XRef ID="ENSG00000169919" DB="Ensembl Gene"/>
    <XRef ID="ENST00000304895" DB="Ensembl Transcript"/>
  </AttributeSet>
  <AttributeSet providedBy="CMAT">
    <Attribute Type="MolecularConsequence">missense variant</Attribute>
    <XRef ID="SO:0001583" DB="Sequence Ontology"/>
    <XRef ID="ENSG00000169919" DB="Ensembl Gene"/>
    <XRef ID="ENST00000421103" DB="Ensembl Transcript"/>
  </AttributeSet>
</Measure>
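
For anyone consuming this output, the per-transcript `AttributeSet` elements can be regrouped by gene and consequence with the standard library. The following is a rough sketch under stated assumptions: `SAMPLE` is a trimmed copy of the record above (the snipped part and `SequenceLocation` are left out so that it parses), and `transcripts_by_gene` is a hypothetical helper, not part of CMAT.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed copy of the example record, containing only the CMAT-provided sets.
SAMPLE = """\
<Measure Type="single nucleotide variant" ID="15941">
  <AttributeSet providedBy="CMAT">
    <Attribute Type="MolecularConsequence">missense variant</Attribute>
    <XRef ID="SO:0001583" DB="Sequence Ontology"/>
    <XRef ID="ENSG00000169919" DB="Ensembl Gene"/>
    <XRef ID="ENST00000304895" DB="Ensembl Transcript"/>
  </AttributeSet>
  <AttributeSet providedBy="CMAT">
    <Attribute Type="MolecularConsequence">missense variant</Attribute>
    <XRef ID="SO:0001583" DB="Sequence Ontology"/>
    <XRef ID="ENSG00000169919" DB="Ensembl Gene"/>
    <XRef ID="ENST00000421103" DB="Ensembl Transcript"/>
  </AttributeSet>
</Measure>
"""


def transcripts_by_gene(measure):
    """Group transcript IDs by (Ensembl gene ID, consequence name)."""
    grouped = defaultdict(list)
    for attr_set in measure.findall("AttributeSet[@providedBy='CMAT']"):
        attribute = attr_set.find("Attribute[@Type='MolecularConsequence']")
        if attribute is None:
            continue  # not a consequence annotation
        # Index the XRefs of this set by their DB attribute.
        xrefs = {xref.get("DB"): xref.get("ID") for xref in attr_set.findall("XRef")}
        transcript_id = xrefs.get("Ensembl Transcript")
        if transcript_id is not None:
            key = (xrefs.get("Ensembl Gene"), attribute.text)
            grouped[key].append(transcript_id)
    return dict(grouped)


grouped = transcripts_by_gene(ET.fromstring(SAMPLE))
```

Sets written without the transcript flag have no `Ensembl Transcript` XRef, so this sketch simply skips them.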