openvar / variantValidator

Public repository for VariantValidator project
GNU Affero General Public License v3.0

Exon numbering - Gather user requirements #251

Closed i3hsInnovation closed 2 years ago

i3hsInnovation commented 3 years ago

We have had several requests to provide exon numbering. I have always avoided this because it is a total minefield.

Although this seems like a trivial thing to do, it is actually hugely complicated because of how people like to number exons. For example,

I would like to gather user requirements around this.

Peter-J-Freeman commented 3 years ago

I was thinking about this as well @ifokkema. The start and end positions are already mapped. It would then just be a case of adding the exon boundaries into the existing dictionary.

I agree that this would be simpler. VariantValidator could then just return the exon position as agreed:

    "primary_assembly_loci": {
      "grch37": {
        "hgvs_genomic_description": "NC_000017.10:g.48275363C>A",
        "vcf": {
          "alt": "A",
          "chr": "17",
          "pos": "48275363",
          "ref": "C"
        },
        "exonic_location": {
          "start": "X/Y",
          "end": "X/Y"
        },

Then to genes2transcripts I'll add the exon structures too so that you can do whatever the hell you want with it ;). Definitely agree this makes more sense given the current function available.
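A minimal sketch of how the proposed `exonic_location` values could be derived from an exon structure. It rests on assumptions not confirmed in this thread: that "X/Y" means exon number over total exon count, that each exon is a row of `[tx_start, tx_end, geno_start, geno_end]` in transcript order, and that coordinates are 1-based and inclusive. The function names and the example coordinates are hypothetical and are not part of VariantValidator.

```python
from typing import Dict, List, Optional

# Hypothetical exon rows: [tx_start, tx_end, geno_start, geno_end],
# listed in transcript order. Coordinates are assumed 1-based, inclusive.
Exon = List[int]


def exon_label(tx_pos: int, exons: List[Exon]) -> Optional[str]:
    """Return 'X/Y' (exon number / exon count) for a transcript position,
    or None when the position lies outside every exon."""
    total = len(exons)
    for number, (tx_start, tx_end, _geno_start, _geno_end) in enumerate(exons, start=1):
        if tx_start <= tx_pos <= tx_end:
            return f"{number}/{total}"
    return None


def exonic_location(start_pos: int, end_pos: int, exons: List[Exon]) -> Dict[str, Optional[str]]:
    """Build a dict shaped like the proposed 'exonic_location' entry."""
    return {
        "start": exon_label(start_pos, exons),
        "end": exon_label(end_pos, exons),
    }


# Made-up exon structure, for illustration only.
example_exons = [
    [1, 100, 5000, 5099],
    [101, 250, 6000, 6149],
    [251, 400, 7000, 7149],
]

print(exonic_location(120, 260, example_exons))
# {'start': '2/3', 'end': '3/3'}
```

Returning `None` for positions outside every exon is a placeholder choice in this sketch; how intronic or flanking positions should be reported is exactly the kind of user requirement this issue is trying to gather.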

Peter-J-Freeman commented 3 years ago

The genes2transcripts endpoint then becomes:

    {
      "current_name": "collagen type I alpha 1 chain",
      "current_symbol": "COL1A1",
      "previous_name": "collagen, type I, alpha 1",
      "previous_symbol": "COL1A1",
      "transcripts": [
        {
          "coding_end": 4521,
          "coding_start": 127,
          "description": "Homo sapiens collagen type I alpha 1 chain (COL1A1), mRNA",
          "genomic_spans": {
            "NC_000017.10": {
              "end_position": 48279000,
              "start_position": 48261457,
              "exonic_structure":
                [[tx_start, tx_end, geno_start, geno_end], [............]]
            },

Peter-J-Freeman commented 3 years ago

This is definitely my preference. It is also very easy to code up. All OK with that?

If so I will upgrade genes2transcripts today because I want to code for a while. Been doing too much teaching 👍

Peter-J-Freeman commented 3 years ago

Adding the exon structure has been picked up here: https://github.com/openvar/variantValidator/issues/253

@beboche and @ifokkema please take a look and provide comments.

ifokkema commented 3 years ago

Then to genes2transcripts I'll add the exon structures too so that you can do whatever the hell you want with it ;). Definitely agree this makes more sense given the current function available.

Excellent!

If so I will upgrade genes2transcripts today because I want to code for a while. Been doing too much teaching 👍

Ah haha, I hope you had fun 😄