openvar / variantValidator

Public repository for VariantValidator project
GNU Affero General Public License v3.0

Strange mapping for a variant #306

Closed Peter-J-Freeman closed 3 years ago

Peter-J-Freeman commented 3 years ago

Describe the bug

Hi Variant Validator, I think I am seeing a bug on the web interface. This is the VCF that came off our GRCh37-based pipeline: 2:21232802:ATG:ACA. When I enter it into the VV web tool I see the following warning: "NC_000002.11:g.21232802ATG>ACA automapped to NC_000002.11:g.21232803_21232804inv", which makes sense to me relative to the GRCh37 sequence. However, I get the following HGVS descriptions, which are for an entirely different part of the genome:

- Transcript (:c.): NM_000384.2:c.11788G=
- Protein (:p.): NP_000375.2:p.(Val3930=)
- NC_000002.11:g.21227952C= (GRCh37:2:21227952:C:C)

I am not sure how I got from 2:21232803 to 2:21227952. I also tried the pseudo-VCF 2:21232803:TG:CA with the same result. So far this is the only variant I see this issue with.
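The automapping in the warning can be checked by hand: after trimming the shared leading base, the change is TG>CA, and CA is the reverse complement of TG, which is what an inversion describes. The following is a minimal sketch of that check only. The helper names are hypothetical and this is not VariantValidator's implementation.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def trim_common_prefix(pos: int, ref: str, alt: str):
    """Drop bases shared at the start of ref and alt, shifting pos to match."""
    while ref and alt and ref[0] == alt[0]:
        ref, alt, pos = ref[1:], alt[1:], pos + 1
    return pos, ref, alt


# Pseudo-VCF from the report: 2:21232802:ATG:ACA (GRCh37)
pos, ref, alt = trim_common_prefix(21232802, "ATG", "ACA")

# A multi-base substitution whose alt is the reverse complement of its ref
# can be written as an inversion over the same interval.
is_inversion = len(ref) > 1 and alt == reverse_complement(ref)

print(pos, pos + len(ref) - 1, ref, alt, is_inversion)
# 21232803 21232804 TG CA True
```

This agrees with the g.21232803_21232804inv in the warning, so the genomic normalisation step looks consistent with the input.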

Peter-J-Freeman commented 3 years ago

I've finally had a chance to start looking at this.

It seems that there might be something funky with GRCh37 in the middle of this gene with respect to the .2 version of the transcript:

    {
      "coding_end": 13820,
      "coding_start": 129,
      "description": "Homo sapiens apolipoprotein B (APOB), mRNA",
      "genomic_spans": {
        "NC_000002.11": {
          "end_position": 21266945,
          "exon_structure": [
            {
              "cigar": "210=",
              "exon_number": 29,
              "genomic_end": 21266945,
              "genomic_start": 21266736,
              "transcript_end": 210,
              "transcript_start": 1
            },
            {
              "cigar": "39=",
              "exon_number": 28,
              "genomic_end": 21266423,
              "genomic_start": 21266385,
              "transcript_end": 249,
              "transcript_start": 211
            },
            {
              "cigar": "116=",
              "exon_number": 27,
              "genomic_end": 21265348,
              "genomic_start": 21265233,
              "transcript_end": 365,
              "transcript_start": 250
            },
            {
              "cigar": "146=",
              "exon_number": 26,
              "genomic_end": 21263955,
              "genomic_start": 21263810,
              "transcript_end": 511,
              "transcript_start": 366
            },
            {
              "cigar": "154=",
              "exon_number": 25,
              "genomic_end": 21260983,
              "genomic_start": 21260830,
              "transcript_end": 665,
              "transcript_start": 512
            },
            {
              "cigar": "156=",
              "exon_number": 24,
              "genomic_end": 21260127,
              "genomic_start": 21259972,
              "transcript_end": 821,
              "transcript_start": 666
            },
            {
              "cigar": "125=",
              "exon_number": 23,
              "genomic_end": 21258580,
              "genomic_start": 21258456,
              "transcript_end": 946,
              "transcript_start": 822
            },
            {
              "cigar": "86=",
              "exon_number": 22,
              "genomic_end": 21257773,
              "genomic_start": 21257688,
              "transcript_end": 1032,
              "transcript_start": 947
            },
            {
              "cigar": "220=",
              "exon_number": 21,
              "genomic_end": 21256390,
              "genomic_start": 21256171,
              "transcript_end": 1252,
              "transcript_start": 1033
            },
            {
              "cigar": "228=",
              "exon_number": 20,
              "genomic_end": 21255453,
              "genomic_start": 21255226,
              "transcript_end": 1480,
              "transcript_start": 1253
            },
            {
              "cigar": "118=",
              "exon_number": 19,
              "genomic_end": 21252887,
              "genomic_start": 21252770,
              "transcript_end": 1598,
              "transcript_start": 1481
            },
            {
              "cigar": "147=",
              "exon_number": 18,
              "genomic_end": 21252657,
              "genomic_start": 21252511,
              "transcript_end": 1745,
              "transcript_start": 1599
            },
            {
              "cigar": "212=",
              "exon_number": 17,
              "genomic_end": 21251410,
              "genomic_start": 21251199,
              "transcript_end": 1957,
              "transcript_start": 1746
            },
            {
              "cigar": "238=",
              "exon_number": 16,
              "genomic_end": 21250937,
              "genomic_start": 21250700,
              "transcript_end": 2195,
              "transcript_start": 1958
            },
            {
              "cigar": "177=",
              "exon_number": 15,
              "genomic_end": 21249836,
              "genomic_start": 21249660,
              "transcript_end": 2372,
              "transcript_start": 2196
            },
            {
              "cigar": "192=",
              "exon_number": 14,
              "genomic_end": 21247996,
              "genomic_start": 21247805,
              "transcript_end": 2564,
              "transcript_start": 2373
            },
            {
              "cigar": "168=",
              "exon_number": 13,
              "genomic_end": 21246564,
              "genomic_start": 21246397,
              "transcript_end": 2732,
              "transcript_start": 2565
            },
            {
              "cigar": "212=",
              "exon_number": 12,
              "genomic_end": 21245914,
              "genomic_start": 21245703,
              "transcript_end": 2944,
              "transcript_start": 2733
            },
            {
              "cigar": "183=",
              "exon_number": 11,
              "genomic_end": 21242777,
              "genomic_start": 21242595,
              "transcript_end": 3127,
              "transcript_start": 2945
            },
            {
              "cigar": "122=",
              "exon_number": 10,
              "genomic_end": 21241985,
              "genomic_start": 21241864,
              "transcript_end": 3249,
              "transcript_start": 3128
            },
            {
              "cigar": "211=",
              "exon_number": 9,
              "genomic_end": 21239521,
              "genomic_start": 21239311,
              "transcript_end": 3460,
              "transcript_start": 3250
            },
            {
              "cigar": "176=",
              "exon_number": 8,
              "genomic_end": 21238417,
              "genomic_start": 21238242,
              "transcript_end": 3636,
              "transcript_start": 3461
            },
            {
              "cigar": "188=",
              "exon_number": 7,
              "genomic_end": 21238132,
              "genomic_start": 21237945,
              "transcript_end": 3824,
              "transcript_start": 3637
            },
            {
              "cigar": "146=",
              "exon_number": 6,
              "genomic_end": 21237465,
              "genomic_start": 21237320,
              "transcript_end": 3970,
              "transcript_start": 3825
            },
            {
              "cigar": "374=",
              "exon_number": 5,
              "genomic_end": 21236405,
              "genomic_start": 21236032,
              "transcript_end": 4344,
              "transcript_start": 3971
            },
            {
              "cigar": "48=1X2671=1X4851=",
              "exon_number": 4,
              "genomic_end": 21235523,
              "genomic_start": 21227952,
              "transcript_end": 11916,
              "transcript_start": 4345
            },
            {
              "cigar": "115=",
              "exon_number": 3,
              "genomic_end": 21227547,
              "genomic_start": 21227433,
              "transcript_end": 12031,
              "transcript_start": 11917
            },
            {
              "cigar": "184=",
              "exon_number": 2,
              "genomic_end": 21227324,
              "genomic_start": 21227141,
              "transcript_end": 12215,
              "transcript_start": 12032
            },
            {
              "cigar": "1906=",
              "exon_number": 1,
              "genomic_end": 21226206,
              "genomic_start": 21224301,
              "transcript_end": 14121,
              "transcript_start": 12216
            }
          ],
          "orientation": -1,
          "start_position": 21224301,
          "total_exons": 29

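See the CIGAR string in exon 4: "48=1X2671=1X4851=" contains two mismatched bases (the two 1X operations) between the transcript and GRCh37.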
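To see where those mismatches fall on the genome, the exon 4 CIGAR can be expanded into offsets and converted to GRCh37 coordinates. The sketch below uses only the values in the record above. It assumes the CIGAR is written in transcript direction, so that with orientation -1 offset 0 corresponds to genomic_end; that orientation is an assumption and is not confirmed anywhere in this thread. The function names are hypothetical.

```python
import re


def cigar_ops(cigar: str):
    """Split a CIGAR string such as '48=1X2671=' into (length, op) pairs."""
    return [(int(n), op) for n, op in re.findall(r"(\d+)([=XIDM])", cigar)]


def mismatch_offsets(cigar: str):
    """Return (0-based offsets of X columns, total aligned length).

    Only '=' and 'X' are handled, which is all exon 4 contains.
    """
    offsets, pos = [], 0
    for length, op in cigar_ops(cigar):
        if op not in "=X":
            raise ValueError(f"unsupported CIGAR operation: {op}")
        if op == "X":
            offsets.extend(range(pos, pos + length))
        pos += length
    return offsets, pos


# Exon 4 of NM_000384.2 on NC_000002.11, copied from the record above.
cigar = "48=1X2671=1X4851="
genomic_start, genomic_end = 21227952, 21235523

offsets, aligned_len = mismatch_offsets(cigar)
# Sanity check: the CIGAR must span the whole exon.
assert aligned_len == genomic_end - genomic_start + 1

# Assumption: transcript-direction CIGAR on the minus strand.
mismatch_g = [genomic_end - off for off in offsets]

# Interval from the warning: g.21232803_21232804inv
variant = range(21232803, 21232804 + 1)
overlap = [g for g in mismatch_g if g in variant]

print(offsets, mismatch_g, overlap)
# [48, 2720] [21235475, 21232803] [21232803]
```

Under that assumption, the second mismatch sits at g.21232803, inside the reported inversion, which would make this variant an edge case where the genomic reference and the transcript reference disagree at the variant position.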

The alignment is cleaner for GRCh38, where exon 4 has a single mismatch ("2720=1X4851="):

        "NC_000002.12": {
          "end_position": 21044073,
          "exon_structure": [
            {
              "cigar": "210=",
              "exon_number": 29,
              "genomic_end": 21044073,
              "genomic_start": 21043864,
              "transcript_end": 210,
              "transcript_start": 1
            },
            {
              "cigar": "39=",
              "exon_number": 28,
              "genomic_end": 21043551,
              "genomic_start": 21043513,
              "transcript_end": 249,
              "transcript_start": 211
            },
            {
              "cigar": "116=",
              "exon_number": 27,
              "genomic_end": 21042476,
              "genomic_start": 21042361,
              "transcript_end": 365,
              "transcript_start": 250
            },
            {
              "cigar": "146=",
              "exon_number": 26,
              "genomic_end": 21041083,
              "genomic_start": 21040938,
              "transcript_end": 511,
              "transcript_start": 366
            },
            {
              "cigar": "154=",
              "exon_number": 25,
              "genomic_end": 21038111,
              "genomic_start": 21037958,
              "transcript_end": 665,
              "transcript_start": 512
            },
            {
              "cigar": "156=",
              "exon_number": 24,
              "genomic_end": 21037255,
              "genomic_start": 21037100,
              "transcript_end": 821,
              "transcript_start": 666
            },
            {
              "cigar": "125=",
              "exon_number": 23,
              "genomic_end": 21035708,
              "genomic_start": 21035584,
              "transcript_end": 946,
              "transcript_start": 822
            },
            {
              "cigar": "86=",
              "exon_number": 22,
              "genomic_end": 21034901,
              "genomic_start": 21034816,
              "transcript_end": 1032,
              "transcript_start": 947
            },
            {
              "cigar": "220=",
              "exon_number": 21,
              "genomic_end": 21033518,
              "genomic_start": 21033299,
              "transcript_end": 1252,
              "transcript_start": 1033
            },
            {
              "cigar": "228=",
              "exon_number": 20,
              "genomic_end": 21032581,
              "genomic_start": 21032354,
              "transcript_end": 1480,
              "transcript_start": 1253
            },
            {
              "cigar": "118=",
              "exon_number": 19,
              "genomic_end": 21030015,
              "genomic_start": 21029898,
              "transcript_end": 1598,
              "transcript_start": 1481
            },
            {
              "cigar": "147=",
              "exon_number": 18,
              "genomic_end": 21029785,
              "genomic_start": 21029639,
              "transcript_end": 1745,
              "transcript_start": 1599
            },
            {
              "cigar": "212=",
              "exon_number": 17,
              "genomic_end": 21028538,
              "genomic_start": 21028327,
              "transcript_end": 1957,
              "transcript_start": 1746
            },
            {
              "cigar": "238=",
              "exon_number": 16,
              "genomic_end": 21028065,
              "genomic_start": 21027828,
              "transcript_end": 2195,
              "transcript_start": 1958
            },
            {
              "cigar": "177=",
              "exon_number": 15,
              "genomic_end": 21026964,
              "genomic_start": 21026788,
              "transcript_end": 2372,
              "transcript_start": 2196
            },
            {
              "cigar": "192=",
              "exon_number": 14,
              "genomic_end": 21025124,
              "genomic_start": 21024933,
              "transcript_end": 2564,
              "transcript_start": 2373
            },
            {
              "cigar": "168=",
              "exon_number": 13,
              "genomic_end": 21023692,
              "genomic_start": 21023525,
              "transcript_end": 2732,
              "transcript_start": 2565
            },
            {
              "cigar": "212=",
              "exon_number": 12,
              "genomic_end": 21023042,
              "genomic_start": 21022831,
              "transcript_end": 2944,
              "transcript_start": 2733
            },
            {
              "cigar": "183=",
              "exon_number": 11,
              "genomic_end": 21019905,
              "genomic_start": 21019723,
              "transcript_end": 3127,
              "transcript_start": 2945
            },
            {
              "cigar": "122=",
              "exon_number": 10,
              "genomic_end": 21019113,
              "genomic_start": 21018992,
              "transcript_end": 3249,
              "transcript_start": 3128
            },
            {
              "cigar": "211=",
              "exon_number": 9,
              "genomic_end": 21016649,
              "genomic_start": 21016439,
              "transcript_end": 3460,
              "transcript_start": 3250
            },
            {
              "cigar": "176=",
              "exon_number": 8,
              "genomic_end": 21015545,
              "genomic_start": 21015370,
              "transcript_end": 3636,
              "transcript_start": 3461
            },
            {
              "cigar": "188=",
              "exon_number": 7,
              "genomic_end": 21015260,
              "genomic_start": 21015073,
              "transcript_end": 3824,
              "transcript_start": 3637
            },
            {
              "cigar": "146=",
              "exon_number": 6,
              "genomic_end": 21014593,
              "genomic_start": 21014448,
              "transcript_end": 3970,
              "transcript_start": 3825
            },
            {
              "cigar": "374=",
              "exon_number": 5,
              "genomic_end": 21013533,
              "genomic_start": 21013160,
              "transcript_end": 4344,
              "transcript_start": 3971
            },
            {
              "cigar": "2720=1X4851=",
              "exon_number": 4,
              "genomic_end": 21012651,
              "genomic_start": 21005080,
              "transcript_end": 11916,
              "transcript_start": 4345
            },
            {
              "cigar": "115=",
              "exon_number": 3,
              "genomic_end": 21004675,
              "genomic_start": 21004561,
              "transcript_end": 12031,
              "transcript_start": 11917
            },
            {
              "cigar": "184=",
              "exon_number": 2,
              "genomic_end": 21004452,
              "genomic_start": 21004269,
              "transcript_end": 12215,
              "transcript_start": 12032
            },
            {
              "cigar": "1906=",
              "exon_number": 1,
              "genomic_end": 21003334,
              "genomic_start": 21001429,
              "transcript_end": 14121,
              "transcript_start": 12216
            }
          ],
          "orientation": -1,
          "start_position": 21001429,
          "total_exons": 29
        },
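The reported c.11788 can also be traced through the exon 4 records above. The sketch below is plain coordinate arithmetic, not VariantValidator code. It assumes that coding_start is the 1-based transcript position of c.1, and that exon 4 has no insertions or deletions (its CIGAR only contains = and X), so positions map linearly on the minus strand. The constants and function names are hypothetical.

```python
CODING_START = 129        # "coding_start" in the record above
EXON4_N = (4345, 11916)   # transcript_start, transcript_end of exon 4
EXON4_G = {               # genomic_start, genomic_end of exon 4
    "NC_000002.11": (21227952, 21235523),
    "NC_000002.12": (21005080, 21012651),
}


def c_to_n(c_pos: int) -> int:
    """Coding position (c.) to transcript position (n.), CDS positions only."""
    return c_pos + CODING_START - 1


def n_to_c(n_pos: int) -> int:
    """Transcript position (n.) to coding position (c.), CDS positions only."""
    return n_pos - CODING_START + 1


def n_to_g(n_pos: int, chrom: str) -> int:
    """Transcript position to genomic position within exon 4 (minus strand)."""
    n_start, n_end = EXON4_N
    if not n_start <= n_pos <= n_end:
        raise ValueError("position is outside exon 4")
    _, g_end = EXON4_G[chrom]
    return g_end - (n_pos - n_start)


def g_to_n(g_pos: int, chrom: str) -> int:
    """Genomic position within exon 4 to transcript position (minus strand)."""
    g_start, g_end = EXON4_G[chrom]
    if not g_start <= g_pos <= g_end:
        raise ValueError("position is outside exon 4")
    return EXON4_N[0] + (g_end - g_pos)


reported_n = c_to_n(11788)
reported_g37 = n_to_g(reported_n, "NC_000002.11")
reported_g38 = n_to_g(reported_n, "NC_000002.12")

# Coding positions covered by the submitted interval g.21232803_21232804
variant_c = [n_to_c(g_to_n(g, "NC_000002.11")) for g in (21232804, 21232803)]

print(reported_n, reported_g37, reported_g38, variant_c)
# 11916 21227952 21005080 [6936, 6937]
```

On these assumptions, c.11788 is the last transcript base of exon 4 (n.11916), which corresponds to g.21227952 on GRCh37, whereas the submitted positions fall at c.6936_6937. The unexpected output therefore points at the exon 4 boundary rather than at an unrelated locus.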

The .3 version of the transcript is now the MANE Select transcript:

            "coding_end": 13820,
            "coding_start": 129,
            "description": "Homo sapiens apolipoprotein B (APOB), mRNA",
            "genomic_spans": {
                "NC_000002.11": {
                    "end_position": 21266945,
                    "exon_structure": [
                        {
                            "cigar": "210=",
                            "exon_number": 29,
                            "genomic_end": 21266945,
                            "genomic_start": 21266736,
                            "transcript_end": 210,
                            "transcript_start": 1
                        },
                        {
                            "cigar": "39=",
                            "exon_number": 28,
                            "genomic_end": 21266423,
                            "genomic_start": 21266385,
                            "transcript_end": 249,
                            "transcript_start": 211
                        },
                        {
                            "cigar": "116=",
                            "exon_number": 27,
                            "genomic_end": 21265348,
                            "genomic_start": 21265233,
                            "transcript_end": 365,
                            "transcript_start": 250
                        },
                        {
                            "cigar": "146=",
                            "exon_number": 26,
                            "genomic_end": 21263955,
                            "genomic_start": 21263810,
                            "transcript_end": 511,
                            "transcript_start": 366
                        },
                        {
                            "cigar": "154=",
                            "exon_number": 25,
                            "genomic_end": 21260983,
                            "genomic_start": 21260830,
                            "transcript_end": 665,
                            "transcript_start": 512
                        },
                        {
                            "cigar": "156=",
                            "exon_number": 24,
                            "genomic_end": 21260127,
                            "genomic_start": 21259972,
                            "transcript_end": 821,
                            "transcript_start": 666
                        },
                        {
                            "cigar": "125=",
                            "exon_number": 23,
                            "genomic_end": 21258580,
                            "genomic_start": 21258456,
                            "transcript_end": 946,
                            "transcript_start": 822
                        },
                        {
                            "cigar": "86=",
                            "exon_number": 22,
                            "genomic_end": 21257773,
                            "genomic_start": 21257688,
                            "transcript_end": 1032,
                            "transcript_start": 947
                        },
                        {
                            "cigar": "220=",
                            "exon_number": 21,
                            "genomic_end": 21256390,
                            "genomic_start": 21256171,
                            "transcript_end": 1252,
                            "transcript_start": 1033
                        },
                        {
                            "cigar": "228=",
                            "exon_number": 20,
                            "genomic_end": 21255453,
                            "genomic_start": 21255226,
                            "transcript_end": 1480,
                            "transcript_start": 1253
                        },
                        {
                            "cigar": "118=",
                            "exon_number": 19,
                            "genomic_end": 21252887,
                            "genomic_start": 21252770,
                            "transcript_end": 1598,
                            "transcript_start": 1481
                        },
                        {
                            "cigar": "147=",
                            "exon_number": 18,
                            "genomic_end": 21252657,
                            "genomic_start": 21252511,
                            "transcript_end": 1745,
                            "transcript_start": 1599
                        },
                        {
                            "cigar": "212=",
                            "exon_number": 17,
                            "genomic_end": 21251410,
                            "genomic_start": 21251199,
                            "transcript_end": 1957,
                            "transcript_start": 1746
                        },
                        {
                            "cigar": "238=",
                            "exon_number": 16,
                            "genomic_end": 21250937,
                            "genomic_start": 21250700,
                            "transcript_end": 2195,
                            "transcript_start": 1958
                        },
                        {
                            "cigar": "177=",
                            "exon_number": 15,
                            "genomic_end": 21249836,
                            "genomic_start": 21249660,
                            "transcript_end": 2372,
                            "transcript_start": 2196
                        },
                        {
                            "cigar": "192=",
                            "exon_number": 14,
                            "genomic_end": 21247996,
                            "genomic_start": 21247805,
                            "transcript_end": 2564,
                            "transcript_start": 2373
                        },
                        {
                            "cigar": "168=",
                            "exon_number": 13,
                            "genomic_end": 21246564,
                            "genomic_start": 21246397,
                            "transcript_end": 2732,
                            "transcript_start": 2565
                        },
                        {
                            "cigar": "212=",
                            "exon_number": 12,
                            "genomic_end": 21245914,
                            "genomic_start": 21245703,
                            "transcript_end": 2944,
                            "transcript_start": 2733
                        },
                        {
                            "cigar": "183=",
                            "exon_number": 11,
                            "genomic_end": 21242777,
                            "genomic_start": 21242595,
                            "transcript_end": 3127,
                            "transcript_start": 2945
                        },
                        {
                            "cigar": "122=",
                            "exon_number": 10,
                            "genomic_end": 21241985,
                            "genomic_start": 21241864,
                            "transcript_end": 3249,
                            "transcript_start": 3128
                        },
                        {
                            "cigar": "211=",
                            "exon_number": 9,
                            "genomic_end": 21239521,
                            "genomic_start": 21239311,
                            "transcript_end": 3460,
                            "transcript_start": 3250
                        },
                        {
                            "cigar": "176=",
                            "exon_number": 8,
                            "genomic_end": 21238417,
                            "genomic_start": 21238242,
                            "transcript_end": 3636,
                            "transcript_start": 3461
                        },
                        {
                            "cigar": "188=",
                            "exon_number": 7,
                            "genomic_end": 21238132,
                            "genomic_start": 21237945,
                            "transcript_end": 3824,
                            "transcript_start": 3637
                        },
                        {
                            "cigar": "146=",
                            "exon_number": 6,
                            "genomic_end": 21237465,
                            "genomic_start": 21237320,
                            "transcript_end": 3970,
                            "transcript_start": 3825
                        },
                        {
                            "cigar": "374=",
                            "exon_number": 5,
                            "genomic_end": 21236405,
                            "genomic_start": 21236032,
                            "transcript_end": 4344,
                            "transcript_start": 3971
                        },
                        {
                            "cigar": "48=1X7523=",
                            "exon_number": 4,
                            "genomic_end": 21235523,
                            "genomic_start": 21227952,
                            "transcript_end": 11916,
                            "transcript_start": 4345
                        },
                        {
                            "cigar": "115=",
                            "exon_number": 3,
                            "genomic_end": 21227547,
                            "genomic_start": 21227433,
                            "transcript_end": 12031,
                            "transcript_start": 11917
                        },
                        {
                            "cigar": "184=",
                            "exon_number": 2,
                            "genomic_end": 21227324,
                            "genomic_start": 21227141,
                            "transcript_end": 12215,
                            "transcript_start": 12032
                        },
                        {
                            "cigar": "1906=",
                            "exon_number": 1,
                            "genomic_end": 21226206,
                            "genomic_start": 21224301,
                            "transcript_end": 14121,
                            "transcript_start": 12216
                        }
                    ],
                    "orientation": -1,
                    "start_position": 21224301,
                    "total_exons": 29
                },
                "NC_000002.12": {
                    "end_position": 21044073,
                    "exon_structure": [
                        {
                            "cigar": "210=",
                            "exon_number": 29,
                            "genomic_end": 21044073,
                            "genomic_start": 21043864,
                            "transcript_end": 210,
                            "transcript_start": 1
                        },
                        {
                            "cigar": "39=",
                            "exon_number": 28,
                            "genomic_end": 21043551,
                            "genomic_start": 21043513,
                            "transcript_end": 249,
                            "transcript_start": 211
                        },
                        {
                            "cigar": "116=",
                            "exon_number": 27,
                            "genomic_end": 21042476,
                            "genomic_start": 21042361,
                            "transcript_end": 365,
                            "transcript_start": 250
                        },
                        {
                            "cigar": "146=",
                            "exon_number": 26,
                            "genomic_end": 21041083,
                            "genomic_start": 21040938,
                            "transcript_end": 511,
                            "transcript_start": 366
                        },
                        {
                            "cigar": "154=",
                            "exon_number": 25,
                            "genomic_end": 21038111,
                            "genomic_start": 21037958,
                            "transcript_end": 665,
                            "transcript_start": 512
                        },
                        {
                            "cigar": "156=",
                            "exon_number": 24,
                            "genomic_end": 21037255,
                            "genomic_start": 21037100,
                            "transcript_end": 821,
                            "transcript_start": 666
                        },
                        {
                            "cigar": "125=",
                            "exon_number": 23,
                            "genomic_end": 21035708,
                            "genomic_start": 21035584,
                            "transcript_end": 946,
                            "transcript_start": 822
                        },
                        {
                            "cigar": "86=",
                            "exon_number": 22,
                            "genomic_end": 21034901,
                            "genomic_start": 21034816,
                            "transcript_end": 1032,
                            "transcript_start": 947
                        },
                        {
                            "cigar": "220=",
                            "exon_number": 21,
                            "genomic_end": 21033518,
                            "genomic_start": 21033299,
                            "transcript_end": 1252,
                            "transcript_start": 1033
                        },
                        {
                            "cigar": "228=",
                            "exon_number": 20,
                            "genomic_end": 21032581,
                            "genomic_start": 21032354,
                            "transcript_end": 1480,
                            "transcript_start": 1253
                        },
                        {
                            "cigar": "118=",
                            "exon_number": 19,
                            "genomic_end": 21030015,
                            "genomic_start": 21029898,
                            "transcript_end": 1598,
                            "transcript_start": 1481
                        },
                        {
                            "cigar": "147=",
                            "exon_number": 18,
                            "genomic_end": 21029785,
                            "genomic_start": 21029639,
                            "transcript_end": 1745,
                            "transcript_start": 1599
                        },
                        {
                            "cigar": "212=",
                            "exon_number": 17,
                            "genomic_end": 21028538,
                            "genomic_start": 21028327,
                            "transcript_end": 1957,
                            "transcript_start": 1746
                        },
                        {
                            "cigar": "238=",
                            "exon_number": 16,
                            "genomic_end": 21028065,
                            "genomic_start": 21027828,
                            "transcript_end": 2195,
                            "transcript_start": 1958
                        },
                        {
                            "cigar": "177=",
                            "exon_number": 15,
                            "genomic_end": 21026964,
                            "genomic_start": 21026788,
                            "transcript_end": 2372,
                            "transcript_start": 2196
                        },
                        {
                            "cigar": "192=",
                            "exon_number": 14,
                            "genomic_end": 21025124,
                            "genomic_start": 21024933,
                            "transcript_end": 2564,
                            "transcript_start": 2373
                        },
                        {
                            "cigar": "168=",
                            "exon_number": 13,
                            "genomic_end": 21023692,
                            "genomic_start": 21023525,
                            "transcript_end": 2732,
                            "transcript_start": 2565
                        },
                        {
                            "cigar": "212=",
                            "exon_number": 12,
                            "genomic_end": 21023042,
                            "genomic_start": 21022831,
                            "transcript_end": 2944,
                            "transcript_start": 2733
                        },
                        {
                            "cigar": "183=",
                            "exon_number": 11,
                            "genomic_end": 21019905,
                            "genomic_start": 21019723,
                            "transcript_end": 3127,
                            "transcript_start": 2945
                        },
                        {
                            "cigar": "122=",
                            "exon_number": 10,
                            "genomic_end": 21019113,
                            "genomic_start": 21018992,
                            "transcript_end": 3249,
                            "transcript_start": 3128
                        },
                        {
                            "cigar": "211=",
                            "exon_number": 9,
                            "genomic_end": 21016649,
                            "genomic_start": 21016439,
                            "transcript_end": 3460,
                            "transcript_start": 3250
                        },
                        {
                            "cigar": "176=",
                            "exon_number": 8,
                            "genomic_end": 21015545,
                            "genomic_start": 21015370,
                            "transcript_end": 3636,
                            "transcript_start": 3461
                        },
                        {
                            "cigar": "188=",
                            "exon_number": 7,
                            "genomic_end": 21015260,
                            "genomic_start": 21015073,
                            "transcript_end": 3824,
                            "transcript_start": 3637
                        },
                        {
                            "cigar": "146=",
                            "exon_number": 6,
                            "genomic_end": 21014593,
                            "genomic_start": 21014448,
                            "transcript_end": 3970,
                            "transcript_start": 3825
                        },
                        {
                            "cigar": "374=",
                            "exon_number": 5,
                            "genomic_end": 21013533,
                            "genomic_start": 21013160,
                            "transcript_end": 4344,
                            "transcript_start": 3971
                        },
                        {
                            "cigar": "7572=",
                            "exon_number": 4,
                            "genomic_end": 21012651,
                            "genomic_start": 21005080,
                            "transcript_end": 11916,
                            "transcript_start": 4345
                        },
                        {
                            "cigar": "115=",
                            "exon_number": 3,
                            "genomic_end": 21004675,
                            "genomic_start": 21004561,
                            "transcript_end": 12031,
                            "transcript_start": 11917
                        },
                        {
                            "cigar": "184=",
                            "exon_number": 2,
                            "genomic_end": 21004452,
                            "genomic_start": 21004269,
                            "transcript_end": 12215,
                            "transcript_start": 12032
                        },
                        {
                            "cigar": "1906=",
                            "exon_number": 1,
                            "genomic_end": 21003334,
                            "genomic_start": 21001429,
                            "transcript_end": 14121,
                            "transcript_start": 12216
                        }
                    ],
                    "orientation": -1,
                    "start_position": 21001429,
                    "total_exons": 29
                }
            },
            "length": 14121,
            "reference": "NM_000384.3",
            "translation": "NP_000375.3"

The .3 version still has issues with exon 4 in GRCh37; however, VV seems to do a better job with it.

The exon 4 CIGARs against GRCh37 are: .2 "cigar": "48=1X2671=1X4851=" and .3 "cigar": "48=1X7523="
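To see where these CIGARs place the mismatches, a minimal Python sketch can walk each string and report the 1-based offset of every X column. The function names are hypothetical (not part of VariantValidator). The conversion to c. coordinates rests on three assumptions: the CIGAR is read in transcript orientation, exon 4 starts at transcript position 4345 in both versions (the thread only shows this for .3), and coding_start 129 is the 1-based transcript position of c.1.

```python
import re

# Assumptions taken from the records quoted in this thread (see lead-in).
EXON4_TRANSCRIPT_START = 4345  # 1-based transcript start of exon 4
CODING_START = 129             # assumed 1-based transcript position of c.1


def parse_cigar(cigar):
    """Split a CIGAR made only of '=' and 'X' operations into (length, op) pairs."""
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([=X])", cigar)]
    if "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"unsupported CIGAR operation in {cigar!r}")
    return ops


def mismatch_offsets(cigar):
    """Return the 1-based offsets within the exon of every mismatch (X) column."""
    offsets, pos = [], 0
    for length, op in parse_cigar(cigar):
        if op == "X":
            offsets.extend(range(pos + 1, pos + length + 1))
        pos += length
    return offsets


def offset_to_c(offset):
    """Convert an exon 4 offset to a c. position under the stated assumptions."""
    n_pos = EXON4_TRANSCRIPT_START + offset - 1
    return n_pos - CODING_START + 1


for version, cigar in {".2": "48=1X2671=1X4851=", ".3": "48=1X7523="}.items():
    c_positions = [offset_to_c(o) for o in mismatch_offsets(cigar)]
    print(version, c_positions)
```

Under these assumptions the second mismatch in the .2 alignment lands on c.6937, which agrees with the NM_000384.2:c.6937G>A difference noted in this thread. Both versions share the first mismatch, and only .2 carries the second.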

The MANE transcript (.3) has been updated to match the genome.

NC_000002.11:g.21232803_21232804= ... NM_000384.2:c.6937G>A ... NM_000384.3:c.6936_6937=

This seems to be tripping things up for the .2 version of the transcript. Why????

Peter-J-Freeman commented 3 years ago

hg19_dna range=chr2:21232792-21232814 5'pad=0 3'pad=0 strand=+ repeatMasking=none

ATGCTCAAGAATGTCATTTATTC >
|||||||||||||||||||||||
TACGAGTTCTTACAGTAAATAAG < .3
-----------X
TACGAGTTCTTGCAGTAAATAAG < .2

Peter-J-Freeman commented 3 years ago

So, NC_000002.11:g.21232803_21232804inv is NC_000002.11:g.21232803_21232804invTG, i.e. TG>CA (NC_000002.11:g.21232803_21232804delinsCA).

So are we predicting it ought to be NM_000384.2:c.6936_6937delCGinsTG, i.e. NM_000384.2:c.6936C>T? Needs checking. However, what is VV doing?

Peter-J-Freeman commented 3 years ago

OK, VV was making a very weird mapping decision. From now on, all inversions will be forced into delins format before mapping so the sequences can be dealt with!
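A minimal sketch of what forcing an inversion into delins form could look like for this variant. The helper names are hypothetical and this is not VariantValidator's actual implementation. The .2 transcript reference bases ("CG") are taken from the delCGinsTG prediction above and the .3 bases ("CA") from the genome-matching sequence, both as assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def inversion_to_delins(ref):
    """Express an inversion of `ref` as an explicit (ref, alt) delins pair."""
    return ref, reverse_complement(ref)


def trim_shared_ends(start, ref, alt):
    """Strip bases shared at either end of ref/alt; return (start, ref, alt)."""
    while ref and alt and ref[-1] == alt[-1]:
        ref, alt = ref[:-1], alt[:-1]
    while ref and alt and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        start += 1
    return start, ref, alt


# Genomic inversion g.21232803_21232804inv on the + strand: TG -> CA.
g_ref, g_alt = inversion_to_delins("TG")

# APOB is on the minus strand, so the alternate allele seen by the
# transcript is the reverse complement of the genomic alt.
t_alt = reverse_complement(g_alt)

# Compare against each transcript version's own reference bases at c.6936_6937.
for version, t_ref in {"NM_000384.2": "CG", "NM_000384.3": "CA"}.items():
    start, ref, alt = trim_shared_ends(6936, t_ref, t_alt)
    kind = "inv" if len(ref) > 1 and alt == reverse_complement(ref) else "delins/sub"
    print(version, start, ref, alt, kind)
```

Carrying explicit sequences means the mapper compares the alternate bases with whatever the transcript actually contains, so the .2 version (which differs from the genome at c.6937) reduces to a single-base change while .3 remains a true inversion.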

Peter-J-Freeman commented 3 years ago

From the next version, VV will output the following:

{
    "NM_000384.2:c.6936C>T": {
        "alt_genomic_loci": [],
        "annotations": {
            "chromosome": "2",
            "db_xref": {
                "CCDS": "CCDS1703.1",
                "ensemblgene": null,
                "hgnc": "HGNC:603",
                "ncbigene": "338",
                "select": "RefSeq"
            },
            "ensembl_select": false,
            "mane_plus_clinical": false,
            "mane_select": false,
            "map": "2p24.1",
            "note": "apolipoprotein B",
            "refseq_select": true,
            "variant": "0"
        },
        "gene_ids": {
            "ccds_ids": [
                "CCDS1703"
            ],
            "ensembl_gene_id": "ENSG00000084674",
            "entrez_gene_id": "338",
            "hgnc_id": "HGNC:603",
            "omim_id": [
                "107730"
            ],
            "ucsc_id": "uc002red.3"
        },
        "gene_symbol": "APOB",
        "genome_context_intronic_sequence": "",
        "hgvs_lrg_transcript_variant": "",
        "hgvs_lrg_variant": "",
        "hgvs_predicted_protein_consequence": {
            "lrg_slr": "",
            "lrg_tlr": "",
            "slr": "NP_000375.2:p.(D2312=)",
            "tlr": "NP_000375.2:p.(Asp2312=)"
        },
        "hgvs_refseqgene_variant": "NG_011793.1:g.39142C>T",
        "hgvs_transcript_variant": "NM_000384.2:c.6936C>T",
        "primary_assembly_loci": {
            "grch37": {
                "hgvs_genomic_description": "NC_000002.11:g.21232804G>A",
                "vcf": {
                    "alt": "A",
                    "chr": "2",
                    "pos": "21232804",
                    "ref": "G"
                }
            },
            "grch38": {
                "hgvs_genomic_description": "NC_000002.12:g.21009932G>A",
                "vcf": {
                    "alt": "A",
                    "chr": "2",
                    "pos": "21009932",
                    "ref": "G"
                }
            },
            "hg19": {
                "hgvs_genomic_description": "NC_000002.11:g.21232804G>A",
                "vcf": {
                    "alt": "A",
                    "chr": "chr2",
                    "pos": "21232804",
                    "ref": "G"
                }
            },
            "hg38": {
                "hgvs_genomic_description": "NC_000002.12:g.21009932G>A",
                "vcf": {
                    "alt": "A",
                    "chr": "chr2",
                    "pos": "21009932",
                    "ref": "G"
                }
            }
        },
        "reference_sequence_records": {
            "protein": "https://www.ncbi.nlm.nih.gov/nuccore/NP_000375.2",
            "refseqgene": "https://www.ncbi.nlm.nih.gov/nuccore/NG_011793.1",
            "transcript": "https://www.ncbi.nlm.nih.gov/nuccore/NM_000384.2"
        },
        "refseqgene_context_intronic_sequence": "",
        "selected_assembly": "GRCh37",
        "submitted_variant": "NC_000002.11:g.21232803_21232804inv",
        "transcript_description": "Homo sapiens apolipoprotein B (APOB), mRNA",
        "validation_warnings": [
            "A more recent version of the selected reference sequence NM_000384.2 is available (NM_000384.3): NM_000384.3:c.6936C>T MUST be fully validated prior to use in reports: select_variants=NM_000384.3:c.6936C>T",
            "RefSeqGene record not available"
        ]
    },
    "NM_000384.3:c.6936_6937inv": {
        "alt_genomic_loci": [],
        "annotations": {
            "chromosome": "2",
            "db_xref": {
                "CCDS": "CCDS1703.1",
                "ensemblgene": null,
                "hgnc": "HGNC:603",
                "ncbigene": "338",
                "select": "MANE"
            },
            "ensembl_select": false,
            "mane_plus_clinical": false,
            "mane_select": true,
            "map": "2p24.1",
            "note": "apolipoprotein B",
            "refseq_select": true,
            "variant": "0"
        },
        "gene_ids": {
            "ccds_ids": [
                "CCDS1703"
            ],
            "ensembl_gene_id": "ENSG00000084674",
            "entrez_gene_id": "338",
            "hgnc_id": "HGNC:603",
            "omim_id": [
                "107730"
            ],
            "ucsc_id": "uc002red.3"
        },
        "gene_symbol": "APOB",
        "genome_context_intronic_sequence": "",
        "hgvs_lrg_transcript_variant": "",
        "hgvs_lrg_variant": "",
        "hgvs_predicted_protein_consequence": {
            "lrg_slr": "",
            "lrg_tlr": "",
            "slr": "NP_000375.3:p.(I2313V)",
            "tlr": "NP_000375.3:p.(Ile2313Val)"
        },
        "hgvs_refseqgene_variant": "",
        "hgvs_transcript_variant": "NM_000384.3:c.6936_6937inv",
        "primary_assembly_loci": {
            "grch37": {
                "hgvs_genomic_description": "NC_000002.11:g.21232803_21232804inv",
                "vcf": {
                    "alt": "CA",
                    "chr": "2",
                    "pos": "21232803",
                    "ref": "TG"
                }
            },
            "grch38": {
                "hgvs_genomic_description": "NC_000002.12:g.21009931_21009932inv",
                "vcf": {
                    "alt": "CA",
                    "chr": "2",
                    "pos": "21009931",
                    "ref": "TG"
                }
            },
            "hg19": {
                "hgvs_genomic_description": "NC_000002.11:g.21232803_21232804inv",
                "vcf": {
                    "alt": "CA",
                    "chr": "chr2",
                    "pos": "21232803",
                    "ref": "TG"
                }
            },
            "hg38": {
                "hgvs_genomic_description": "NC_000002.12:g.21009931_21009932inv",
                "vcf": {
                    "alt": "CA",
                    "chr": "chr2",
                    "pos": "21009931",
                    "ref": "TG"
                }
            }
        },
        "reference_sequence_records": {
            "protein": "https://www.ncbi.nlm.nih.gov/nuccore/NP_000375.3",
            "transcript": "https://www.ncbi.nlm.nih.gov/nuccore/NM_000384.3"
        },
        "refseqgene_context_intronic_sequence": "",
        "selected_assembly": "GRCh37",
        "submitted_variant": "NC_000002.11:g.21232803_21232804inv",
        "transcript_description": "Homo sapiens apolipoprotein B (APOB), mRNA",
        "validation_warnings": [
            "RefSeqGene record not available"
        ]
    },
    "flag": "gene_variant",
    "metadata": {
        "variantvalidator_hgvs_version": "2.0.1.dev1+gb3a18e0",
        "variantvalidator_version": "1.0.4.dev214+g9cf795f.d20210407",
        "vvdb_version": "vvdb_2021_4",
        "vvseqrepo_db": "VV_SR_2021_2/master",
        "vvta_version": "vvta_2021_2"
    }
}

Note: for the .2 transcript version we can only correct the g. to c. mapping; we cannot then map back from that c. description to the input genomic inversion.
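The effect of this note can be seen by pulling the GRCh37 pseudo-VCF out of each transcript record. The sketch below uses a hand-abbreviated copy of the output above. The helper name is hypothetical, and skipping non-record keys such as "flag" and "metadata" by checking for "hgvs_transcript_variant" is an assumption about how to walk the structure, not a documented API.

```python
# Abbreviated copy of the output shown in this thread (most fields omitted).
response = {
    "NM_000384.2:c.6936C>T": {
        "hgvs_transcript_variant": "NM_000384.2:c.6936C>T",
        "primary_assembly_loci": {
            "grch37": {"vcf": {"chr": "2", "pos": "21232804", "ref": "G", "alt": "A"}}
        },
    },
    "NM_000384.3:c.6936_6937inv": {
        "hgvs_transcript_variant": "NM_000384.3:c.6936_6937inv",
        "primary_assembly_loci": {
            "grch37": {"vcf": {"chr": "2", "pos": "21232803", "ref": "TG", "alt": "CA"}}
        },
    },
    "flag": "gene_variant",
    "metadata": {"vvta_version": "vvta_2021_2"},
}


def grch37_vcf_by_transcript(resp):
    """Map each transcript description to its GRCh37 chr:pos:ref:alt string."""
    out = {}
    for record in resp.values():
        # Skip top-level entries that are not transcript records.
        if not isinstance(record, dict) or "hgvs_transcript_variant" not in record:
            continue
        vcf = record["primary_assembly_loci"]["grch37"]["vcf"]
        out[record["hgvs_transcript_variant"]] = "{chr}:{pos}:{ref}:{alt}".format(**vcf)
    return out


for description, vcf in grch37_vcf_by_transcript(response).items():
    print(description, "->", vcf)
```

The .3 record maps back to the submitted 2:21232803:TG:CA, whereas the .2 record maps back to the single-base 2:21232804:G:A.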

Peter-J-Freeman commented 3 years ago

Test added, and all other tests pass