opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Handling complex variant ids #3335

Open prashantuniyal02 opened 3 weeks ago

prashantuniyal02 commented 3 weeks ago

The variant page in the new platform should be able to handle complex ids.

Background

Certain variants (insertions/deletions) might have an id that looks like X_111122_A_ATATATA. The last part can be of variable length because it holds the alternate allele: in this example, an A is replaced by a longer sequence.
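To make the id layout concrete, here is a minimal Python sketch that splits an id of this shape into its parts. It assumes the id is always four underscore-separated fields (chromosome, position, reference allele, alternate allele) and that chromosome names contain no underscore; `VariantId`, `parse_variant_id` and `classify` are hypothetical names for illustration, not platform code.

```python
from typing import NamedTuple


class VariantId(NamedTuple):
    """Parts of an id such as X_111122_A_ATATATA (assumed layout)."""
    chromosome: str
    position: int
    ref: str
    alt: str


def parse_variant_id(variant_id: str) -> VariantId:
    # Assumption: exactly four underscore-separated fields.
    parts = variant_id.split("_")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}")
    chromosome, position, ref, alt = parts
    return VariantId(chromosome, int(position), ref, alt)


def classify(variant: VariantId) -> str:
    # Rough classification by comparing allele lengths only.
    if len(variant.alt) > len(variant.ref):
        return "insertion"
    if len(variant.alt) < len(variant.ref):
        return "deletion"
    return "substitution"


parsed = parse_variant_id("X_111122_A_ATATATA")
print(parsed, classify(parsed))
```

Because both alleles are written out in full, the id length grows with the size of the insertion or deletion.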

Out of 1.8M total ClinVar variants:
Number of variants with length > 20: 28062
Number of variants with length > 50: 4222
Number of variants with length > 100: 2486
Number of variants with length > 200: 1855
Number of variants with length > 500: 1429
Number of variants with length > 1000: 1149
Number of variants with length > 2000: 869
Number of variants with length > 5000: 415
Number of variants with length > 10000: 6
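For reference, counts like these could be reproduced along the following lines. This is only a sketch: it assumes "length" means the character length of the full variant id string, and `count_longer_than` is a hypothetical helper, not the query that produced the numbers above.

```python
from typing import Dict, Iterable, Sequence

# Thresholds taken from the list above.
THRESHOLDS = (20, 50, 100, 200, 500, 1000, 2000, 5000, 10000)


def count_longer_than(
    variant_ids: Iterable[str],
    thresholds: Sequence[int] = THRESHOLDS,
) -> Dict[int, int]:
    # Assumption: length is measured on the whole id string.
    lengths = [len(variant_id) for variant_id in variant_ids]
    return {t: sum(length > t for length in lengths) for t in thresholds}


# Toy input, not real ClinVar data.
toy_ids = ["X_1_A_T", "X_1_A_" + "T" * 30]
for threshold, count in count_longer_than(toy_ids, (20, 50)).items():
    print(f"Number of variants with length > {threshold}: {count}")
```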

Tasks

DSuveges commented 3 weeks ago

These are the numbers from ClinVar data

Number of variants with length > 20: 28062
Number of variants with length > 50: 4222
Number of variants with length > 100: 2486
Number of variants with length > 200: 1855
Number of variants with length > 500: 1429
Number of variants with length > 1000: 1149
Number of variants with length > 2000: 869
Number of variants with length > 5000: 415
Number of variants with length > 10000: 6

Extreme example:

'X_86041956_ACCACATTCAGGCATCCTGGGAGAAGATTAAGTTAGAAATATAGGAAATCACTTGTGTTAAGGCTGGCACAAACAGATATTTGGTAAATATCAGTTTTCACAACTCCTCCTGCAATATTACTATCCCCCAAGGCTTACAGAAGTGATGCTATTCCAACATCATCATAAGTCAATGAGCTGCCAGTCAAGGTTAACTACCTTAATGTTTAAATTCACCAACATCAAGGAATGGGACTGGAACATCCCAACATCAAGGAAAGGCTACTTAAGTGACCTTTTCTGAAAAAAGCAGCCTATCATTTTTAAGAAAATGCCTTTTCTAATCAGTCACTGTATTAGGCTGTTCCTGTGTTGCCATAAAGGAATGCCTGAAACTGGGTAATTTATAAAGAAAAGAGGTTTAATTGGCTCAGGGTTCTGCAGGCTGTACAAGCATGGTACAGGCATCTGCTCGGCTTCTAGGGAGGCCTGAGGGAGCTTTTAGTCATAGCAGAAGGCAAAGGGGAAGCAGGCATCTCCCTTGGTGAGAGGCAAGGCAAGCAAGTGTGGGGAGATGCCACACACTTTTAAACAACCAGTCGACCAGGCACAGTGGCTCCCATCTGTAATCCCAGCAATTTGGGCCACCAAGGCAGGTAGATTGCTTGAGTTCAGAAGTGCAAAACCAGCCTGGGCAACATAGGCAGACCCCATCTCTACAAAAATAAAAAAAAAAAAATTAGCCAGGCATGGTGATGCATGTCTGTAATCTCAGCTACTCAGGAGTCTGAGGTGGGAGGATTGCTTGGGCCCAGGAGGTTAAGGCTGCAGTGGGCCATGATTGTGCCACTGCACTCCAACTAGGGTGACAGAGCAAGACCCTGTCTCAAAAAAAAAATTAAAAAATAAACCACCAGATCCTGCAAGAACGAATTCACTATTACAAGAAGAGTACCAAGCCATGAGGGATCCACTCCCATGACCCAAACACATCCCATCAGGCCCCACCTCCGACAACGAGGATAACATTTCAACATGAGATTTGGGCAGGACAAATATCTAAACCCTATCAGCCATTTTTATAGGAAAATTTCTGAAGTATATGAAACACTGACTTTCCTTCTTAAAAAGACTACACAGAACAGAATTTAGCCTTTTCTCAAGGACATGGTGTCATAAATAGATATATACAATGGTAATTTTAGGGCTGGTAGATCCCTACCTTGGAGATCATCTATTGAAATTTCTCACTTTACAGATGAACTAAATGATGTCACAAGAAATTCTGTAATTTTCCCAAAATCACACAATAGGATGGTCATGTGATTATCATACACCATGCTGCCACATCTACTACTGTATGGTGCTAAACACTATCTAGCAAGATAATCAGGATTTAGGGGCAGTACTCCTTATAATTTCCATGTTTATCTTTGCTATCCCTTCTCTTTCCCCATAATCTTTAGCAATAAAACTTTTTTTTTTTTTTGAGATAGGGTCTTGCTCTGTTGCCCAGGCTGAAGTGCAGTGACACAACTATAGCTCACTGCAGCCTCAAACTCCCAGTTTCAGGCAATCCTCCCACCTCAGCCTTCCAAGTAGCTGGGACTACAGGCACGTGTCACCATACCCAATTAATTTTTGTATTTTTTTTGTAGAAATACAGTTTTGCTGTGTCACCCATTTCAGCTACTTGGGAGGCTGAGGTTTCAAACTCCTGAGCTCAAGCAATCCACCTGCCTCGGCTTCCCCAAGTGCTGGGATTACAGGCCTGAGCCACTGCACCTGGCCAATGAAACTTTTTTTTTTTTTTGAGACAGTCTCACTCTGTTTCTCTGACTAGAGTGCAGTGGTGCCATCTAAGCTCACTGCAACCTCCGCCTTCCAGGATCAAGGGATGCCTGTGCCTTAGCCTATAGAGTAATGGGGATTACAGGCCCGTGCCACCATGCTTGGCTAACAATAAAACTCTTAAATCAAAAGAATATATGTGTTTATCCTATGCAGTA
GAAAAATTAAGCTAACCTACTCTGCAGCCCATCTTAGGGGAAGAACTCCAAAAGATGTGAGCAGTGTGCTTTTAAAAAGAAAAAATGCAGGCTGAATGTCTACATGGTCTTCAAAAAGAAAGAAGGATGCAATGAAAAGCCTACATTGTCTTAAGAACAGCTTTGGGGATATGACTGTTAGGAAGGCCAGAACGTTCCTGAGAATAGGCAATACAGTAAACTTCACAAGGGAAGGGACTAGGTTCTCAGTAAATGTGGGTTGAATGACCTGAAAAATGAGTATTTGTCACTACACTAAATGTTATCAGGCCTTTGTGATGTTTGATGTAAGTTTGCATTTTAGGTGTTTTTCTCTTCCCTTTCCTATACAAACTGTCATTTGTATGTCTAATAGCAACCCCACCAAACATAGTCTTTGCAGAAGTTGATAACATTCCTTGTATCTTCTAAGGGGGTGGTTCTAATATGCTGAGCTTGCTAGAATTATTTTTAAAATGAGGATGAATATGTTCTAAGTGATGATCTCAGGGATTCCCCTATGTATGAAGCATGTCAATATTATCATATAGATCTTATAACTTCCATCTTGACTGCTCTTCTGTTGAGTTCTTGGCTCTGCTGTGCTGCAGATTACACAGGCTTCCAGATAAGTCACCTCCAGTGAGATGTAATGTTACTGGGACGACTTTTAACGTTCAACCAATGGTACTTGTCTATAATCCACAAAACTGTCATTTAAATGACTTTTTAAATGGCAAAATACCCTGAAAATGACAATTCTGTACCCAAGCTGAAGAGCACCTTCTAGACCCTTACCACATATATCTATTTTTATCCTCTCCCTTGAGACTCCAGTTGGCCCTCCTTCTTTATTCCATTTTCTTCTATTTTTTCCGCAATTCCACTTTAGCTTACTCTTCTGTCTTTGGATGTTCAAGAGATATTTACTGTATTTTCTAATCAGAAGGAAATGGGCTTAACTAAGAAAATACAAAGTTTATTTTTTTCCATAATCTCTCGAAAATATGATCTTAAGGAACAATTTCCATCTTGCATTACAGAAAATGATAATGCAAGTTAATCAAGGTGAAAGGGATTATAAAATATTATCTTCCATGATTATACTTGATGAAGCAATTGTTCCTTTATGACATTTTAACTAGGACTAAATCTGCTAAAAAGTATATCTAATAGCCTATTTCTTCATGTTTCTGATGCATGCATTATTGCTTCTACTTTTATTAATACAACTGAATCTATGAAACACTGCTTTTTCTATGATTTGCTCAATGTAATACATTTAATAAAGCAAACATTTTAAAGGCTACTTTCAGCAAACTATTTCACAAGAGAAATATCCCAAGCACACTAAAAAATAAATCAAATTAAATCATGATTGGAATTGTTGGAATCATTTATTTGGGGAATGACTCTTTTATTCAATTTTTACACTTTTGATGTACCTAAAAGCAAAATTACATATCATAAATTTAAGCAATGTGCTTGATAAAACCTAGAATGTGTCACCAATAAACAATATGAAAAATTTAAGTTTTAAGATTAACTTCAAATAAGTTCTTCCCAAACACTGAAAAATAACATTGCACTCACTGGGATAAGTATAAAATGAGTATTTTTCCAAATATAAGCGATGTCACCTGATGCAATTCTCTGGCTCACTCACTGTAACTTCCTACCTCTACCATAACAGACTCCATCAGGATGGATGTCTTCCAATTCTGAACTGTACAGAAGGCAAAATATTTATGAAAAACATTAGCATCATTCATAAGAATTAACATTCATCAATTAAAATTTGAACCCAGACCTGAAAAAAAAAATTGTTGATTAAAAAAATTGTTAAAAACACCTCAAGTTCCTACACAAAAAAAGCAAGATTATAAAACTCAAGTATGGTATAACAACTATTCCAAAATGGATGCTGTTCTCAGTTTTAGTATAGACTTAAAAGAAAAAATGAATAGTCCATTGGCTCTTTC
CAGCTCTCATCCAACATCTGTTCATAGCAGCTACAACCAGGTTATTTGCAGCCATGCAGTTATTTTCAGGCTAAGAATAAACCTGTCCATGCTGTTCTGAGAACTGTTACCCAGGCATTTTTTGTTAGACAACATAAAGACCATCTATTTTACCTTCCTTATATTCAAAAACAAGGAAACTGAGGCTCAGAGAGATTTAGTGCCTGGCCCTCCCAATTCAAGCAATTCATACTAGCACTAAGCAAGACCTAAAACCAAGTTTTCTCAATTGTTTCTACGAATATGGGTCAACTCTGAGGATCAGAACCACTGACTGTTTTCCTGGGATGGAATGGGCCTTTGGAAATTTACCAGCTGTGTGATCCTGAGCAAATTACAACTTTTCCAAGTCCCCTTTTCCTCTTCTACAAAATGCAGAGAGTACCTACTACTCTAACAAGGTTGTGAGGATTATGTTCGGCTTGTAAAGTGCTTAGCACAGTGATTAGGAAATTTTAAGCATTCGAAAACAGGTAGCTATTATTTTTACTCCCAACCTGAAAACATGGCTGCTCCTAATCATGGCTGTATGAGAAGACAACGAACTGAATTCATTAACACAATGGACAGGGTCCCATCCACCAGAGCTTCTCTGGCCTACCACATTCTACAAACAGTAAAAAACCCCATCAAACAACCTATTTCATTCCATCTAAAATGATCATCCCTTACACTATTAAAAAAAATTGAAATTGCTGCCATATGTGTTCTGCTACGTGTTGTACACTGAATATGTCAACTAGAAACTGCAGAGGAGTTTCTGGTTCCACCCAAAACATGAAGAACAAACGACGGGAAAACAAGAGTTAGATAAGGTCACACACCTGAACTGGAACTCAGGACCTTATTCTCAGCACATTTAGTTATTGATGGTAGAAAGCAAGATGCTTTAAGCACCATAAAGCAGGAATAATGGCTACCTCAACAAAATAAAATTCCCTAGGTTTCTCAACAAAATAGAATCCGTTTACTTTCCCTAGTAAATTCATACGCACATGGGCAATAATTGAGCATCTTTCTTGTAAAACTGAGGATTAAGCTGTGTCTGAGCTGCGATGTATGCTAAGATCATACAAATCTATGAAAAAGTCCTATGAAATGGCGAGACAAAATGATGGGGAATATCACCTTGAAATCAATAAGGAAGGAATCATAATGTCTTGCCAAAGTCGTCTCCGTTGATAAAATCCCGACCCTCCATTTCATGACACTGCACATCCACTTCCCTCTAAGGAAATTAACAGGAATCACCGCAGCAACCAACCTCAGCAATCAGCATCGACCAGATCAACCCCTACTCAAATGGCGATAAGCACTGCGGGTCCCAGCCCCTCCCTGTAACTGCCAACTCCCGCGCTGACTCCGGACTACCTCACTTCCTTTCAGTTCTCCCTCCCACTCTCGACCCACGATGTTTGTCCCAAAACTCGCCACTGACAGAAAAAACAGAAACAAGACAGTCTTCCTAAACTTTGTCCAGGAAGCACCAGGCTACACATACCCGTCCCTATTACGATCACATCAAACTCCGAAGGGAGAGTATCCGCCATCTTGACGGGAAACGTGTCATGTGACTATTACTTGTGGAAATGAGATCAAGTTAGGCATTCCCTCCGTGGAAGGCGGAAAAGAAAACAGTGCATTCTGGGTGTTGTAGTTCTTGGGTGGAAGGTCCAGGCAGGCGACTACAGTATTATATTTAAAATGTTTACACGTAAGTTAAAAGCTCGACTCATTTTCCAGGCTGAATGCTGTAGCTCATTTAGCCTGGAGGGTTGAAGTAGGAAAGAAATGTAGATTTTTATTCCGTTAGCTAGGATGAATGTGGGAGACTTGGATACCCCTTGACAATACTGAAGCAAAATTGTTAGGAGACACGGAGAGCGGAGTGAAATTGAAGGCGGTTTGCTCTGTGAACAAAGGAGACTTTCATTCCAGCCTAAAAAACTGTTTCCCGCCAT
CGTTTCTGTACTTTCGTGTTGGCCCTCGAGGCATGGTTTTTCACACCAATACTTTCAGAATATATTCCATATCTTTTTCGTTTCCATGTCTCCTCGATGAAAACCCACAAAAAACCTCTCACCCTAGTTTCGCCAGAATGGAACTCCCACCGCGTCCGTATTTGTTCCAGTTAGTCCAGACAGCCACTCGGGCAATAAATTAGCCTTGCTTTACAGGAGGGAAAAAATATATGAAAGCATGGTTGCGCTGGATAAGCGTCTGTTGAGTCTGAATTCGATTTTCCGCACCAACTAGATAAGTTCATGTAAGATTCTCCTTAGTTTACATTTAGTTTCCAATCTATTAGTTTCCATAAGCTACCGTAGAAATAATTAAGGTAAAATGGTCTTTATATTAAAGTGATTGTATCTGATGAGGGGCTTTAGG_CCTAAAGCCCCTCATCAGATACAATCACTTTAATATAAAGACCATTTTACCTTAATTATTTCTACGGTAGCTTATGGAAACTAATAGATTGGAAACTAAATGTAAACTAAGGAGAATCTTACATGAACTTATCTAGTTGGTGCGGAAAATCGAATTCAGACTCAACAGACGCTTATCCAGCGCAACCATGCTTTCATATATTTTTTCCCTCCTGTAAAGCAAGGCTAATTTATTGCCCGAGTGGCTGTCTGGACTAACTGGAACAAATACGGACGCGGTGGGAGTTCCATTCTGGCGAAACTAGGGTGAGAGGTTTTTTGTGGGTTTTCATCGAGGAGACATGGAAACGAAAAAGATATGGAATATATTCTGAAAGTATTGGTGTGAAAAACCATGCCTCGAGGGCCAACACGAAAGTACAGAAACGATGGCGGGAAACAGTTTTTTAGGCTGGAATGAAAGTCTCCTTTGTTCACAGAGCAAACCGCCTTCAATTTCACTCCGCTCTCCGTGTCTCCTAACAATTTTGCTTCAGTATTGTCAAGGGGTATCCAAGTCTCCCACATTCATCCTAGCTAACGGAATAAAAATCTACATTTCTTTCCTACTTCAACCCTCCAGGCTAAATGAGCTACAGCATTCAGCCTGGAAAATGAGTCGAGCTTTTAACTTACGTGTAAACATTTTAAATATAATACTGTAGTCGCCTGCCTGGACCTTCCACCCAAGAACTACAACACCCAGAATGCACTGTTTTCTTTTCCGCCTTCCACGGAGGGAATGCCTAACTTGATCTCATTTCCACAAGTAATAGTCACATGACACGTTTCCCGTCAAGATGGCGGATACTCTCCCTTCGGAGTTTGATGTGATCGTAATAGGGACGGGTATGTGTAGCCTGGTGCTTCCTGGACAAAGTTTAGGAAGACTGTCTTGTTTCTGTTTTTTCTGTCAGTGGCGAGTTTTGGGACAAACATCGTGGGTCGAGAGTGGGAGGGAGAACTGAAAGGAAGTGAGGTAGTCCGGAGTCAGCGCGGGAGTTGGCAGTTACAGGGAGGGGCTGGGACCCGCAGTGCTTATCGCCATTTGAGTAGGGGTTGATCTGGTCGATGCTGATTGCTGAGGTTGGTTGCTGCGGTGATTCCTGTTAATTTCCTTAGAGGGAAGTGGATGTGCAGTGTCATGAAATGGAGGGTCGGGATTTTATCAACGGAGACGACTTTGGCAAGACATTATGATTCCTTCCTTATTGATTTCAAGGTGATATTCCCCATCATTTTGTCTCGCCATTTCATAGGACTTTTTCATAGATTTGTATGATCTTAGCATACATCGCAGCTCAGACACAGCTTAATCCTCAGTTTTACAAGAAAGATGCTCAATTATTGCCCATGTGCGTATGAATTTACTAGGGAAAGTAAACGGATTCTATTTTGTTGAGAAACCTAGGGAATTTTATTTTGTTGAGGTAGCCATTATTCCTGCTTTATGGTGCTTAAAGCATCTTGCTTTCTACCATCAATAACTAAATGTGCTGAGAATAAGGTCCTGAGTTCCAGTTCAGGTGTGTGA
CCTTATCTAACTCTTGTTTTCCCGTCGTTTGTTCTTCATGTTTTGGGTGGAACCAGAAACTCCTCTGCAGTTTCTAGTTGACATATTCAGTGTACAACACGTAGCAGAACACATATGGCAGCAATTTCAATTTTTTTTAATAGTGTAAGGGATGATCATTTTAGATGGAATGAAATAGGTTGTTTGATGGGGTTTTTTACTGTTTGTAGAATGTGGTAGGCCAGAGAAGCTCTGGTGGATGGGACCCTGTCCATTGTGTTAATGAATTCAGTTCGTTGTCTTCTCATACAGCCATGATTAGGAGCAGCCATGTTTTCAGGTTGGGAGTAAAAATAATAGCTACCTGTTTTCGAATGCTTAAAATTTCCTAATCACTGTGCTAAGCACTTTACAAGCCGAACATAATCCTCACAACCTTGTTAGAGTAGTAGGTACTCTCTGCATTTTGTAGAAGAGGAAAAGGGGACTTGGAAAAGTTGTAATTTGCTCAGGATCACACAGCTGGTAAATTTCCAAAGGCCCATTCCATCCCAGGAAAACAGTCAGTGGTTCTGATCCTCAGAGTTGACCCATATTCGTAGAAACAATTGAGAAAACTTGGTTTTAGGTCTTGCTTAGTGCTAGTATGAATTGCTTGAATTGGGAGGGCCAGGCACTAAATCTCTCTGAGCCTCAGTTTCCTTGTTTTTGAATATAAGGAAGGTAAAATAGATGGTCTTTATGTTGTCTAACAAAAAATGCCTGGGTAACAGTTCTCAGAACAGCATGGACAGGTTTATTCTTAGCCTGAAAATAACTGCATGGCTGCAAATAACCTGGTTGTAGCTGCTATGAACAGATGTTGGATGAGAGCTGGAAAGAGCCAATGGACTATTCATTTTTTCTTTTAAGTCTATACTAAAACTGAGAACAGCATCCATTTTGGAATAGTTGTTATACCATACTTGAGTTTTATAATCTTGCTTTTTTTGTGTAGGAACTTGAGGTGTTTTTAACAATTTTTTTAATCAACAATTTTTTTTTTCAGGTCTGGGTTCAAATTTTAATTGATGAATGTTAATTCTTATGAATGATGCTAATGTTTTTCATAAATATTTTGCCTTCTGTACAGTTCAGAATTGGAAGACATCCATCCTGATGGAGTCTGTTATGGTAGAGGTAGGAAGTTACAGTGAGTGAGCCAGAGAATTGCATCAGGTGACATCGCTTATATTTGGAAAAATACTCATTTTATACTTATCCCAGTGAGTGCAATGTTATTTTTCAGTGTTTGGGAAGAACTTATTTGAAGTTAATCTTAAAACTTAAATTTTTCATATTGTTTATTGGTGACACATTCTAGGTTTTATCAAGCACATTGCTTAAATTTATGATATGTAATTTTGCTTTTAGGTACATCAAAAGTGTAAAAATTGAATAAAAGAGTCATTCCCCAAATAAATGATTCCAACAATTCCAATCATGATTTAATTTGATTTATTTTTTAGTGTGCTTGGGATATTTCTCTTGTGAAATAGTTTGCTGAAAGTAGCCTTTAAAATGTTTGCTTTATTAAATGTATTACATTGAGCAAATCATAGAAAAAGCAGTGTTTCATAGATTCAGTTGTATTAATAAAAGTAGAAGCAATAATGCATGCATCAGAAACATGAAGAAATAGGCTATTAGATATACTTTTTAGCAGATTTAGTCCTAGTTAAAATGTCATAAAGGAACAATTGCTTCATCAAGTATAATCATGGAAGATAATATTTTATAATCCCTTTCACCTTGATTAACTTGCATTATCATTTTCTGTAATGCAAGATGGAAATTGTTCCTTAAGATCATATTTTCGAGAGATTATGGAAAAAAATAAACTTTGTATTTTCTTAGTTAAGCCCATTTCCTTCTGATTAGAAAATACAGTAAATATCTCTTGAACATCCAAAGACAGAAGAGTAAGCTAAAGTGGAATTGCGGAAAAAATAGAAGAAAATGGAATAAAGAAGGAGGGCCAA
CTGGAGTCTCAAGGGAGAGGATAAAAATAGATATATGTGGTAAGGGTCTAGAAGGTGCTCTTCAGCTTGGGTACAGAATTGTCATTTTCAGGGTATTTTGCCATTTAAAAAGTCATTTAAATGACAGTTTTGTGGATTATAGACAAGTACCATTGGTTGAACGTTAAAAGTCGTCCCAGTAACATTACATCTCACTGGAGGTGACTTATCTGGAAGCCTGTGTAATCTGCAGCACAGCAGAGCCAAGAACTCAACAGAAGAGCAGTCAAGATGGAAGTTATAAGATCTATATGATAATATTGACATGCTTCATACATAGGGGAATCCCTGAGATCATCACTTAGAACATATTCATCCTCATTTTAAAAATAATTCTAGCAAGCTCAGCATATTAGAACCACCCCCTTAGAAGATACAAGGAATGTTATCAACTTCTGCAAAGACTATGTTTGGTGGGGTTGCTATTAGACATACAAATGACAGTTTGTATAGGAAAGGGAAGAGAAAAACACCTAAAATGCAAACTTACATCAAACATCACAAAGGCCTGATAACATTTAGTGTAGTGACAAATACTCATTTTTCAGGTCATTCAACCCACATTTACTGAGAACCTAGTCCCTTCCCTTGTGAAGTTTACTGTATTGCCTATTCTCAGGAACGTTCTGGCCTTCCTAACAGTCATATCCCCAAAGCTGTTCTTAAGACAATGTAGGCTTTTCATTGCATCCTTCTTTCTTTTTGAAGACCATGTAGACATTCAGCCTGCATTTTTTCTTTTTAAAAGCACACTGCTCACATCTTTTGGAGTTCTTCCCCTAAGATGGGCTGCAGAGTAGGTTAGCTTAATTTTTCTACTGCATAGGATAAACACATATATTCTTTTGATTTAAGAGTTTTATTGTTAGCCAAGCATGGTGGCACGGGCCTGTAATCCCCATTACTCTATAGGCTAAGGCACAGGCATCCCTTGATCCTGGAAGGCGGAGGTTGCAGTGAGCTTAGATGGCACCACTGCACTCTAGTCAGAGAAACAGAGTGAGACTGTCTCAAAAAAAAAAAAAAGTTTCATTGGCCAGGTGCAGTGGCTCAGGCCTGTAATCCCAGCACTTGGGGAAGCCGAGGCAGGTGGATTGCTTGAGCTCAGGAGTTTGAAACCTCAGCCTCCCAAGTAGCTGAAATGGGTGACACAGCAAAACTGTATTTCTACAAAAAAAATACAAAAATTAATTGGGTATGGTGACACGTGCCTGTAGTCCCAGCTACTTGGAAGGCTGAGGTGGGAGGATTGCCTGAAACTGGGAGTTTGAGGCTGCAGTGAGCTATAGTTGTGTCACTGCACTTCAGCCTGGGCAACAGAGCAAGACCCTATCTCAAAAAAAAAAAAAAGTTTTATTGCTAAAGATTATGGGGAAAGAGAAGGGATAGCAAAGATAAACATGGAAATTATAAGGAGTACTGCCCCTAAATCCTGATTATCTTGCTAGATAGTGTTTAGCACCATACAGTAGTAGATGTGGCAGCATGGTGTATGATAATCACATGACCATCCTATTGTGTGATTTTGGGAAAATTACAGAATTTCTTGTGACATCATTTAGTTCATCTGTAAAGTGAGAAATTTCAATAGATGATCTCCAAGGTAGGGATCTACCAGCCCTAAAATTACCATTGTATATATCTATTTATGACACCATGTCCTTGAGAAAAGGCTAAATTCTGTTCTGTGTAGTCTTTTTAAGAAGGAAAGTCAGTGTTTCATATACTTCAGAAATTTTCCTATAAAAATGGCTGATAGGGTTTAGATATTTGTCCTGCCCAAATCTCATGTTGAAATGTTATCCTCGTTGTCGGAGGTGGGGCCTGATGGGATGTGTTTGGGTCATGGGAGTGGATCCCTCATGGCTTGGTACTCTTCTTGTAATAGTGAATTCGTTCTTGCAGGATCTGGTGGTTTATTTTTTAATTTTTTTTTTGAGACAGGGTCTTGCTCTGTCACC
CTAGTTGGAGTGCAGTGGCACAATCATGGCCCACTGCAGCCTTAACCTCCTGGGCCCAAGCAATCCTCCCACCTCAGACTCCTGAGTAGCTGAGATTACAGACATGCATCACCATGCCTGGCTAATTTTTTTTTTTTTATTTTTGTAGAGATGGGGTCTGCCTATGTTGCCCAGGCTGGTTTTGCACTTCTGAACTCAAGCAATCTACCTGCCTTGGTGGCCCAAATTGCTGGGATTACAGATGGGAGCCACTGTGCCTGGTCGACTGGTTGTTTAAAAGTGTGTGGCATCTCCCCACACTTGCTTGCCTTGCCTCTCACCAAGGGAGATGCCTGCTTCCCCTTTGCCTTCTGCTATGACTAAAAGCTCCCTCAGGCCTCCCTAGAAGCCGAGCAGATGCCTGTACCATGCTTGTACAGCCTGCAGAACCCTGAGCCAATTAAACCTCTTTTCTTTATAAATTACCCAGTTTCAGGCATTCCTTTATGGCAACACAGGAACAGCCTAATACAGTGACTGATTAGAAAAGGCATTTTCTTAAAAATGATAGGCTGCTTTTTTCAGAAAAGGTCACTTAAGTAGCCTTTCCTTGATGTTGGGATGTTCCAGTCCCATTCCTTGATGTTGGTGAATTTAAACATTAAGGTAGTTAACCTTGACTGGCAGCTCATTGACTTATGATGATGTTGGAATAGCATCACTTCTGTAAGCCTTGGGGGATAGTAATATTGCAGGAGGAGTTGTGAAAACTGATATTTACCAAATATCTGTTTGTGCCAGCCTTAACACAAGTGATTTCCTATATTTCTAACTTAATCTTCTCCCAGGATGCCTGAATGTGGT'

These extreme variants raise issues from the UI point of view: I assume we cannot generate such long URLs. @carcruz can you confirm? An alternative could be to use HGVS identifiers on the frontend for these extreme variants, though I'm not sure how that would work out. The HGVS identifiers are consistently present, and their length is not too long:

+--------------------+----------+--------------------+-----------+-----------+
|           variantId|var_length|       variantHgvsId|hgvs_length|variantRsId|
+--------------------+----------+--------------------+-----------+-----------+
|12_32841995_AAAGA...|     17879|NC_000012.12:g.32...|         35|       null|
|11_116830247_GAAA...|     12136|NC_000011.10:g.11...|         37|       null|
|10_87954244_AAATT...|     11677|NC_000010.11:g.87...|         35|       null|
|X_48681722_CAGCAG...|     12656|NC_000023.11:g.48...|         35|       null|
|20_50899224_CTTGC...|     19111|NC_000020.11:g.50...|         35|       null|
|X_86041956_ACCACA...|     12842|NC_000023.11:g.86...|         35|       null|
+--------------------+----------+--------------------+-----------+-----------+
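
One way the HGVS fallback could look, as a rough Python sketch rather than a proposal for the actual frontend: keep the variant id when it is short enough for a URL, otherwise switch to the HGVS id. The 200-character cut-off, the function names and the HGVS string in the example are all made up for illustration.

```python
from typing import Optional
from urllib.parse import quote

# Hypothetical cut-off; the real limit would need to be agreed on.
MAX_ID_LENGTH = 200


def url_identifier(
    variant_id: str,
    hgvs_id: Optional[str],
    max_length: int = MAX_ID_LENGTH,
) -> str:
    """Pick the identifier to place in the URL."""
    if len(variant_id) <= max_length:
        return variant_id
    if hgvs_id:
        return hgvs_id
    raise ValueError("variant id too long and no HGVS id available")


def url_path_segment(variant_id: str, hgvs_id: Optional[str]) -> str:
    # HGVS ids contain characters such as ':' that need percent-encoding.
    return quote(url_identifier(variant_id, hgvs_id), safe="")


# Illustrative values only; the HGVS id below is not a real ClinVar record.
long_id = "X_100_A_" + "ACGT" * 100
print(url_path_segment("X_111122_A_ATATATA", None))
print(url_path_segment(long_id, "NC_000023.11:g.100_200del"))
```

With this approach the page would also need to resolve an HGVS id back to the variant, which is not covered here.
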
buniello commented 3 weeks ago

Adding @carcruz to the ticket so we can discuss this in the meeting next week.