kantale / MutationInfo

Tool to retrieve location information of genetic variants
MIT License

Errors #3

Open ekartsak opened 8 years ago

ekartsak commented 8 years ago

On REF/ALT alleles

Source : VEP
rs758320086
rs72549351
rs730882170

Source : UCSC
rs3832694
rs72466463 [ has merged into rs587778873 ]
rs5030656
rs72549380
rs8175347 !!
rs3215983
rs28399445
rs267607275

Source : NC_transcript
NC_000022.11:g.42132048_42132049insTT
NC_000022.11:g.42132048_42132049insT
NC_000022.11:g.42132048_42132049delTT

ekartsak commented 8 years ago

After LiftOver : common offsets

Offset on GRCh38 -> (liftOver) Offset on GRCh37 : Variant (GRCh38) : Variant (GRCh37)
42126914 -> 42522916 : NC_000022.11:g.42126914C>G : rs149157808 C>T
42128217 -> 42524219 : NC_000022.11:g.42128217_42128218insG : rs148769737 G>T

rs148769737 appears to be biallelic in dbSNP (G>T, G>A), but our JSON file contains only G>T, as reported by UCSC.
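One quick sanity check on the lifted coordinates is to compare the GRCh37 minus GRCh38 difference across the pairs listed above. This is only a minimal sketch: the function name `liftover_deltas` is hypothetical, and liftOver is not a constant shift in general, so matching differences only suggest that the positions fall in the same aligned block.

```python
# (GRCh38 position, GRCh37 position) pairs quoted in this comment.
PAIRS = [
    (42126914, 42522916),
    (42128217, 42524219),
]


def liftover_deltas(pairs):
    """Return the GRCh37 - GRCh38 difference for each coordinate pair."""
    return [grch37 - grch38 for grch38, grch37 in pairs]


deltas = liftover_deltas(PAIRS)
print(deltas)                  # [396002, 396002]
print(len(set(deltas)) == 1)   # True: both pairs are shifted by the same amount
```

A consistent difference does not say anything about the alleles themselves; the C>G vs C>T and insG vs G>T mismatches still need to be checked by hand.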

ekartsak commented 8 years ago

Duplicated Offsets that correspond to different variants

According to GRCh37
Offset : Gene : Variant : Source
41352258 : CYP2A6 : 4074delA : BLAT
41352258 : CYP2A6 : 4071delA : BLAT
41352954 : CYP2A6 : 3378C>T : BLAT
41352954 : CYP2A6 : 657C>T : BLAT
41354553 : CYP2A6 : 459G>A : BLAT
41354553 : CYP2A6 : 1779G>A : BLAT
41354851 : CYP2A6 : 1471_1476delCTCTCT : BLAT
41354851 : CYP2A6 : 1481_1486delCTCTCT : BLAT
234676941 : UGT1A1 : 1160(CC>GT) : BLAT
234676941 : UGT1A1 : 1159(C>T) : BLAT
153762634 : G6PD : 561_563delCTC : NC_transcript
153762634 : G6PD : rs5030868 : UCSC
153764234 : G6PD : 185C->A : NC_transcript
153764234 : G6PD : 185C->T/A : NC_transcript

rs72558184 doesn't map to any assembly:
96535210 : CYP2C19 : rs72552267 : UCSC
96535210 : CYP2C9 : rs72558184 : NC_transcript

rs144041067 HGVS name: NM_000367.3:c.488G>A
18139200 : TPMT : NM_000367.2:c.488G>C : NC_transcript
18139200 : TPMT : rs144041067 : UCSC

According to cypalleles, rs4986909 is 22026C>T. Could it have been mistaken for 26206C>A?
99359670 : CYP3A4 : rs4986909 : UCSC
99359670 : CYP3A4 : 26206C>A : UCSC

The following seem to be biallelic but are reported as two different variants in the JSON file:
41350615 : CYP2A6 : 5717C>G : BLAT
41350615 : CYP2A6 : 5717C>T : BLAT

234676506 : UGT1A1 : 1007(G>A) : BLAT
234676506 : UGT1A1 : 1007(G>T) : BLAT

According to GRCh38
Offset : Gene : Variant : Source
42522660 : CYP2D6 : NC_000022.11:g.42126658_42126666dupAGTGGGCAC : NC_transcript
42522660 : CYP2D6 : NC_000022.11:g.42126658A>G : NC_transcript
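For reference, a check like the one behind these lists can be sketched as follows. This is only an illustration, not the code used by MutationInfo: it assumes the records are available as "Offset : Gene : Variant : Source" text rows, and the names `parse_row` and `duplicated_offsets` are hypothetical. The sample rows are copied from the GRCh37 list in this comment.

```python
from collections import defaultdict

# A few rows copied from this comment, in "Offset : Gene : Variant : Source" form.
ROWS = """\
41352258 : CYP2A6 : 4074delA : BLAT
41352258 : CYP2A6 : 4071delA : BLAT
41352954 : CYP2A6 : 3378C>T : BLAT
41352954 : CYP2A6 : 657C>T : BLAT
96535210 : CYP2C19 : rs72552267 : UCSC
96535210 : CYP2C9 : rs72558184 : NC_transcript
"""


def parse_row(line):
    """Split one row on ' : ' (with spaces, so HGVS names containing ':' survive)."""
    offset, gene, variant, source = [field.strip() for field in line.split(" : ")]
    return int(offset), gene, variant, source


def duplicated_offsets(rows):
    """Return {offset: [(gene, variant, source), ...]} for offsets shared by different variants."""
    by_offset = defaultdict(list)
    for line in rows.splitlines():
        if line.strip():
            offset, gene, variant, source = parse_row(line)
            by_offset[offset].append((gene, variant, source))
    return {
        offset: records
        for offset, records in by_offset.items()
        if len({variant for _, variant, _ in records}) > 1
    }


duplicates = duplicated_offsets(ROWS)
for offset, records in sorted(duplicates.items()):
    print(offset, records)
```

A shared offset is not necessarily an error: it can mean the same variant under two numbering schemes, two alternate alleles at one site, or a real mapping problem, so each hit still needs manual review.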

ekartsak commented 8 years ago

Update the dbSNP version (currently 142). From 08 January 2016: dbSNP 144 is available for hg19 and hg38.

The following variants appear to have 3 alternative alleles instead of 2 (errors probably due to the dbSNP version):
rs55886062 rs55752064 rs55640102 rs7900194 rs56165452 rs55901008 rs56387224 rs56199088 rs56240201 rs56107638 rs1799966 rs4986850 rs4986891 rs35303484 rs45482602 rs55989760 rs55771538 rs72549387 rs5030865 rs55918055 rs34059508 rs4986909 rs4986908 rs55951658 rs137852328

kantale commented 8 years ago

Insertion: rs1799752. FIXED: 0a3a797

kantale commented 8 years ago

41352258 : CYP2A6 : 4074delA : BLAT
41352258 : CYP2A6 : 4071delA : BLAT

These are the same.
http://www.cypalleles.ki.se/cyp2a6.htm
https://mutalyzer.nl/name-checker?description=NG_008377.1%3Ag.9092delA

41352954 : CYP2A6 : 3378C>T : BLAT
41352954 : CYP2A6 : 657C>T : BLAT

NG_008377.1:g.8399C>T
NG_008377.1:c.657C>T

41354553 : CYP2A6 : 459G>A : BLAT
41354553 : CYP2A6 : 1779G>A : BLAT

NG_008377.1:c.459G>A
NG_008377.1:g.6800G>A

41354851 : CYP2A6 : 1471_1476delCTCTCT : BLAT
41354851 : CYP2A6 : 1481_1486delCTCTCT : BLAT

NG_008377.1:g.6502_6507delCTCTCT
NG_008377.1:g.6502_6507delCTCTCT
FIXED: 9d3299c