Illumina / ExpansionHunter

A tool for estimating repeat sizes

To add gene ARX to variant catalog #112

Closed QuanLG closed 3 years ago

QuanLG commented 3 years ago

Hi, I want to add the gene ARX to the variant catalog. The repeat regions are chrX:25031766-25031814 (repeat unit NGC) and chrX:25031646-25031682 (repeat unit NGC), but there is a long sequence between the two repeat regions (long sequence: ccctgcgccgtccggccgttccccgggccgcgcggTTGGCGGTGGCGGCGGAGGGGCCTCCCCGCGTGGACccgccgtggccgt). Could you help me add the gene? Thanks in advance.

dnil commented 3 years ago

I will let the maintainers answer, but if you wish you could look at the config we use in Stockholm: https://github.com/moonso/stranger/blob/master/stranger/resources/variant_catalog_grch37.json

I decided to make them two separate entries for now since, as you say, they are far apart, and they also give rise to different disorders with different pathological size ranges.

egor-dolzhenko commented 3 years ago

Great question @QuanLG. Thank you for providing the config @dnil! We can run some tests on our end to see which definition works best. @yjqiu, would you be interested in helping with this?

yjqiu commented 3 years ago

@dnil Thanks for sharing the annotation! We have tested the annotation of the ARX repeats and it looks great. @QuanLG you can use the annotation from stranger now, and we will work to add it to our catalog later.

QuanLG commented 3 years ago

OK, thanks all

serge2016 commented 3 years ago

Hello! Any progress here?

yjqiu commented 3 years ago

@serge2016 Thanks for the interest. As I mentioned above, we have tested the annotation provided by @dnil from Stranger and it looks good.

To be more specific, you can encode the two repeats either as a single repeat with two units or as two separate repeats. We tested both versions and both perform well.

Annotation as a single repeat in hg38

{
    "LocusId": "ARX",
    "LocusStructure": "(NGC)CCCTGCGCCGTCCGGCCGTTCCCCGGGCCGCGCGGTTGGCGGTGGCGGCGGAGGGGCCTCCCCGCGTGGACCCGCCGTGGCCGTG(NGC)",
    "ReferenceRegion": [
        "chrX:25013530-25013565",
        "chrX:25013650-25013697"
    ],
    "VariantId": [
        "ARX_PRTS",
        "ARX_EIEE"
    ],
    "VariantType": [
        "Repeat",
        "Repeat"
    ]
}
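As a quick sanity check on a multi-repeat entry like the one above, the number of parenthesised units in LocusStructure should line up with the number of entries in ReferenceRegion, VariantId and VariantType. The following is only a minimal Python sketch of that check; the helper names `count_repeat_units` and `check_entry` are hypothetical and not part of ExpansionHunter.

```python
import re

# The single-locus ARX entry from this comment, copied as a Python dict.
arx = {
    "LocusId": "ARX",
    "LocusStructure": "(NGC)CCCTGCGCCGTCCGGCCGTTCCCCGGGCCGCGCGGTTGGCGGTGGCGGCGGAGGGGCCTCCCCGCGTGGACCCGCCGTGGCCGTG(NGC)",
    "ReferenceRegion": ["chrX:25013530-25013565", "chrX:25013650-25013697"],
    "VariantId": ["ARX_PRTS", "ARX_EIEE"],
    "VariantType": ["Repeat", "Repeat"],
}


def count_repeat_units(locus_structure):
    """Count parenthesised repeat units such as (NGC) in a LocusStructure string."""
    return len(re.findall(r"\([A-Z]+\)", locus_structure))


def check_entry(entry):
    """Raise ValueError if the per-variant fields disagree with LocusStructure.

    Hypothetical helper: fields given as a plain string count as one item,
    and fields missing from the entry are skipped.
    """
    n_units = count_repeat_units(entry["LocusStructure"])
    for key in ("ReferenceRegion", "VariantId", "VariantType"):
        if key not in entry:
            continue
        value = entry[key]
        n_items = len(value) if isinstance(value, list) else 1
        if n_items != n_units:
            raise ValueError(
                f"{entry['LocusId']}: {key} has {n_items} item(s), "
                f"but LocusStructure has {n_units} repeat unit(s)"
            )
    return n_units


n = check_entry(arx)
print(f"{arx['LocusId']}: {n} repeat units, per-variant fields are consistent")
```

The same check also accepts the single-repeat form, where ReferenceRegion and VariantType are plain strings rather than lists.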

Annotation as two repeats in hg38 (provided by @dnil in hg19 and lifted over to hg38)

{
    "VariantType": "Repeat",
    "LocusId": "ARX_EIEE",
    "HGNCId": 18060,
    "InheritanceMode": "XR",
    "DisplayRU": "GCN",
    "SourceDisplay": "GeneReviews Internet 2019-11-07",
    "Source": "GeneReviews",
    "SourceId": "NBK535148",
    "LocusStructure": "(NGC)",
    "ReferenceRegion": "chrX:25013650-25013697",
    "Disease": "EIEE",
    "NormalMax": 16,
    "PathologicMin": 17
},
{
    "VariantType": "Repeat",
    "LocusId": "ARX_PRTS",
    "HGNCId": 18060,
    "InheritanceMode": "XR",
    "DisplayRU": "GCN",
    "SourceDisplay": "GeneReviews Internet 2019-11-07",
    "Source": "GeneReviews",
    "SourceId": "NBK535148",
    "LocusStructure": "(NGC)",
    "ReferenceRegion": "chrX:25013530-25013565",
    "Disease": "PRTS",
    "NormalMax": 12,
    "PathologicMin": 20
}
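For anyone who wants to use the two-entry form before it lands in the official catalog, one possible way to add it to a local copy is sketched below in Python. It assumes the variant catalog is a JSON array of locus objects; the helper name `add_entries` is hypothetical, and only a subset of the fields above is repeated to keep the example short.

```python
import json


def add_entries(catalog, new_entries):
    """Return a new catalog list with new_entries appended.

    Hypothetical helper: raises ValueError if a LocusId is already present,
    so that an existing definition is never silently duplicated.
    """
    seen = {entry["LocusId"] for entry in catalog}
    merged = list(catalog)
    for entry in new_entries:
        locus_id = entry["LocusId"]
        if locus_id in seen:
            raise ValueError(f"LocusId already in catalog: {locus_id}")
        seen.add(locus_id)
        merged.append(entry)
    return merged


# Core fields of the two ARX entries from this comment.
arx_entries = [
    {
        "VariantType": "Repeat",
        "LocusId": "ARX_EIEE",
        "LocusStructure": "(NGC)",
        "ReferenceRegion": "chrX:25013650-25013697",
    },
    {
        "VariantType": "Repeat",
        "LocusId": "ARX_PRTS",
        "LocusStructure": "(NGC)",
        "ReferenceRegion": "chrX:25013530-25013565",
    },
]

# An empty list stands in for an existing catalog; in practice it would be
# read with json.load() from a local catalog file.
existing_catalog = []
merged = add_entries(existing_catalog, arx_entries)
print(json.dumps(merged, indent=4))
```

The merged list can then be written back with `json.dump` and passed to ExpansionHunter as the variant catalog.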

serge2016 commented 3 years ago

@yjqiu Thank you