Illumina / ExpansionHunter

A tool for estimating repeat sizes

Request to add SAMD12 (Ishiura et al 2018) #97

Open rcb2011 opened 4 years ago

rcb2011 commented 4 years ago

Hi, I am planning to analyze the complex repeats associated with FAME. Could you please help me create the JSON file for the SAMD12 gene, which contains the TTTTA and TTTCA repeats? Thanks.

egor-dolzhenko commented 4 years ago

Thank you for the request! We are currently working on assessing our ability to call these types of expansions. Do you by any chance have samples that contain FAME expansions? If yes, would you be interested in a collaboration?

If this would be more convenient, please feel free to reach out by email: edolzhenko@illumina.com

Best wishes, Egor

fjmuzengyiheng commented 3 years ago

Hi Egor, I have FAME1 WGS data at hand from a sample carrying the SAMD12 expansion. As far as I know, the expanded form of SAMD12 is a (TGAAA)n insertion between the AluSq2 tail and (TAAAA)n, so I added it to the GRCh38 catalog JSON file as below:

{   
    "VariantType": "Repeat",
    "LocusId": "SAMD12",
    "LocusStructure": "(TGAAA)*(TAAAA)*",
    "ReferenceRegion": [
        "8:118366815-118366920",
        "8:118366820-118366918"
    ],  
    "VariantId": [
        "SAMD12_TGAAA",
        "SAMD12_TAAAA"
    ],  
    "VariantType": [
        "Repeat",
        "Repeat"  
    ]   
}   

Could you please check whether I have written it correctly? Thank you so much!
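
Editor's sketch: a few structural properties of an entry like the one above can be checked mechanically before running anything. The Python below is only an illustration, not ExpansionHunter's own validation. The helper names (`parse_region`, `check_locus`) are hypothetical, and the checks rest on two assumptions: that each quantified unit in `LocusStructure` should have one matching item in `ReferenceRegion`, `VariantId` and `VariantType`, and that consecutive units are expected to occupy non-overlapping reference intervals.

```python
import json
import re

# The entry as posted in this thread, kept verbatim for the check.
ENTRY = """
{
    "VariantType": "Repeat",
    "LocusId": "SAMD12",
    "LocusStructure": "(TGAAA)*(TAAAA)*",
    "ReferenceRegion": ["8:118366815-118366920", "8:118366820-118366918"],
    "VariantId": ["SAMD12_TGAAA", "SAMD12_TAAAA"],
    "VariantType": ["Repeat", "Repeat"]
}
"""


def parse_region(region):
    """Split 'chrom:start-end' into (chrom, start, end)."""
    chrom, span = region.rsplit(":", 1)
    start, end = span.split("-")
    return chrom, int(start), int(end)


def as_list(value):
    """Treat a single value and a list of values uniformly."""
    return value if isinstance(value, list) else [value]


def check_locus(text):
    """Return a list of human-readable problems found in one catalog entry."""
    problems = []

    def record_pairs(pairs):
        # json.loads silently keeps the last value of a repeated key,
        # so repeated keys are reported here before the dict is built.
        seen = set()
        for key, _ in pairs:
            if key in seen:
                problems.append(f"duplicate key: {key}")
            seen.add(key)
        return dict(pairs)

    entry = json.loads(text, object_pairs_hook=record_pairs)

    # Quantified units such as (TGAAA)* in LocusStructure (assumed syntax).
    units = re.findall(r"\(([ACGTN]+)\)[*+?]", entry.get("LocusStructure", ""))
    for field in ("ReferenceRegion", "VariantId", "VariantType"):
        if field not in entry:
            problems.append(f"missing field: {field}")
            continue
        count = len(as_list(entry[field]))
        if count != len(units):
            problems.append(
                f"{field} has {count} item(s) for {len(units)} repeat unit(s)"
            )

    # Assumption: neighbouring repeat units should not overlap on the reference.
    parsed = [parse_region(r) for r in as_list(entry.get("ReferenceRegion", []))]
    for (c1, s1, e1), (c2, s2, e2) in zip(parsed, parsed[1:]):
        if c1 == c2 and s2 < e1:
            problems.append(f"regions overlap: {c1}:{s1}-{e1} and {c2}:{s2}-{e2}")
    return problems


if __name__ == "__main__":
    for problem in check_locus(ENTRY):
        print(problem)
```

For the entry as posted, this reports the repeated `VariantType` key and the overlap between the two reference regions. Whether either matters in practice is for the maintainers to say; the sketch only points at things worth a second look.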

katerinaoleynikova commented 3 years ago

Hey!

As far as I understand, @dnil has made a tool that specializes in annotating EH results, and this locus is already defined there: https://github.com/moonso/stranger/blob/master/stranger/resources/variant_catalog_hg38.json
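
Editor's sketch: once a catalog file like the one linked above has been downloaded, a single locus can be pulled out with a few lines of Python. This assumes the catalog is a JSON array of objects, each carrying a `LocusId` field. The helper name `find_locus` and the two sample entries are hypothetical placeholders, not real loci or coordinates.

```python
import json


def find_locus(catalog, locus_id):
    """Return the first catalog entry whose LocusId matches, or None."""
    for entry in catalog:
        if entry.get("LocusId") == locus_id:
            return entry
    return None


# Tiny in-memory stand-in for a catalog file; all values are placeholders.
SAMPLE_CATALOG = json.loads("""
[
  {"LocusId": "LOCUS_A", "LocusStructure": "(CAG)*",
   "ReferenceRegion": "1:100-130", "VariantType": "Repeat"},
  {"LocusId": "LOCUS_B", "LocusStructure": "(GGC)*",
   "ReferenceRegion": "2:200-230", "VariantType": "Repeat"}
]
""")

if __name__ == "__main__":
    print(find_locus(SAMPLE_CATALOG, "LOCUS_B"))
```

With a real file, the catalog would be loaded with `json.load` from the downloaded path and queried as `find_locus(catalog, "SAMD12")`. A `None` result would mean that the locus is not in that copy of the catalog.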

dnil commented 3 years ago

Feel free to use the Stranger annotations! They include some genes that have not yet been tested by Egor and co. However, do note that the Illumina team does far more evaluation of the loci they include. Consider the Stranger ones candidates, with some extra annotation for disease status, literature references and so on. Stranger tries to update when there is news from ExpansionHunter! 😊

egor-dolzhenko commented 3 years ago

@fjmuzengyiheng: Thanks for the question! Could you please visualize alignments of reads to this repeat with REViewer? This will tell us if this repeat definition is appropriate for your sample. Also, using Stranger annotations is a great idea, as Daniel (@dnil) suggested.

I suspect it would be difficult to create a catalog entry for this repeat that would work for everyone because the sequence composition of this repeat can differ so much from person to person. Perhaps @mfbennett could comment on this?

egor-dolzhenko commented 3 years ago

We will likely need to make some code changes to add proper support for this repeat. @fjmuzengyiheng, would you be interested in contributing to this effort? The WGS data with the SAMD12 expansion that you have would be extremely useful. If you are interested, perhaps we could discuss the possibility of data sharing by email (edolzhenko@illumina.com)?

fjmuzengyiheng commented 3 years ago

Thank you so much for this useful information. I will try it right now. :-)

fjmuzengyiheng commented 3 years ago

Thank you so much for this tool. I will try it right now. :-)