Illumina / ExpansionHunterDenovo

A suite of tools for detecting expansions of short tandem repeats

Motif length variation #34

Open kumara3 opened 4 years ago

kumara3 commented 4 years ago

Hello,

Thank you for the whole series of ExpansionHunter tools. Great work!

I have a question about motif length. According to the methods section of this publication, motifs of 2-20 bp are searched across the genome. Did you also try searching for motifs longer than 20 bp? How computationally intensive would such a search be?

Please let me know.

Regards, Ashwani

egor-dolzhenko commented 4 years ago

Thanks for the question! Detecting longer motifs tends to be more error-prone. A read can span only a few copies of a motif longer than 20 bp, so a small number of impurities in the repeat sequence can cause the motif to be misidentified. We are continuing to work on improving the motif identification algorithm, so future versions of EHdn might be able to handle longer motifs.

Best wishes, Egor
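
To make the point above concrete, here is a rough back-of-the-envelope sketch in Python. It is not EHdn's algorithm. The 150 bp read length, the function names, and the worst-case assumption that every impurity lands in a different motif copy are all illustrative assumptions, not details from this thread.

```python
def full_copies(read_len: int, motif_len: int) -> int:
    """Complete motif copies that fit in a read consisting entirely of repeat."""
    if motif_len <= 0:
        raise ValueError("motif_len must be positive")
    return read_len // motif_len


def intact_fraction(read_len: int, motif_len: int, impurities: int) -> float:
    """Worst-case fraction of copies left unaltered.

    Assumes (hypothetically) that each impurity hits a different copy.
    """
    copies = full_copies(read_len, motif_len)
    if copies == 0:
        return 0.0
    return max(copies - impurities, 0) / copies


READ_LEN = 150  # assumed read length, for illustration only

for motif_len in (3, 20, 30, 50):
    copies = full_copies(READ_LEN, motif_len)
    frac = intact_fraction(READ_LEN, motif_len, impurities=2)
    print(f"motif {motif_len:>2} bp: {copies:>2} copies, "
          f"{frac:.2f} intact with 2 impurities")
```

Under these assumptions, a 3 bp motif yields 50 copies per read and two impurities leave 96% of them intact, while a 30 bp motif yields only 5 copies and the same two impurities can leave just 60% intact, which gives far less evidence for deciding what the motif is.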

kumara3 commented 4 years ago

Thank you!