giesselmann / STRique

Nanopore raw signal repeat detection pipeline
MIT License

Distinguish between AAAAT and GAAAT #20

Open TheresaLueth opened 4 years ago

TheresaLueth commented 4 years ago

Hello,

We are using your tool to investigate a repeat expansion with the motif AAAAT. We suspected that a mutation might have changed the motif to GAAAT in some reads. Looking at the alignment, we saw that only 3% of the reads have GAAAT. When we counted the repeats with STRique using GAAAT and AAAAT as the motif in two separate config files, the results were the same. Even the number of evaluated reads with GAAAT as the motif was the same, although only a small fraction of the reads should contain this motif. Is it possible that STRique doesn't distinguish between AAAAT and GAAAT?

Thank you for your help!

Best wishes, Theresa

giesselmann commented 4 years ago

Hey,

Yes! The count is based on the best match of sequence to signal. You could simulate the signal for both motifs from the pore model in the models folder; I assume the squiggles would look very similar. I have ideas for an option to test different repeat sequences: the output would be the count plus something like the methylation string (0010011...) for the configured alternatives. This doesn't have a time frame yet, though.

Pay
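
The suggestion above, simulating the expected signal for both motifs from a pore model, can be sketched roughly as follows. This is only an illustration and not code from STRique: the k-mer length of 6, the `toy_model` table with its arbitrary current levels, and the helper names `simulate_levels` and `mean_abs_difference` are all assumptions made for the example. To get a meaningful comparison, the placeholder table would need to be replaced with the k-mer levels from the actual model file in the models folder.

```python
import hashlib
from itertools import product

K = 6  # assumed k-mer length; check the pore model you actually use


def toy_model(k=K):
    """Placeholder k-mer -> mean current table (NOT a real pore model).

    Levels are derived from a hash so the example is deterministic;
    swap this for the values read from the real model file.
    """
    model = {}
    for kmer in map("".join, product("ACGT", repeat=k)):
        digest = hashlib.md5(kmer.encode()).digest()
        model[kmer] = 60.0 + digest[0] / 255.0 * 60.0  # arbitrary 60-120 range
    return model


def simulate_levels(sequence, model, k=K):
    """Expected mean level for every k-mer along the sequence."""
    return [model[sequence[i:i + k]] for i in range(len(sequence) - k + 1)]


def mean_abs_difference(a, b):
    """Average absolute difference between two equally long level traces."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


model = toy_model()
ref = simulate_levels("AAAAT" * 10, model)  # configured motif
alt = simulate_levels("GAAAT" * 10, model)  # suspected variant motif

# A small value relative to the noise in the raw signal would suggest
# the two motifs are hard to tell apart at the signal level.
print(mean_abs_difference(ref, alt))
```

With a real model loaded, the same comparison could be repeated for a repeat tract in which only a few units carry GAAAT, which is closer to the situation described in the question.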

TheresaLueth commented 4 years ago

Good to know. Thank you for your quick answer!

Best wishes, Theresa