Closed Jesson-mark closed 2 years ago
Thanks for reporting, @Jesson-mark. One thing I added in v1.2 is a more stringent check that tandem repeats actually occupy most of the "novel" sequence (the expansion sequence if you're doing a genome scan, or the sequence sandwiched between the 2 coordinates provided in genotyping mode). The reason for doing this is to separate retrotransposon insertions (which sometimes consist of tandem repeats flanked by non-repeat sequences) from true repeat expansions. I suspect the two reads that got filtered out have some stretches of sequence, usually near the ends, that are not repeats. You can check the sub-sequences of the 2 reads using the start coordinate and size information from the old result to see if this is the case. I'm more than happy to debug this too if you send me the data (like a BAM file of just the locus in question).
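For anyone wanting to do that manual check, here is a minimal sketch of measuring how much of a read's sub-sequence is made up of tandem copies of the motif. It is not straglr's actual filter: the function names `repeat_fraction` and `non_repeat_flanks` are hypothetical, the read and coordinates are invented, the start coordinate is assumed to be 0-based, and only exact motif copies are matched, so on noisy long reads it will underestimate the repeat content.

```python
import re


def _run_pattern(motif: str, min_copies: int) -> "re.Pattern[str]":
    # Matches runs of at least `min_copies` exact, back-to-back copies of the motif.
    return re.compile(f"(?:{re.escape(motif.upper())}){{{min_copies},}}")


def repeat_fraction(seq: str, motif: str, min_copies: int = 2) -> float:
    """Fraction of `seq` covered by exact tandem runs of `motif`."""
    if not seq:
        return 0.0
    runs = _run_pattern(motif, min_copies).finditer(seq.upper())
    covered = sum(m.end() - m.start() for m in runs)
    return covered / len(seq)


def non_repeat_flanks(seq: str, motif: str, min_copies: int = 2) -> tuple:
    """Lengths of the non-repeat stretches before the first and after the last run."""
    runs = list(_run_pattern(motif, min_copies).finditer(seq.upper()))
    if not runs:
        return (len(seq), 0)
    return (runs[0].start(), len(seq) - runs[-1].end())


# Invented read: 10 copies of CGG flanked by 6 non-repeat bases on each side.
read = "ACGTAC" + "CGG" * 10 + "TTAGCA"

# Hypothetical start coordinate (assumed 0-based) and size, as if taken
# from the per-read output of the old run.
start, size = 6, 30
sub = read[start:start + size]

print(repeat_fraction(sub, "CGG"))     # repeat content of the reported sub-sequence
print(repeat_fraction(read, "CGG"))    # repeat content of the whole read
print(non_repeat_flanks(read, "CGG"))  # non-repeat bases at each end
```

A sub-sequence with long non-repeat stretches at either end would show a low fraction and large flank values, which is the pattern described above. If the read aligns to the reverse strand, the motif would need to be reverse-complemented (CCG for CGG) before checking.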
Sorry for the delayed response; I'm still away from work. Thanks again for reporting this. As quite a few things have changed, I want to see the effect on others' data before I officially tag it as a new version.
Thanks for your reply! Sorry for bothering you when you are away from work.
I will follow your suggestion to check what is happening to those 2 reads. If there is any progress or problem I will let you know.
Best wishes!
Hi, I'm now using straglr to analyze a tandem repeat. I previously installed straglr version 1.1.1, and I have now downloaded the newest source code in zip format. The newest version is 1.2.0. I ran both versions on my data and got different results. My motif is CGG and the reference copy number is 11. The parameters I specified are --max_str_len 50 --min_str_len 2; all others are default. Specifically, the old version of straglr found 7 reads whose motif copy numbers are as below:
And the motif number of the clustered alleles is 894.8(2);169.5(5). While these numbers are correct according to my manual inspection, the clustering result is not ideal.
The newer version of straglr found 8 reads whose motif copy numbers are:
And the motif number of the clustered allele is 16.2(7). You can see that the reads whose copy numbers are 946.7 and 843.0 are not reported by the newer straglr, and that the newer straglr found some new reads that the older straglr didn't find.
I don't understand why there is such a difference. Could you explain it?
Thanks!