mcfrith / tandem-genotypes

different repeats at same location #6

Closed by ghost 4 years ago

ghost commented 4 years ago

Hello, I noticed that the copy number counts are exactly the same for two different repeats at the same location. For instance, if two repeat types, acgttt and ccgtt, are annotated at the same location, the two output lines are identical. How is the number of repeats calculated? Is an exact match of the repeat sequence required?

mcfrith commented 4 years ago

Hello! Exact match is not required, because it tries to allow for high rates of sequencing error. The predicted copy number counts depend on the length of the repeat-unit, but not on its sequence. The full details are in the paper mentioned at the end of the README.
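As a rough illustration of why two repeats at the same location can give identical counts, here is a minimal sketch. It is not the tool's actual algorithm (that is described in the paper); it only assumes that a copy-number change is estimated by dividing a length change in bases by the repeat-unit length. The function name `copy_number_change` and the example units are hypothetical, chosen so that both units have the same length.

```python
def copy_number_change(length_change_bp: int, unit: str) -> int:
    """Illustrative only: nearest whole number of repeat units gained or lost.

    Assumption: the estimate is the length change divided by the unit length.
    This is a simplification, not the tandem-genotypes implementation.
    """
    # Only len(unit) is used; the bases in `unit` never enter the calculation.
    return round(length_change_bp / len(unit))


# Two hypothetical units of equal length but different sequence.
for unit in ("ACGTTT", "CCGTTT"):
    print(unit, copy_number_change(18, unit))
```

Under this assumption, any two units of equal length give the same result for the same read, whatever their sequences are.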

ghost commented 4 years ago

If I understand correctly, it cannot tell apart two types of repeat at the same locus. Thank you.

mcfrith commented 4 years ago

If you'd like to examine the type of repeat (i.e. its sequence) in your data, tandem-genotypes-merge may be useful.