Closed: apredeus closed this issue 3 years ago
Hello,
The HipSTR reference contains more imperfect repeats, because HipSTR can genotype those as well. The GangSTR reference was further refined to include only perfect repeats (no interruptions between copies of the motif and no mutations inside the repeat). Both reference sets also apply other filters to weed out complex, error-prone regions, and those filters are not necessarily the same. In addition, the GangSTR reference includes longer motifs (up to 20 bp), whereas the HipSTR reference only includes motifs up to 6 bp.
They both originate from trf outputs, but the filtering steps are very different. Please let me know if you have any other questions. Best, Nima
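To make the perfect/imperfect distinction above concrete, here is a minimal Python sketch. It is only an illustration, not the actual filtering code behind either reference set: the function names `is_perfect_repeat` and `passes_motif_limit` are hypothetical, and allowing a partial final copy of the motif is an assumption.

```python
def is_perfect_repeat(seq: str, motif: str) -> bool:
    """Return True if seq is made of uninterrupted copies of motif.

    Assumption: a trailing partial copy of the motif is tolerated.
    The real reference pipelines may use different criteria.
    """
    if not seq or not motif:
        return False
    seq, motif = seq.upper(), motif.upper()
    copies = -(-len(seq) // len(motif))  # ceiling division
    # Build the ideal repeat tract and compare it base by base.
    return (motif * copies)[:len(seq)] == seq


def passes_motif_limit(motif: str, max_len: int) -> bool:
    """Return True if the motif length is within max_len bp.

    Per the reply above, max_len would be 20 for the GangSTR
    reference and 6 for the HipSTR reference.
    """
    return 1 <= len(motif) <= max_len


print(is_perfect_repeat("CAGCAGCAG", "CAG"))  # perfect tract
print(is_perfect_repeat("CAGCAACAG", "CAG"))  # one mutated copy
```

A tract with a single substitution fails the perfect-repeat check, so a filter of this kind would drop it from a perfect-only set while an imperfect-tolerant set could keep it.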
Thank you very much - very clear and informative!
Hello,
I'm trying several tools mentioned in trtools, and am curious about the following observation:
GangSTR lists 829,231 repeats spanning ~12 Mb;
HipSTR lists 1,620,030 repeats spanning ~41 Mb.
I think they are supposed to be generated using the same algorithm (trf). Do you know why there is such a difference?
Thank you in advance!
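For anyone wanting to reproduce counts like these, here is a minimal Python sketch that tallies the number of records and the total bases spanned in a BED-like reference file. It assumes the chromosome, start, and end are the first three whitespace-separated columns. The name `summarize_bed` and the `one_based_inclusive` switch are hypothetical, and the right coordinate convention for each tool's file should be checked against its documentation.

```python
from typing import Iterable, Tuple


def summarize_bed(lines: Iterable[str],
                  one_based_inclusive: bool = False) -> Tuple[int, int]:
    """Return (record count, total bases spanned) for BED-like lines.

    Assumption: columns 2 and 3 hold start and end. By default the
    interval is treated as half-open (span = end - start); set
    one_based_inclusive=True if both coordinates are inclusive.
    Overlapping records are not merged, so shared bases count twice.
    """
    n_records = 0
    total_bp = 0
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and UCSC-style header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        start, end = int(fields[1]), int(fields[2])
        total_bp += end - start + (1 if one_based_inclusive else 0)
        n_records += 1
    return n_records, total_bp


example = [
    "# toy records, not taken from either reference",
    "chr1\t10\t20\t2\tAC",
    "chr1\t30\t45\t3\tCAG",
]
print(summarize_bed(example))
```

On a real file, pass the open file handle instead of a list, for example `with open(path) as fh: summarize_bed(fh)`, where `path` points to whichever reference BED you downloaded.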