Why is detecting and genotyping Short Tandem Repeats (STRs) challenging?

bcgsc / straglr

Tandem repeat expansion detection or genotyping from long-read alignments

Hi, thank you for developing the excellent Straglr tool. I've read your paper, "Straglr: discovering and genotyping tandem repeat expansions using whole genome long-read sequences", but I still have the following questions:

1. Straglr requires the `--loci` parameter: a BED file containing the loci to be genotyped. Since the structure and location of each motif on the reference genome are already known, what makes it challenging to detect the motifs and their repeat counts in the reads?
2. How can the performance of different STR detection and genotyping tools be compared? Is there a gold-standard dataset available for evaluating their sensitivity and precision?
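
To make the first question concrete, here is a minimal Python sketch of the naive approach I have in mind: counting back-to-back exact copies of a known motif in a read. This is not Straglr's algorithm; the helper `longest_exact_run`, the flanking sequences, and the toy reads are all hypothetical and made up for illustration. It suggests that a single substitution inside the repeat, of the kind a sequencing error could introduce, is already enough to split the run and change the exact-match count.

```python
import re


def longest_exact_run(read: str, motif: str) -> int:
    """Return the largest number of back-to-back exact copies of `motif` in `read`."""
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(read)]
    return max(runs, default=0)


# Hypothetical flanking sequences and toy reads (not real data).
flank_left, flank_right = "ACGTTGCA", "TTGACCGT"

# A read carrying 10 perfect copies of the CAG motif.
clean_read = flank_left + "CAG" * 10 + flank_right

# The same allele, but the 5th copy has one substitution (CAG -> CTG).
noisy_read = flank_left + "CAG" * 4 + "CTG" + "CAG" * 5 + flank_right

clean_count = longest_exact_run(clean_read, "CAG")
noisy_count = longest_exact_run(noisy_read, "CAG")

print(clean_count, noisy_count)
```

Both toy reads come from the same 10-copy allele, yet exact matching reports different counts for them, so I assume a real tool needs some error-tolerant way of delimiting and sizing the repeat. Is that the main difficulty, or are there others?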

Why is detecting and genotyping Short Tandem Repeats (STRs) challenging? #29