Closed by themkdemiiir 8 months ago
Hi @themkdemiiir. I don't think there's currently a method that can fully solve this for short-read data. The call_non_ref_motifs.py script should be able to detect the main motifs present, but the ExpansionHunter allele size estimates will almost certainly be less accurate than for regular (single-motif) STR loci. You may also want to try running STRling to see if it gives you better results for this locus: https://github.com/quinlan-lab/STRling
There is also a problem I noticed in ExpansionHunter. I could give it whatever repeat unit I want (even some random characters), and it would give me the same result. So even if you specify the benign and pathogenic units, the locus could also contain other de novo tandem repeats. What can be done in such situations?
It would be helpful to have a more specific description of the problem or an example, but I likely won't have a better answer than my previous one.
Hello! Thank you for the great tool that you've created. It's been really helpful. However, I have questions regarding multiallelic short tandem repeats in ExpansionHunter. I found that EH doesn't support multiallelic repeats, and I'd like to know how we can decide which allele is in the pathogenic range in such cases. To illustrate, let's assume that there are two polymorphic repeat units: AAT (benign) and AAC (pathogenic). Suppose that the tool reports repeat counts of 300 and 10. We can't tell whether the 300-repeat allele is AAT or AAC, so we can't determine whether the genotype is pathogenic.
You mention the workaround here, but I don't think it helps to find all the motifs. "If this option is specified, this script will run ExpansionHunter once for each of the motif(s) it detects at the locus. ExpansionHunter doesn't currently support genotyping multiallelic repeats such as RFC1 where an individual may have 2 alleles with motifs that differ from each other (and from the reference motif). Running ExpansionHunter separately for each motif provides a workaround."
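For anyone trying the per-motif workaround by hand, here is a rough sketch of how one might generate a separate variant catalog entry for each candidate motif, so that ExpansionHunter can be run once per motif. This is not what call_non_ref_motifs.py does internally. The function name `build_per_motif_catalog`, the locus ID, and the coordinates are hypothetical placeholders, and the ACGT-only check is my own assumption, added so that a random-character repeat unit is rejected before any run.

```python
import json


def build_per_motif_catalog(locus_id, reference_region, motifs):
    """Return one catalog entry per candidate motif (hypothetical helper).

    The field names follow the ExpansionHunter variant catalog JSON format;
    the locus ID and coordinates passed in below are placeholders.
    """
    catalog = []
    for motif in motifs:
        motif = motif.upper()
        # Assumption: only plain A/C/G/T motifs are accepted here, so that
        # a typo or random string is caught before running the genotyper.
        if not motif or set(motif) - set("ACGT"):
            raise ValueError(f"not a valid nucleotide motif: {motif!r}")
        catalog.append(
            {
                "LocusId": f"{locus_id}_{motif}",
                "LocusStructure": f"({motif})*",
                "ReferenceRegion": reference_region,
                "VariantType": "Repeat",
            }
        )
    return catalog


if __name__ == "__main__":
    # Placeholder locus and coordinates; replace with the real locus definition.
    entries = build_per_motif_catalog("LOCUS1", "chr1:1000-1030", ["AAT", "AAC"])
    print(json.dumps(entries, indent=2))
```

Each entry could be written to its own catalog file and passed to a separate ExpansionHunter run. Note that this only covers motifs you already suspect; it does not discover de novo motifs, and the size estimates from each run carry the accuracy caveat mentioned earlier in this thread.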