gymrek-lab / LongTR

Tandem repeat genotyping with long reads
GNU General Public License v2.0

"Aborting genotyping of the locus as the sequence upstream/downstream of the repeat is too repetitive for accurate genotyping" #6

Closed by bw2 2 months ago

bw2 commented 2 months ago

I'm testing LongTR and finding it to be very accurate on the loci it genotypes.

image

However, it only genotypes ~83% of the loci in my input repeats BED file (which I generated by identifying all polymorphic TR loci in my sample within well-behaved regions of the genome). The other ~17% are left as no-calls due to this error: "Aborting genotyping of the locus as the sequence upstream/downstream of the repeat is too repetitive for accurate genotyping". I understand this check is carried over from HipSTR, but given that reads are now much longer, would it be possible to convert it to a warning or flag and genotype all loci that have sufficient coverage?
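
For anyone wanting to measure how many loci hit this error, here is a minimal Python sketch that counts the abort message in captured LongTR log output. The message string is copied from the error quoted above. Everything else is an assumption: the function name is hypothetical, the log is assumed to be available as lines of text, and the message is assumed to be emitted once per aborted locus.

```python
# Message text copied verbatim from the error quoted in this issue.
ABORT_MESSAGE = (
    "Aborting genotyping of the locus as the sequence upstream/downstream "
    "of the repeat is too repetitive for accurate genotyping"
)


def aborted_fraction(log_lines, total_loci):
    """Return the fraction of loci whose genotyping was aborted.

    Assumes (not verified against LongTR) that the abort message
    appears once per aborted locus in the log.
    """
    aborted = sum(1 for line in log_lines if ABORT_MESSAGE in line)
    return aborted / total_loci if total_loci else 0.0


# Toy input: the non-abort line is made up purely for illustration.
example_log = [
    "some other log line",
    ABORT_MESSAGE,
]
print(aborted_fraction(example_log, total_loci=4))
```

With a real run, `log_lines` could come from iterating over the saved log file, and `total_loci` from the number of lines in the input BED file.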

heliziii commented 2 months ago

Hi Ben,

Thank you very much for using LongTR. Did you add the --skip-assembly parameter when running LongTR? Adding it should skip the step where LongTR tries to assemble the flanking sequence, which, as you mentioned, is not necessary in most cases with long reads. Please let me know if the issue persists after this change.
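
As a rough illustration of the suggestion above, this Python sketch builds a LongTR command line with --skip-assembly appended. Only --skip-assembly comes from this thread. The other flags (--bams, --fasta, --regions, --tr-vcf) are my recollection of the LongTR README and should be checked against `LongTR --help`. The file names and the helper function are hypothetical.

```python
import shlex


def build_longtr_command(bams, fasta, regions, out_vcf,
                         skip_assembly=True, executable="LongTR"):
    """Return a LongTR invocation as an argument list.

    Flag names other than --skip-assembly are assumptions taken from
    the LongTR README; verify them with `LongTR --help`.
    """
    cmd = [
        executable,
        "--bams", ",".join(bams),   # comma-separated list of BAM files
        "--fasta", fasta,           # reference genome
        "--regions", regions,       # repeats BED file
        "--tr-vcf", out_vcf,        # output VCF
    ]
    if skip_assembly:
        # Skips the flanking-sequence assembly step discussed in this issue.
        cmd.append("--skip-assembly")
    return cmd


# Hypothetical file names, for illustration only.
cmd = build_longtr_command(
    ["sample.bam"], "genome.fa", "repeats.bed", "tr_calls.vcf.gz"
)
print(shlex.join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` to run the tool. Printing it first makes it easy to confirm the flags before launching a long job.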

bw2 commented 2 months ago

Thanks @heliziii, that fixed it!
Great accuracy even with --skip-assembly:

image