adamewing / tldr

Identify and annotate TE-mediated insertions in long-read sequence data
MIT License

min read support #19

Open daggoo opened 3 years ago

daggoo commented 3 years ago

Hi,

I am looking into low-allele-frequency insertions in Drosophila, and I am wondering how TLDR behaves with a minimum read support lower than 3. Everything seems to work fine if I set --minread 1, except that about 10% of the insertions with read support 1 that were identified by a semi-manual approach were not reported by TLDR. I think this could be due to different sequence similarity parameters, but maybe there are other caveats I should consider.

Thanks!

adamewing commented 3 years ago

Hey there, yes, it works fine at --minread 1, but that one read is doing a lot of work: it has to completely contain the insertion, and it has to have few enough errors to support alignment to both the reference genome and the TE reference (unless using --elts none). If you'd like to send single reads containing an insertion that were missed by TLDR (along with the TE reference), I'm happy to have a look.
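
To make the "completely contain the insertion" condition concrete, here is a minimal sketch of that check for a single read. It is not TLDR's code: the function name, its arguments, and the 100 bp flank threshold are hypothetical and chosen only for illustration, and it does not model the alignment-quality requirement at all.

```python
def read_contains_insertion(read_len, ins_start, ins_end, min_flank=100):
    """Return True if the inserted segment lies fully inside the read.

    Hypothetical helper, not part of TLDR. Coordinates are 0-based
    positions on the read, with the insertion occupying
    [ins_start, ins_end). min_flank is an assumed, illustrative
    threshold for how much non-insertion sequence must remain on each
    side so the read can still be anchored to the reference genome.
    """
    if not (0 <= ins_start < ins_end <= read_len):
        return False  # insertion coordinates fall outside the read
    left_flank = ins_start              # bases before the insertion
    right_flank = read_len - ins_end    # bases after the insertion
    return left_flank >= min_flank and right_flank >= min_flank


# A 10 kb read with a 5 kb insertion in the middle is fully spanning.
print(read_contains_insertion(10000, 3000, 8000))
# A read that ends 50 bp past the insertion has too little right flank.
print(read_contains_insertion(6000, 1000, 5950))
```

With several supporting reads, partially spanning reads can complement each other; with a single read, that one read has to pass a check like this on its own, which may be one reason some single-read insertions go unreported.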

daggoo commented 3 years ago

Thanks for your reply! It's great that TLDR can be used with 1 read support. I'll prepare the reads containing the insertions and send them to you.