nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

Foldback chimeras #590

Open itslittman opened 5 months ago

itslittman commented 5 months ago

Is there a way to get rid of duplex foldback chimeras? It doesn't seem like Dorado splits the + and - strands of duplex reads into two reads. I suppose technically they're the same read, but that shouldn't be denoted by making them look chimeric.

My work involves actual chimeric reads (fusion genes), so all these extra ones make it much more tedious to sift through when trying to confirm genomic breakpoints for fusions I find in my dRNA data. Thanks!

vellamike commented 5 months ago

Hi @itslittman

Could you give me a small example of what you mean? Dorado does split the + and - strands into two reads, so it would be helpful to have an example with some short "mock" reads showing the issue you're seeing.

itslittman commented 5 months ago

@vellamike this:

[image]

wilsonte-umich commented 5 months ago

I believe this post is asking about my query in #443, which wasn't answered. From my examples, I think some foldback chimeras (where the two strands of the same duplex are sequenced as one initial read) are missed if they aren't splittable at the internal junction.

Thus, the following are processed as duplex:

read1 >>>>>>>>
read2 <<<<<<<<
read1 >>>>>>>>||<<<<<<<< (where || is a splittable junction)

But the following is not, although it occurs with some frequency:

read1 >>>>>>>>//<<<<<<<< (where // is a non-splittable junction)
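
For anyone who needs to separate these from genuine chimeras downstream, here is a minimal sketch of one possible post-alignment filter. It is not Dorado's method and uses no Dorado API. It assumes you have already extracted each read's aligned segments (primary plus supplementary) in read order. The `Segment` record, the `is_foldback` helper, and the `min_overlap` threshold are all hypothetical names chosen for illustration. The idea is that a foldback read shows two consecutive segments on opposite strands covering largely the same reference interval, whereas a true fusion joins different loci.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a read (hypothetical minimal record)."""
    chrom: str
    ref_start: int  # 0-based, inclusive
    ref_end: int    # 0-based, exclusive
    strand: str     # "+" or "-"


def overlap_fraction(a: Segment, b: Segment) -> float:
    """Reference overlap of a and b as a fraction of the shorter segment."""
    if a.chrom != b.chrom:
        return 0.0
    overlap = min(a.ref_end, b.ref_end) - max(a.ref_start, b.ref_start)
    if overlap <= 0:
        return 0.0
    shorter = min(a.ref_end - a.ref_start, b.ref_end - b.ref_start)
    return overlap / shorter


def is_foldback(segments: list[Segment], min_overlap: float = 0.5) -> bool:
    """Flag a read whose consecutive segments flip strand over the same locus.

    `segments` must be ordered by position along the read.
    `min_overlap` is an assumed threshold, not a validated default.
    """
    for first, second in zip(segments, segments[1:]):
        if (first.strand != second.strand
                and overlap_fraction(first, second) >= min_overlap):
            return True
    return False


# Mock reads: a foldback (same locus, opposite strands) and a fusion-like read.
foldback_read = [Segment("chr1", 1000, 5000, "+"), Segment("chr1", 1200, 5000, "-")]
fusion_read = [Segment("chr9", 1000, 3000, "+"), Segment("chr22", 500, 2500, "-")]

foldback_flag = is_foldback(foldback_read)
fusion_flag = is_foldback(fusion_read)
```

With the mock reads above, the first is flagged and the second is not. Inversions with breakpoints close together could also trip this check, so the overlap threshold would need tuning against real data.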