Open pclavell opened 8 months ago
Hi @pclavell - thanks for raising the issue.
it does not reach 100% sensitivity
What kind of sensitivity are you seeing?
The DNA read splitting code is here. Dorado searches for an adapter that's close to what looks like the start of a new read. It may be possible to integrate information about additional adapters, but so far we haven't found the need.
What model is your data called with?
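For readers following along, the general idea of "search for an adapter-like sequence away from the read ends" can be sketched as below. This is only an illustration and not Dorado's implementation: the adapter sequence, the error threshold, the end margin, and the function names `approx_find` and `interior_hits` are all placeholders made up for the example. It uses a standard error-tolerant substring search (Sellers' dynamic programming) so that a few basecalling errors in the adapter are tolerated.

```python
# Hypothetical adapter sequence, for illustration only (not a real ONT adapter).
ADAPTER = "ACGTACGTTGCA"


def approx_find(pattern: str, text: str, max_errors: int) -> list[tuple[int, int]]:
    """Return (end_position, edit_distance) for every position in `text` where
    `pattern` matches a substring ending there with at most `max_errors` edits."""
    m = len(pattern)
    prev = list(range(m + 1))  # column for the empty text prefix
    hits = []
    for j, ch in enumerate(text):
        cur = [0] * (m + 1)  # cur[0] == 0: a match may start anywhere in the text
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur[i] = min(prev[i - 1] + cost,  # match or substitution
                         prev[i] + 1,         # extra base in the read
                         cur[i - 1] + 1)      # adapter base missing from the read
        if cur[m] <= max_errors:
            hits.append((j + 1, cur[m]))
        prev = cur
    return hits


def interior_hits(read: str, adapter: str = ADAPTER,
                  max_errors: int = 2, end_margin: int = 20) -> list[tuple[int, int]]:
    """Adapter-like matches that lie at least `end_margin` bases from both read ends.
    Neighbouring end positions belonging to one occurrence are collapsed to the best one."""
    best: list[tuple[int, int]] = []
    for end, dist in approx_find(adapter, read, max_errors):
        if end - len(adapter) < end_margin or end > len(read) - end_margin:
            continue  # adapter at a read end is expected, not a sign of a chimera
        if best and end - best[-1][0] <= len(adapter):
            if dist < best[-1][1]:
                best[-1] = (end, dist)
            continue
        best.append((end, dist))
    return best
```

Each hit returned by `interior_hits` is a candidate junction between two concatenated molecules. A real splitter would also need to consider the reverse complement and signal-level evidence, which this sketch ignores.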
Hi @pclavell any updates on this?
It turns out the information I had came from an analysis of Guppy-basecalled data, which had >15% chimeric reads (detected thanks to internal PCR primers). However, I wonder whether you have run any tests that use internal adapters to measure how efficient the Dorado ReadSplitter is. I ask because I am now sequencing new data with internal adapters that I could use to further improve the splitting. Sorry for the delay.
I have finally finished my analysis: I get around 2% concatemers/chimeric reads, some of them formed by more than 2 reads.
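For anyone wanting to run a similar check, a chimera rate like this can be estimated by counting how many linker-flanked inserts each read contains. The sketch below is a minimal illustration under stated assumptions, not the analysis actually used here: the two linker sequences are invented placeholders, matching is exact and single-stranded (real data would need error-tolerant, strand-aware matching), and the names `count_inserts` and `chimera_summary` are hypothetical.

```python
from collections import Counter

# Placeholder linker sequences, invented for this example.
FIVE_PRIME_LINKER = "ACGTTGCAAGGCTA"
THREE_PRIME_LINKER = "TTGACCGTAACGGA"


def count_inserts(read: str) -> int:
    """Count consecutive 5' linker ... 3' linker pairs along the read."""
    n, pos = 0, 0
    while True:
        start = read.find(FIVE_PRIME_LINKER, pos)
        if start == -1:
            return n
        end = read.find(THREE_PRIME_LINKER, start + len(FIVE_PRIME_LINKER))
        if end == -1:
            return n
        n += 1
        pos = end + len(THREE_PRIME_LINKER)


def chimera_summary(reads: list[str]) -> dict:
    """Summarise how many reads carry more than one linker-flanked insert."""
    counts = Counter(count_inserts(r) for r in reads)
    total = sum(counts.values())
    chimeric = sum(c for k, c in counts.items() if k > 1)
    return {
        "total": total,
        "chimeric": chimeric,
        "fraction": chimeric / total if total else 0.0,
        "by_insert_count": dict(counts),
    }
```

The `by_insert_count` entry separates reads made of two molecules from those made of three or more, which is useful when some chimeras contain more than 2 reads.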
Thank you for the analysis and feedback! We will look into how to improve read splitting. It would be great if you could share a few of the unsplit reads in a pod5 so we can evaluate any improvements.
I cannot share them at the moment because the data were generated with an unpublished protocol, but the material is basically cDNA with a different 56 nt adapter on each end. The ONT library is then prepared from that material.
@pclavell, any updates on this?
Apparently, dorado v0.4 and newer versions have added a chimeric read splitting feature. However, it does not reach 100% sensitivity. I was wondering how exactly the read splitting works. Assuming that it looks for ONT adapters in the middle of reads, chimera detection sensitivity could potentially be improved if a specific library has additional adapters (as in my experimental setup, where the library looks like ONT adapter - 5' linker - cDNA - 3' linker - ONT adapter). I could not find information about this in the documentation. Thanks a lot.