LooseLab / readfish

CLI tool for flexible and fast adaptive sampling on ONT sequencers
https://looselab.github.io/readfish/
GNU General Public License v3.0
163 stars 31 forks

Feature/duplex #324

Closed Adoni5 closed 2 months ago

Adoni5 commented 5 months ago

Two implementations of simple duplex targeting

Both of these modes break the chain of stop_receiving when the previous read was sequenced only because it was potentially duplex. They do this by checking the new duplex_override decision on the duplex_tracker.
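The chain-breaking idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Decision` enum, the `DuplexTracker` class, and the `choose_action` function are hypothetical names invented for the sketch, and only the `stop_receiving` and `duplex_override` decision names are taken from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Decision(Enum):
    """Hypothetical subset of decisions, defined for this sketch only."""

    STOP_RECEIVING = "stop_receiving"
    UNBLOCK = "unblock"
    DUPLEX_OVERRIDE = "duplex_override"


@dataclass
class DuplexTracker:
    """Hypothetical tracker: remembers the last decision made on each channel."""

    previous: Dict[int, Decision] = field(default_factory=dict)

    def last(self, channel: int) -> Optional[Decision]:
        return self.previous.get(channel)

    def record(self, channel: int, decision: Decision) -> None:
        self.previous[channel] = decision


def choose_action(
    tracker: DuplexTracker,
    channel: int,
    base_action: Decision,
    looks_duplex: bool,
) -> Decision:
    """Pick the action for the current read on a channel.

    base_action is what targeting would do without any duplex logic.
    A read that would be unblocked is kept only if it looks like the
    complement of a previous read that was sequenced on its own merits.
    A previous duplex_override does not count, which breaks the chain.
    """
    prev = tracker.last(channel)
    if (
        base_action is Decision.UNBLOCK
        and looks_duplex
        and prev is Decision.STOP_RECEIVING
    ):
        action = Decision.DUPLEX_OVERRIDE
    else:
        action = base_action
    tracker.record(channel, action)
    return action
```

In this sketch, a read following a genuine stop_receiving can be rescued as potentially duplex, but a read following a duplex_override cannot, so a run of overrides cannot sustain itself on one channel.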

Other things of note:

  1. Version is included in the printed output at the top of the logs
  2. Small change to an error in mappy.py index extension checking, which now includes the incorrect extension for clearer error messaging
  3. Added a new decision for when we override a read at the start of a readfish run, if the translocated portion is of unknown length
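As an illustration of item 2, an extension check that reports the offending extension might look something like the sketch below. The function name `check_index_extension` and the set of accepted suffixes are assumptions made for this example; the actual check and accepted list in mappy.py may differ.

```python
from pathlib import Path

# Assumed set of accepted suffixes, for illustration only.
ACCEPTED_SUFFIXES = {".mmi", ".fa", ".fasta", ".fna", ".gz"}


def check_index_extension(path: str) -> None:
    """Raise a ValueError that names the unexpected extension."""
    suffix = Path(path).suffix.lower()
    if suffix not in ACCEPTED_SUFFIXES:
        raise ValueError(
            f"Unsupported index extension {suffix!r} for {path!r}; "
            f"expected one of {sorted(ACCEPTED_SUFFIXES)}"
        )
```

Including the rejected suffix in the message lets the user see at a glance what was wrong with the path they supplied.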

mattloose commented 5 months ago

It has been tested live. We are waiting to interpret those tests.

Adoni5 commented 4 months ago

Checked the simple duplex

Of 220,000 duplex reads after 4 days of running, we sent unblocks to one or both parents of 42,886 of them (29,051 reads had an unblock sent to one of the parents, 13,835 to both of the parents).

However, looking at this by read length, it's probably fine: [image]

And over the whole width of read lengths: [image]

Note - 10 base bin width for both plots.

It's roughly 0.7206% of total duplex bases, so I think this is working!
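The counts quoted above can be cross-checked with a few lines of arithmetic. This is only a sanity check on the numbers in this comment; it treats 220,000 as exact even though it is probably rounded, and the variable names are invented for the example.

```python
one_parent = 29_051      # duplex reads with an unblock sent to one parent
both_parents = 13_835    # duplex reads with an unblock sent to both parents
duplex_reads = 220_000   # total duplex reads, treated as exact here

affected = one_parent + both_parents
fraction_of_reads = affected / duplex_reads

print(affected)                    # 42886
print(f"{fraction_of_reads:.1%}")  # 19.5%
```

So roughly 19.5% of duplex reads by count had an unblock sent to a parent. If the 0.7206% figure refers to the bases in those reads, that gap would be consistent with the affected reads being predominantly short, which matches the read-length plots.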

To avoid this languishing, I'm suggesting we merge this in and mark it as highly experimental.

Adoni5 commented 2 months ago

This works. @alexomics and @mattloose, I'd like to merge so we can address #347 and avoid this languishing so long that it no longer works.

mattloose commented 2 months ago

Yes - we know that this works now. I'm happy to merge it.

Adoni5 commented 2 months ago

@alexomics this is blocked by your requested change about being certain of Strand, which has now been addressed!