LooseLab / readfish

CLI tool for flexible and fast adaptive sampling on ONT sequencers
https://looselab.github.io/readfish/
GNU General Public License v3.0

[Question] Testing Readfish on FAST5 file without playback #213

Closed harisankarsadasivan closed 1 year ago

harisankarsadasivan commented 1 year ago

Hello, I'd like to measure the performance of ReadFish on FAST5 classification (basecall+minimap2) into targets vs non-targets. I do not wish to do any playback; I just want to measure accuracy and performance on my GPU. Is there a way to do this?

mattloose commented 1 year ago

There is no way to do this within ReadFish without interacting with MinKNOW. ReadFish cannot take a fast5 file as input.

You can engineer something yourself using the fast5 API and guppy. You would need to design the benchmark carefully so that you time only the basecalling and alignment steps and not file access. We did something similar (looking at accuracy and chunk sizes) in Figure 1 of this preprint: https://www.biorxiv.org/content/10.1101/2021.12.01.470722v2.full.pdf
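As a rough illustration of that separation, here is a minimal Python sketch of a benchmark harness. It is not part of ReadFish. It assumes the signals have already been loaded into memory before the timer starts, and that `classify` is a hypothetical placeholder for whatever function you write to basecall a signal chunk and align it (for example with guppy and minimap2), returning True for an on-target read. The names `benchmark_classifier` and `classify`, and the shape of the `reads` tuples, are assumptions made for this example.

```python
import time


def benchmark_classifier(reads, classify):
    """Time a classifier on preloaded reads and score it against known labels.

    reads:    sequence of (read_id, signal, is_target) tuples, where the
              signal is already in memory so file access is not timed.
    classify: hypothetical callable taking a signal and returning True
              if the read is judged on-target.
    """
    # Time only the classification calls, nothing else.
    start = time.perf_counter()
    predictions = [classify(signal) for _, signal, _ in reads]
    elapsed = time.perf_counter() - start

    # Score predictions outside the timed region.
    tp = fp = tn = fn = 0
    for (_, _, truth), predicted in zip(reads, predictions):
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1

    total = len(reads)
    return {
        "seconds": elapsed,
        "reads_per_second": total / elapsed if elapsed > 0 else float("inf"),
        "tp": tp,
        "fp": fp,
        "tn": tn,
        "fn": fn,
        "accuracy": (tp + tn) / total if total else 0.0,
    }


if __name__ == "__main__":
    # Toy data and a toy classifier standing in for basecalling + alignment.
    toy_reads = [
        ("read_a", [1, 2, 3], True),
        ("read_b", [0, 0], False),
    ]
    print(benchmark_classifier(toy_reads, lambda signal: sum(signal) > 4))
```

Loading every signal up front keeps disk and file parsing time out of the measurement, at the cost of memory. For a GPU basecaller you would probably also want to submit reads in batches rather than one at a time, since per-read calls may understate throughput.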

A close approximation might be to look at the batch sizes and times as reported by readfish, but this will only tell you about performance; it won't allow you to ascertain accuracy.

This isn't something we can support through ReadFish.