nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

Feature request: auto-detect model #349

Closed billytcl closed 8 months ago

billytcl commented 1 year ago

MinKNOW records a bunch of metadata about the run (e.g., pore speed, pore type, ligation/rapid, sampling rate). This metadata should really be encoded into pod5s directly, could be added to pod5s during the fast5 conversion process, or could be inferred by dorado itself.

It would remove a ton of user friction if you could run the tool (or dorado server) by just typing something like:

dorado basecaller --mods 5mc --mode sup --align_ref hs38.mmi reads.pod5 > reads.bam

Even finding the flowcell/kit name is a problem for bioinformaticians who don't have much connection to the wet lab, and for anyone downloading public datasets from NCBI's SRA.

There have been many times when the wrong basecaller model was used because of typos or confusion/miscommunication between wet and dry labs, and it's a huge drag on time. Having basecallers be idiot-proof for model selection would be huge!
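
To make the request above concrete, here is a minimal Python sketch of the kind of lookup being asked for: take run metadata (flow cell product code and sampling rate) and resolve a model name from it. This is not how dorado implements anything. `RunMetadata`, `MODEL_TABLE`, and `select_model` are hypothetical names, and the table entries are illustrative assumptions rather than an authoritative mapping.

```python
from typing import NamedTuple


class RunMetadata(NamedTuple):
    """Hypothetical container for the run metadata the request mentions."""
    flow_cell_product_code: str
    sample_rate_hz: int


# Illustrative (assumed) mapping from run metadata to a model-name template.
# A real tool would need a complete, maintained table.
MODEL_TABLE = {
    ("FLO-MIN114", 5000): "dna_r10.4.1_e8.2_400bps_{mode}@v4.2.0",
    ("FLO-MIN114", 4000): "dna_r10.4.1_e8.2_400bps_{mode}@v4.1.0",
    ("FLO-MIN106", 4000): "dna_r9.4.1_e8_{mode}@v3.6",
}

VALID_MODES = ("fast", "hac", "sup")


def select_model(meta: RunMetadata, mode: str = "sup") -> str:
    """Resolve a model name from run metadata, failing loudly when unsure."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {VALID_MODES}")
    key = (meta.flow_cell_product_code.upper(), meta.sample_rate_hz)
    template = MODEL_TABLE.get(key)
    if template is None:
        # Refusing to guess is the point: a silent fallback would reproduce
        # the wrong-model problem described above.
        raise LookupError(f"no model known for {key}")
    return template.format(mode=mode)


if __name__ == "__main__":
    print(select_model(RunMetadata("FLO-MIN114", 5000), mode="sup"))
```

The lookup raises an error for unknown combinations instead of falling back to a default, since a silently chosen wrong model is exactly the failure described above.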

vellamike commented 1 year ago

Thanks for raising this. This feature is indeed something we will be adding to Dorado soon.

HalfPhoton commented 8 months ago

Auto model selection has been implemented in Dorado 0.5.0.