johnstonmj closed this issue 6 months ago
Hi @johnstonmj, if a stereo model is in the same directory as the simplex model, it will be reused instead of being downloaded again.
# Previously downloaded both models
$ ls local_models/
dna_r10.4.1_e8.2_5khz_stereo@v1.2 dna_r10.4.1_e8.2_400bps_hac@v4.3.0
# Run duplex
$ dorado duplex ./local_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0/ tests/data/duplex/pod5/duplex.pod5 > tmp.bam
[2024-03-07 20:35:50.062] [info] > No duplex pairs file provided, pairing will be performed automatically
... # No download here
[2024-03-07 20:36:18.704] [info] > Basecalled @ Bases/s: 6.575190e+03
# Rename stereo model with .bak suffix
$ mv local_models/dna_r10.4.1_e8.2_5khz_stereo@v1.2 local_models/dna_r10.4.1_e8.2_5khz_stereo@v1.2.bak
$ ls local_models/
dna_r10.4.1_e8.2_5khz_stereo@v1.2.bak dna_r10.4.1_e8.2_400bps_hac@v4.3.0
# Downloads model
$ dorado duplex ./local_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0/ tests/data/duplex/pod5/duplex.pod5 > tmp.bam
[2024-03-07 20:38:18.092] [info] > No duplex pairs file provided, pairing will be performed automatically
[2024-03-07 20:38:18.094] [info] Assuming cert location is /etc/ssl/cert.pem
[2024-03-07 20:38:18.095] [info] - downloading dna_r10.4.1_e8.2_5khz_stereo@v1.2 with foundation
...
Perfect! Exactly what I wanted. Thanks @HalfPhoton
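The behaviour described above depends on the stereo model directory sitting next to the simplex model directory. Below is a minimal sketch of a pre-flight check for that layout before launching a run. The helper name `has_sibling_model` is hypothetical and not part of dorado. The check only tests for a directory with the expected name and does not verify the model's contents.

```shell
#!/usr/bin/env bash
# Sketch: check that the stereo (duplex) model sits beside the simplex model,
# so that `dorado duplex` can reuse it rather than download it again.

# has_sibling_model SIMPLEX_MODEL_DIR STEREO_MODEL_NAME  (hypothetical helper)
# Returns 0 if a directory named STEREO_MODEL_NAME exists in the same parent
# directory as SIMPLEX_MODEL_DIR, and 1 otherwise.
has_sibling_model() {
    local simplex_dir="${1%/}"   # drop one trailing slash, if present
    local stereo_name="$2"
    local parent
    parent="$(dirname -- "$simplex_dir")"
    [ -d "$parent/$stereo_name" ]
}

# Example usage with the model names from this thread:
#   if has_sibling_model ./local_models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0 \
#                        dna_r10.4.1_e8.2_5khz_stereo@v1.2; then
#       echo "stereo model found next to the simplex model"
#   else
#       echo "stereo model missing; expect dorado to download it"
#   fi
```

If the check fails, the stereo model can be fetched once into the shared directory ahead of time (for example with `dorado download`; the exact options depend on the dorado version in use).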
Issue Report
Please describe the issue:
I'd like to be able to explicitly set and use a local copy of the duplex model used by dorado duplex. Each invocation downloads the required duplex model, so I am accumulating many .temp_dorado_model-123456 directories. I have downloaded the required simplex model and can specify it, but I cannot find a documented option for specifying the duplex model, and I would like the same functionality for it.
Ideally:
dorado duplex --simplex_model ./models/dna_r10.4.1_e8.2_400bps_hac@v4.3.0 --duplex_model ./models/dna_r10.4.1_e8.2_5khz_stereo@v1.2 some_input.pod5 > some_output.bam
Using a local copy of the duplex model would avoid re-downloading the same file on each invocation. This would speed up each run, save storage space, and reduce network bandwidth use.
Steps to reproduce the issue:
Running the command: dorado duplex dna_r10.4.1_e8.2_400bps_hac@v4.3.0 some_input.pod5 > some_output.bam
Begins with a download of: downloading dna_r10.4.1_e8.2_5khz_stereo@v1.2 with httplib
Run environment:
Logs
N/A