Closed: stevebaeyen closed this issue 1 month ago
Hi @stevebaeyen - running dorado correct is both CPU/host memory and GPU memory intensive. As suggested here, we recommend running it on a beefier system to get reasonable performance.
You can run it on a smaller system by playing around with -b (batch size) and -i (mapping index size) on the command line, e.g. you can try -i 800M -b 2 and see if that works.
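For anyone landing here with the same error, the suggestion above can be sketched as a small shell snippet. This is only an illustration: the file names are taken from the report below, the variable names are made up for the example, and placing the options after the input file is an assumption. The snippet prints the command instead of running it, so the values can be checked first.

```shell
# Dry-run sketch: build the reduced-memory command and print it.
# Variable names are hypothetical; the -i/-b values are the ones suggested in the reply.
reads="GBBC_502_supv5.fq"   # input reads from the report
out="GBBC502_corr.fasta"    # output file from the report
index_size="800M"           # -i: mapping index size
batch_size=2                # -b: batch size

cmd="dorado correct $reads -i $index_size -b $batch_size"

# Print the full command line, including the output redirection.
echo "$cmd > $out"
```

If the printed command looks right, run it directly. If it still aborts with the memory error, lowering the -i value further may be worth trying.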
Thanks @tijyojwad! That is working!
Issue Report
Please describe the issue:
I ran the dorado 0.7.0 basecaller on R10.4.1 simplex data with v5 models and tried to correct the reads using herro, but the run aborted due to insufficient memory.
Steps to reproduce the issue:
dorado correct GBBC_502_supv5.fq > GBBC502_corr.fasta
[2024-05-25 09:51:53.419] [info] Running: "correct" "GBBC_502_supv5.fq"
[2024-05-25 09:51:53.420] [info] - downloading herro-v1 with httplib
terminate called after throwing an instance of 'std::runtime_error'
what(): Insufficient memory to run inference on cuda:0
Aborted (core dumped)
Run environment:
Dorado version: 0.7.0+71cc744+cu11080
Dorado command: dorado correct GBBC_502_supv5.fq > GBBC502_corr.fasta
Operating system: Ubuntu 20.04.6 LTS
Hardware (CPUs, Memory, GPUs): 8 CPUs (Intel i7), 16 GB RAM, Nvidia GeForce RTX 2080 GPU with 8 GB memory
Source data type (e.g., pod5 or fast5 - please note we always recommend converting to pod5 for optimal basecalling performance): pod5 -> fq after basecalling
Source data location (on device or networked drive - NFS, etc.): on device
Details about data (flow cell, kit, read lengths, number of reads, total dataset size in MB/GB/TB): 1.5 GB .fq file
Dataset to reproduce, if applicable (small subset of data to share as a pod5 to reproduce the issue):
Logs
Please provide output trace of dorado (run dorado with -v, or -vv on a small subset)