Insufficient memory to run inference on cuda:0

yplee614 commented 3 months ago

I ran Dorado with the command:

dorado correct -m herro-v1/ /home/yplee/strawberry/SRR21142895.fastq > corrected.fasta 2> log

The log file shows:

[2024-08-04 17:30:50.624] [info] Running: "correct" "-m" "herro-v1/" "/home/yplee/strawberry/SRR21142895.fastq"
terminate called after throwing an instance of 'std::runtime_error'
what(): Insufficient memory to run inference on cuda:0

Run environment:

Dorado version: 0.7.3
Dorado command: dorado correct -m herro-v1/ /home/yplee/strawberry/SRR21142895.fastq > corrected.fasta 2> log
Operating system: Ubuntu 24.04
Hardware (CPUs, Memory, GPUs): AMD 9654, 512 GB, NVIDIA 4070 Super (12 GB)

kubek78 commented 3 months ago

According to the main page: "The error correction tool is both compute and memory intensive. As a result, it is best run on a system with multiple high performance CPU cores ( > 64 cores), large system memory ( > 256GB) and a modern GPU with a large VRAM ( > 32GB)." I was able to run the correction on a 4090 with 24 GB, but it's unlikely to work with 12 GB.
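The quoted recommendation can be turned into a quick pre-flight check. The sketch below is illustrative only: the helper `meets_correct_recommendation` is hypothetical and not part of Dorado, and the thresholds are simply the ones from the quoted docs. The example call uses the original poster's hardware, assuming 96 cores for the AMD 9654.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Dorado): compare a machine against the
# recommendation quoted from the docs: > 64 cores, > 256 GB RAM, > 32 GB VRAM.
# Arguments: CPU cores, system RAM in GB, GPU VRAM in GB (whole numbers).
meets_correct_recommendation() {
    local cores="$1" ram_gb="$2" vram_gb="$3" ok=0
    [ "$cores" -gt 64 ]   || { echo "cores: $cores (recommended > 64)"; ok=1; }
    [ "$ram_gb" -gt 256 ] || { echo "RAM: ${ram_gb} GB (recommended > 256 GB)"; ok=1; }
    [ "$vram_gb" -gt 32 ] || { echo "VRAM: ${vram_gb} GB (recommended > 32 GB)"; ok=1; }
    return "$ok"
}

# Original poster's machine: AMD 9654 (96 cores assumed), 512 GB RAM, 12 GB VRAM.
meets_correct_recommendation 96 512 12 || echo "below the recommended configuration"
```

On Linux the three inputs can usually be read with `nproc`, `free -g`, and `nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits` (which reports MiB). As the rest of the thread shows, falling below the recommendation does not always prevent a run; it only signals that smaller batch sizes may be needed.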

shelkmike commented 3 months ago

I have the same problem with Dorado 0.7.3. However, Dorado 0.7.2 worked without this problem on exactly the same input file. Either VRAM requirements of Dorado increased, or a bug was introduced. My GPU is GeForce 2080Ti 12GB.

HalfPhoton commented 3 months ago

Hi @yplee614, yes, as @kubek78 said (quoting the docs), dorado correct has very high resource requirements.

@shelkmike, changes in dorado 0.7.3 resulted in an increase in resource requirements, but these should not exceed our stated recommendations.

Kind regards, Rich

KeygeneICT commented 3 months ago

We are trying to run a dorado (0.7.3) correct job but are overflowing our available memory. The input file is ~130GB.

case 1: 4x A100 80GB VRAM + 96 threads + 512GB RAM
dorado correct -x cuda:all input_file > output_file
Result: out of memory

case 2: 1x A100 80GB VRAM + 96 threads + 512GB RAM
dorado correct -x cuda:0 input_file > output_file
Result: out of memory

case 3: 4x A100 80GB VRAM + 96 threads + 1TB RAM
dorado correct -x cuda:all input_file > output_file
Result: 997/1008GB used, about to run out of memory. The output file is generated but never contains data.

Edit: ran out of RAM on the 1TB machine. Our input file is .fastq, the output .fasta. I am running case 3 on Dorado v0.7.2 now.

HalfPhoton commented 3 months ago

@KeygeneICT, thanks for the information. Approximately what depth is your input data?

Kind regards, Rich

KeygeneICT commented 3 months ago

> thanks for the information. Approximately what depth is your input data?

I have received the following information about this input data: "50-60 Gb simplex .fasta data, ~100-120X coverage (assuming ~0.5Gb diploid heterozygous genome)"
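The coverage figure in that description follows from dividing the sequenced yield by the genome size. A minimal sketch of the arithmetic, using the quoted numbers (the function name `estimate_coverage` is just illustrative):

```shell
#!/usr/bin/env bash
# Coverage (depth) ~= total sequenced bases / genome size.
# Both arguments are in gigabases (Gb); awk handles the decimal division.
estimate_coverage() {
    awk -v bases="$1" -v genome="$2" 'BEGIN { printf "%.0fX\n", bases / genome }'
}

estimate_coverage 50 0.5   # lower bound of the quoted yield
estimate_coverage 60 0.5   # upper bound of the quoted yield
```

With a ~0.5 Gb genome, 50 Gb and 60 Gb of reads correspond to the quoted ~100X and ~120X.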

Update: Running the same dataset on Dorado v0.7.2 has so far consumed at most 300GB of RAM and seems to fit well within the resources available. The output file is growing properly as well (18GB currently).

Update 2: We are only experiencing excessive memory usage on v0.7.3. v0.7.2 is working correctly, so we will wait for a new release before upgrading from 0.7.2.

simonhayns commented 3 months ago

FWIW, I had similar issues running error correction. I was getting:

Insufficient memory to run inference on cuda:0

I managed to get around this using:

dorado correct --infer-threads 1 -b 64

It's faster using these arguments with 0.7.3 than reverting to 0.7.2. Watching with nvtop, I could see the GPU working harder using 0.7.3. We're only running 4070 GPUs, as it's early days for us and we're just testing the waters.

shelkmike commented 2 months ago

@simonhayns Thank you. "--infer-threads 1 -b 64" and "--infer-threads 1 -b 32" still required too much VRAM, but "--infer-threads 1 -b 16" worked for me.
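The trial-and-error above (64, then 32, then 16) can be scripted. This is a rough sketch, not an official workflow: `--infer-threads` and `-b` are the flags quoted in this thread, while the retry loop, the function name `correct_with_backoff`, the `DORADO` override variable and the per-attempt log file names are assumptions made for illustration. It also assumes that a failed run exits with a non-zero status.

```shell
#!/usr/bin/env bash
# Sketch: retry `dorado correct` with smaller batch sizes until one fits in VRAM.
# Usage: correct_with_backoff <reads> <output> <batch size>...
# DORADO may point at a specific binary; it defaults to `dorado` on PATH.
correct_with_backoff() {
    local reads="$1" out="$2" b
    shift 2
    for b in "$@"; do
        echo "trying batch size $b" >&2
        # Each attempt overwrites the output and keeps its own stderr log.
        if "${DORADO:-dorado}" correct --infer-threads 1 -b "$b" "$reads" \
                > "$out" 2> "correct_b${b}.log"; then
            echo "$b"   # report the batch size that worked
            return 0
        fi
    done
    echo "no batch size succeeded" >&2
    return 1
}

# Example (not run here): try 64, then 32, then 16.
# correct_with_backoff reads.fastq corrected.fasta 64 32 16
```

Starting from the largest batch size keeps throughput as high as the GPU allows, at the cost of a few failed attempts up front.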

HalfPhoton commented 2 months ago

@KeygeneICT, Could you try dorado-0.8.0 which has a number of stability improvements to dorado correct?

Thanks to all for their suggestions on reducing VRAM usage. There have also been updates to the Dorado README regarding the input data requirements for dorado correct.

Kind regards, Rich

KeygeneICT commented 1 month ago

> @KeygeneICT, Could you try dorado-0.8.0 which has a number of stability improvements to dorado correct?

Apologies for the delay. I was able to test 0.8.0 last week using the same dataset as before, and we ran into the same situation we had with 0.7.3. I also tested 0.8.1 just now with --to-paf in order to exclude anything related to GPUs. The result: no data output (empty file), 100% CPU load, stuck on "Loading alignments", and memory slowly filling up until it reached the 1TB RAM limit of the machine I was testing on.
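For anyone wanting to put numbers on this kind of memory growth, the following is a rough, Linux-only sketch that polls a running process and reports its peak resident memory. It reads the `VmRSS` field from `/proc/<pid>/status`; the function name `watch_peak_rss` and the polling interval are illustrative choices and not part of Dorado.

```shell
#!/usr/bin/env bash
# Sketch (Linux only): poll the resident memory of a running process
# and report the highest value seen once the process has exited.
# Usage: watch_peak_rss <pid> [interval in seconds, default 10]
watch_peak_rss() {
    local pid="$1" interval="${2:-10}" peak=0 rss
    while :; do
        # VmRSS is reported in kB; the field is missing once the process is gone.
        rss=$(awk '/^VmRSS:/ { print $2 }' "/proc/$pid/status" 2>/dev/null)
        [ -n "$rss" ] || break
        [ "$rss" -gt "$peak" ] && peak="$rss"
        sleep "$interval"
    done
    # Convert kB to GiB for the final report.
    awk -v kb="$peak" 'BEGIN { printf "peak RSS: %.2f GiB\n", kb / 1048576 }'
}

# Example (not run here): watch the newest process named "dorado" every 30 s.
# watch_peak_rss "$(pgrep -n dorado)" 30
```

Because it samples at intervals, short spikes between polls can be missed; a shorter interval gives a tighter estimate.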

nanoporetech / dorado

Insufficient memory to run inference on cuda:0 #972