nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

Dorado correct fails to run #833

Closed samuelmontgomery closed 3 months ago

samuelmontgomery commented 4 months ago

Hi,

I am trying out dorado correct on a 100x coverage set of reads from a bacterium for de novo assembly (~1 GB of data), and it doesn't seem to run at all. This is using the Windows binary with 24 CPUs, 64 GB system RAM, and an RTX 4090 GPU.

Running the following command produces some output initially, but after it looks for the index the process simply exits back to the shell:

C:\dorado_0.7.0\bin\dorado.exe correct -t 12 --verbose .\PAO1_subset.fastq > .\PAO1_corrected.fasta

[2024-05-23 12:10:09.859] [info] Running: "correct" "-t" "12" "--verbose" "C.\PAO1_subset.fastq"
[2024-05-23 12:10:09.859] [debug] > aligner threads 12, corrector threads 4, writer threads 1
[2024-05-23 12:10:09.861] [info] - downloading herro-v1 with httplib
[2024-05-23 12:10:11.318] [debug] Usable memory for dev cuda:0: 17.6 GB
[2024-05-23 12:10:11.318] [debug] Using batch size 16 on device cuda:0
[2024-05-23 12:10:11.318] [debug] Usable memory for dev cuda:0: 17.6 GB
[2024-05-23 12:10:11.318] [debug] Using batch size 16 on device cuda:0
[2024-05-23 12:10:11.318] [debug] Starting process thread for cuda:0!
[2024-05-23 12:10:11.318] [debug] Starting process thread for cuda:0!
[2024-05-23 12:10:11.318] [debug] Starting decode thread!
[2024-05-23 12:10:11.318] [debug] Looking for idx C:\Nanopore\05DEC23_EXPMH100\herro\PAO1_subset.fastq.fai
[2024-05-23 12:10:11.318] [debug] Starting decode thread!
[2024-05-23 12:10:11.318] [debug] Starting decode thread!
[2024-05-23 12:10:11.319] [debug] Starting decode thread!
C:\Windows\System32>
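Because the log above ends at the shell prompt with no error message, the exit code is the only remaining clue about how the process ended. Here is a minimal Python sketch for capturing it. The helper name `run_and_describe` is hypothetical and nothing in it is specific to dorado. The only assumptions are that Windows reports an access violation as status `0xC0000005` and that POSIX reports a signal death as a negative return code.

```python
import subprocess

# NTSTATUS code Windows reports when a process dies from an access violation.
WINDOWS_ACCESS_VIOLATION = 0xC0000005


def run_and_describe(cmd, stdout_path):
    """Run cmd, write its stdout to stdout_path, and describe how it ended.

    Hypothetical helper for illustration; stderr is left attached to the console.
    """
    with open(stdout_path, "wb") as out:
        result = subprocess.run(cmd, stdout=out)
    code = result.returncode
    if code == 0:
        return code, "exited normally"
    if code < 0:
        # POSIX: a negative return code means the process was killed by a signal.
        return code, f"killed by signal {-code}"
    if code & 0xFFFFFFFF == WINDOWS_ACCESS_VIOLATION:
        return code, "crashed with an access violation"
    return code, f"exited with error code {code}"
```

Passing the command from this report as a list (the dorado.exe path, then "correct", "-t", "12", "--verbose" and the FASTQ path) with .\PAO1_corrected.fasta as `stdout_path` would show whether the silent exit was a crash or a normal return.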

Is there a specific way I need to prepare my FASTQ file? This data is simply SUP basecalled and subset into a smaller set. Looking at the HERRO GitHub repository, do the reads need to be mapped and then extracted back out of the BAM as FASTQ files?

As a side note - it may be worth specifying in the README that you cannot use a fastq.gz compressed with gzip and need to use bgzip instead, as I think gzip is still very common.
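One way to tell the two formats apart before running anything is to inspect the file header. Here is a minimal Python sketch. It assumes the BGZF layout from the SAM/BAM specification: a gzip header with the FEXTRA flag set and an extra subfield tagged 'B', 'C' with length 2. The function name `is_bgzf` is hypothetical, and this is not a description of how dorado itself checks its input.

```python
import struct


def is_bgzf(path):
    """Return True if the file starts with a BGZF (bgzip) block header."""
    with open(path, "rb") as fh:
        header = fh.read(12)
        if len(header) < 12:
            return False
        # gzip magic bytes, DEFLATE method, and the FEXTRA flag (bit 2).
        if header[0:3] != b"\x1f\x8b\x08" or not header[3] & 0x04:
            return False
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = fh.read(xlen)
    # Walk the extra subfields looking for the BGZF one: 'B', 'C', length 2.
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2 = extra[pos], extra[pos + 1]
        (slen,) = struct.unpack("<H", extra[pos + 2:pos + 4])
        if si1 == ord("B") and si2 == ord("C") and slen == 2:
            return True
        pos += 4 + slen
    return False
```

A file written by plain gzip normally has no such subfield, so `is_bgzf` returns False for it. Such a file would need to be decompressed and recompressed with bgzip.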

tijyojwad commented 4 months ago

Hi @samuelmontgomery - thank you for reporting the error! It does indeed look like we have a bug in freeing a pointer on Windows, which is causing the segfault. We will push a fix for this ASAP. Are you able to run this on a Linux platform (or WSL) by any chance?

tijyojwad commented 3 months ago

Hi @samuelmontgomery - this was fixed with dorado v0.7.1. Please use the latest binary and let us know if you run into any issues!