Closed by fidibidi 1 month ago
Worth noting: other similar issues have been referenced: #604, #427
This appears to be an issue with running dorado via PowerShell. I ran a small test via CMD, in which I interrupted a run, and was able to successfully resume from the incomplete BAM file.
I feel it would be very helpful for future users to have this stated in the README.md for running dorado on Windows machines; it would save many days of troubleshooting.
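For anyone hitting the same thing, a quick way to check whether a BAM file was damaged by the shell is to look at its first bytes. The sketch below is only an illustration and rests on an assumption that is not confirmed in this thread: that Windows PowerShell's `>` re-encoded dorado's binary output as UTF-16 text, which would leave a byte-order mark at the start of the file instead of the BGZF (gzip) magic bytes that a valid BAM starts with. The function names are hypothetical helpers, not part of dorado.

```python
# BAM files are BGZF-compressed, so a healthy file starts with the gzip
# magic bytes 1f 8b, compression method 08, and flag byte 04 (extra field).
BGZF_MAGIC = bytes([0x1F, 0x8B, 0x08, 0x04])

# Byte-order mark written at the start of UTF-16 little-endian text files.
UTF16_LE_BOM = b"\xff\xfe"


def diagnose_header(head: bytes) -> str:
    """Classify the first bytes of a file as 'bgzf', 'utf16-text', or 'unknown'."""
    if head.startswith(BGZF_MAGIC):
        return "bgzf"
    if head.startswith(UTF16_LE_BOM):
        return "utf16-text"
    return "unknown"


def diagnose_file(path) -> str:
    """Read the first four bytes of the file at `path` and classify them."""
    with open(path, "rb") as fh:
        return diagnose_header(fh.read(4))
```

Calling `diagnose_file(r"C:\Users\ONT\A1815.local.bam")` on the interrupted output would show which case applies. A result of `utf16-text` would point at the redirection rather than at dorado itself; `unknown` means this check cannot tell.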
Issue Report
Please describe the issue:
Run via PowerShell: a Dorado run was interrupted a couple of days into basecalling, leaving an unfinished BAM file. I was hoping to resume from this file, but attempts to use the --resume-from option have failed.
Steps to reproduce the issue:
Essentially, just run the command again but with --resume-from. I've wondered whether I'm pathing to the file incorrectly, but attempts at giving the absolute path to the data file haven't worked either:
C:\Users\ONT\A1815.local.bam .\A1815.local.bam
It is perhaps worth noting that the generated BAM file is rather large (238 GB), which I don't think should be the case.
Run environment:
then to resume:
dorado basecaller hac,5mCG_5hmCG F:\Data\081524_P2_A1815\081524_P2_A1815\20240815_1132_P2S-00718-A_PAY91898_867b13e9/pod5 --reference C:\Users\ONT\Documents\GCA_000001405.15_GRCh38_no_alt_analysis_set.fna --resume-from A1815.local.bam > F:\Data\081524_P2_A1815\bam\A1815.bam
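Since the same resume worked from CMD, one way to take the shell's redirection out of the picture is to launch dorado from a small script that writes stdout to the output file in binary mode. This is a minimal sketch, not a recommended or official workflow: `run_to_file` and `resume_command` are hypothetical helpers, and the only dorado arguments used are the ones already shown in the command above.

```python
import subprocess


def run_to_file(cmd, out_path):
    """Run `cmd` and write its stdout bytes directly to `out_path`.

    The file is opened in binary mode and handed to the child process,
    so no shell or text re-encoding sits between the program and the file.
    Returns the process exit code.
    """
    with open(out_path, "wb") as out:
        return subprocess.run(cmd, stdout=out).returncode


def resume_command(model, pod5_dir, reference, resume_from):
    """Build the argument list for a resumed basecalling run.

    Mirrors the command in this report; assumes `dorado` is on PATH.
    """
    return [
        "dorado", "basecaller", model, pod5_dir,
        "--reference", reference,
        "--resume-from", resume_from,
    ]
```

Usage would be along the lines of `run_to_file(resume_command("hac,5mCG_5hmCG", pod5_dir, reference, "A1815.local.bam"), out_bam)`, with the paths taken from the command above. As in that command, the output file is a different file from the one passed to `--resume-from`.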
Logs
[2024-09-19 12:10:46.337] [info] Running: "basecaller" "hac,5mCG_5hmCG" "F:\Data\081524_P2_A1815\081524_P2_A1815\20240815_1132_P2S-00718-A_PAY91898_867b13e9/pod5" "--reference" "C:\Users\ONT\Documents\GCA_000001405.15_GRCh38_no_alt_analysis_set.fna" "--resume-from" "A1815.local.bam"
[2024-09-19 12:10:46.405] [info] - downloading dna_r10.4.1_e8.2_400bps_hac@v5.0.0 with httplib
[2024-09-19 12:10:46.870] [info] - downloading dna_r10.4.1_e8.2_400bps_hac@v5.0.0_5mCG_5hmCG@v1 with httplib
[2024-09-19 12:10:47.275] [info] Normalised: chunksize 10000 -> 9996
[2024-09-19 12:10:47.276] [info] Normalised: overlap 500 -> 498
[2024-09-19 12:10:47.277] [info] > Creating basecall pipeline
[2024-09-19 12:10:54.926] [info] cuda:0 using chunk size 9996, batch size 1152
[2024-09-19 12:10:55.517] [info] cuda:0 using chunk size 4998, batch size 1408
[2024-09-19 12:11:38.079] [info] > Inspecting resume file...
[2024-09-19 12:11:43.053] [error] finalise() not called on a HtsFile.
[2024-09-19 12:11:43.054] [error] Could not open file: A1815.local.bam