nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

finalise() HtsFile error #761

Closed jacobscgc closed 2 months ago

jacobscgc commented 2 months ago

Issue Report

Please describe the issue:

I am trying to run dorado on a .pod5 file. However, when I do so, I get an error:

[2024-04-19 23:00:41.196] [error] finalise() not called on a HtsFile.

I have been googling the issue but cannot find a resolution. I upgraded my samtools version to v1.20.

Am I missing something or what can I do to resolve this? Thanks for the help!

Steps to reproduce the issue:

Please list any steps to reproduce the issue.

Run environment:

Logs

[2024-04-19 23:05:58.008] [info] Running: "basecaller" "-vv" "./dna_r10.4.1_e8.2_400bps_hac@v4.3.0" "/mnt/c/Users/cgcja/Proton Drive/cgc.jacobs/My files//SweetLakeAnalytics/Data/Nanopore/Danio_rerio/input/Zebrafish_2/Zebrafish_2/20240219_2014_MN25856_ASD381_e15082f1/pod5_pass/" "--reference" "/mnt/c/Users/cgcja/Proton Drive/cgc.jacobs/My files//SweetLakeAnalytics/Data/Genomes/Danio_rerio/ncbi_dataset/GCF_000002035.6/GCF_000002035.6_GRCz11_genomic.fna"
[2024-04-19 23:05:58.012] [trace] Model option: './dna_r10.4.1_e8.2_400bps_hac@v4.3.0' unknown - assuming path
[2024-04-19 23:05:58.012] [info] > Creating basecall pipeline
[2024-04-19 23:06:01.316] [debug] cuda:0 memory available: 3.92GB
[2024-04-19 23:06:01.316] [debug] cuda:0 memory limit 2.92GB
[2024-04-19 23:06:01.316] [debug] cuda:0 maximum safe estimated batch size at chunk size 9996 is 384
[2024-04-19 23:06:01.316] [debug] cuda:0 maximum safe estimated batch size at chunk size 4998 is 832
[2024-04-19 23:06:01.316] [debug] Auto batchsize cuda:0: testing up to 832 in steps of 64
[2024-04-19 23:06:01.416] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 1.487390 ms
[2024-04-19 23:06:01.513] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 1.503824 ms
[2024-04-19 23:06:01.513] [debug] Auto batchsize cuda:0: 64, time per chunk 1.487390 ms
[2024-04-19 23:06:01.612] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.739000 ms
[2024-04-19 23:06:01.710] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.759088 ms
[2024-04-19 23:06:01.710] [debug] Auto batchsize cuda:0: 128, time per chunk 0.739000 ms
[2024-04-19 23:06:01.814] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.505792 ms
[2024-04-19 23:06:01.914] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.517579 ms
[2024-04-19 23:06:01.914] [debug] Auto batchsize cuda:0: 192, time per chunk 0.505792 ms
[2024-04-19 23:06:02.018] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.371320 ms
[2024-04-19 23:06:02.118] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.388156 ms
[2024-04-19 23:06:02.118] [debug] Auto batchsize cuda:0: 256, time per chunk 0.371320 ms
[2024-04-19 23:06:02.222] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.298234 ms
[2024-04-19 23:06:02.323] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.314330 ms
[2024-04-19 23:06:02.323] [debug] Auto batchsize cuda:0: 320, time per chunk 0.298234 ms
[2024-04-19 23:06:02.428] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.245379 ms
[2024-04-19 23:06:02.530] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.262965 ms
[2024-04-19 23:06:02.530] [debug] Auto batchsize cuda:0: 384, time per chunk 0.245379 ms
[2024-04-19 23:06:02.636] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.210432 ms
[2024-04-19 23:06:02.738] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.226921 ms
[2024-04-19 23:06:02.738] [debug] Auto batchsize cuda:0: 448, time per chunk 0.210432 ms
[2024-04-19 23:06:02.845] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.181670 ms
[2024-04-19 23:06:02.948] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.199572 ms
[2024-04-19 23:06:02.948] [debug] Auto batchsize cuda:0: 512, time per chunk 0.181670 ms
[2024-04-19 23:06:03.061] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.167124 ms
[2024-04-19 23:06:03.165] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.179801 ms
[2024-04-19 23:06:03.165] [debug] Auto batchsize cuda:0: 576, time per chunk 0.167124 ms
[2024-04-19 23:06:03.289] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.160822 ms
[2024-04-19 23:06:03.395] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.165170 ms
[2024-04-19 23:06:03.395] [debug] Auto batchsize cuda:0: 640, time per chunk 0.160822 ms
[2024-04-19 23:06:03.534] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.165460 ms
[2024-04-19 23:06:03.654] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.168556 ms
[2024-04-19 23:06:03.654] [debug] Auto batchsize cuda:0: 704, time per chunk 0.165460 ms
[2024-04-19 23:06:03.801] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.159325 ms
[2024-04-19 23:06:03.928] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.163988 ms
[2024-04-19 23:06:03.928] [debug] Auto batchsize cuda:0: 768, time per chunk 0.159325 ms
[2024-04-19 23:06:04.093] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.167220 ms
[2024-04-19 23:06:04.236] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.171311 ms
[2024-04-19 23:06:04.236] [debug] Auto batchsize cuda:0: 832, time per chunk 0.167220 ms
[2024-04-19 23:06:04.242] [debug] Largest batch size for cuda:0: 768, time per chunk 0.159325 ms
[2024-04-19 23:06:04.242] [info] cuda:0 using chunk size 9996, batch size 384
[2024-04-19 23:06:04.242] [debug] cuda:0 Model memory 1.80GB
[2024-04-19 23:06:04.242] [debug] cuda:0 Decode memory 0.74GB
[2024-04-19 23:06:04.793] [info] cuda:0 using chunk size 4998, batch size 768
[2024-04-19 23:06:04.793] [debug] cuda:0 Model memory 1.80GB
[2024-04-19 23:06:04.793] [debug] cuda:0 Decode memory 0.74GB
[2024-04-19 23:06:05.230] [debug] - adjusted chunk size to match model stride: 10000 -> 9996
[2024-04-19 23:06:05.242] [trace] > Index parameters input by user: kmer size=15 and window size=10.
[2024-04-19 23:06:05.242] [trace] > Index parameters input by user: batch size=16000000000 and mini batch size=16000000000.
[2024-04-19 23:06:05.242] [trace] > Map parameters input by user: bandwidth=500 and bandwidth long=20000.
[2024-04-19 23:06:05.242] [trace] > Map parameters input by user: don't print secondary=false and best n secondary=5.
[2024-04-19 23:06:05.242] [trace] > Map parameters input by user: soft clipping=false and secondary seq=false.
[2024-04-19 23:06:05.242] [debug] > Map parameters input by user: dbg print qname=false and aln seq=false.
[2024-04-19 23:06:05.243] [error] finalise() not called on a HtsFile.

QGouil commented 2 months ago

It could be because the model "./dna_r10.4.1_e8.2_400bps_hac@v4.3.0" is not in your current directory. Either give the full path to the model, or run dorado download --model dna_r10.4.1_e8.2_400bps_hac@v4.3.0 first.

jacobscgc commented 2 months ago

I tried providing the absolute path to the model as suggested, but that does not resolve the issue. At the top of the log below I printed an ls of the pod5 directory and of the model folder, which shows that those two paths are correct. I still get the same error. Any other ideas? Or should I try it on my Linux device to see whether it has something to do with running in WSL?

The LOG:

ASD381_pass_e15082f1_a951eb1a_0.pod5

0.conv.bias.tensor 2.conv.bias.tensor 4.rnn.weight_hh_l0.tensor 5.rnn.weight_hh_l0.tensor 6.rnn.weight_hh_l0.tensor 7.rnn.weight_hh_l0.tensor 8.rnn.weight_hh_l0.tensor
0.conv.weight.tensor 2.conv.weight.tensor 4.rnn.weight_ih_l0.tensor 5.rnn.weight_ih_l0.tensor 6.rnn.weight_ih_l0.tensor 7.rnn.weight_ih_l0.tensor 8.rnn.weight_ih_l0.tensor
1.conv.bias.tensor 4.rnn.bias_hh_l0.tensor 5.rnn.bias_hh_l0.tensor 6.rnn.bias_hh_l0.tensor 7.rnn.bias_hh_l0.tensor 8.rnn.bias_hh_l0.tensor 9.linear.weight.tensor
1.conv.weight.tensor 4.rnn.bias_ih_l0.tensor 5.rnn.bias_ih_l0.tensor 6.rnn.bias_ih_l0.tensor 7.rnn.bias_ih_l0.tensor 8.rnn.bias_ih_l0.tensor config.toml

[2024-04-21 10:36:43.883] [info] Running: "basecaller" "-vv" "/mnt/c/Users/cgcja/PycharmProjects/SL_ZF_Nanopore/dna_r10.4.1_e8.2_400bps_hac@v4.3.0" "/mnt/c/Users/cgcja/Proton Drive/cgc.jacobs/My files//SweetLakeAnalytics/Data/Nanopore/Danio_rerio/input/Zebrafish_2/Zebrafish_2/20240219_2014_MN25856_ASD381_e15082f1/pod5_pass/" "--reference" "/mnt/c/Users/cgcja/Proton Drive/cgc.jacobs/My files//SweetLakeAnalytics/Data/Genomes/Danio_rerio/ncbi_dataset/GCF_000002035.6/GCF_000002035.6_GRCz11_genomic.fna"
[2024-04-21 10:36:43.888] [trace] Model option: '/mnt/c/Users/cgcja/PycharmProjects/SL_ZF_Nanopore/dna_r10.4.1_e8.2_400bps_hac@v4.3.0' unknown - assuming path
[2024-04-21 10:36:43.889] [info] > Creating basecall pipeline
[2024-04-21 10:36:48.610] [debug] cuda:0 memory available: 3.92GB
[2024-04-21 10:36:48.610] [debug] cuda:0 memory limit 2.92GB
[2024-04-21 10:36:48.610] [debug] cuda:0 maximum safe estimated batch size at chunk size 9996 is 384
[2024-04-21 10:36:48.610] [debug] cuda:0 maximum safe estimated batch size at chunk size 4998 is 832
[2024-04-21 10:36:48.610] [debug] Auto batchsize cuda:0: testing up to 832 in steps of 64
[2024-04-21 10:36:48.719] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 1.563964 ms
[2024-04-21 10:36:48.810] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 1.423597 ms
[2024-04-21 10:36:48.810] [debug] Auto batchsize cuda:0: 64, time per chunk 1.423597 ms
[2024-04-21 10:36:48.902] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.682872 ms
[2024-04-21 10:36:48.993] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.709136 ms
[2024-04-21 10:36:48.993] [debug] Auto batchsize cuda:0: 128, time per chunk 0.682872 ms
[2024-04-21 10:36:49.088] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.457627 ms
[2024-04-21 10:36:49.181] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.482619 ms
[2024-04-21 10:36:49.181] [debug] Auto batchsize cuda:0: 192, time per chunk 0.457627 ms
[2024-04-21 10:36:49.276] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.342576 ms
[2024-04-21 10:36:49.370] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.363684 ms
[2024-04-21 10:36:49.370] [debug] Auto batchsize cuda:0: 256, time per chunk 0.342576 ms
[2024-04-21 10:36:49.467] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.277670 ms
[2024-04-21 10:36:49.561] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.293277 ms
[2024-04-21 10:36:49.561] [debug] Auto batchsize cuda:0: 320, time per chunk 0.277670 ms
[2024-04-21 10:36:49.659] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.229301 ms
[2024-04-21 10:36:49.755] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.248947 ms
[2024-04-21 10:36:49.755] [debug] Auto batchsize cuda:0: 384, time per chunk 0.229301 ms
[2024-04-21 10:36:49.855] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.196530 ms
[2024-04-21 10:36:49.951] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.212210 ms
[2024-04-21 10:36:49.951] [debug] Auto batchsize cuda:0: 448, time per chunk 0.196530 ms
[2024-04-21 10:36:50.052] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.170618 ms
[2024-04-21 10:36:50.151] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.191194 ms
[2024-04-21 10:36:50.151] [debug] Auto batchsize cuda:0: 512, time per chunk 0.170618 ms
[2024-04-21 10:36:50.263] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.165073 ms
[2024-04-21 10:36:50.364] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.173806 ms
[2024-04-21 10:36:50.364] [debug] Auto batchsize cuda:0: 576, time per chunk 0.165073 ms
[2024-04-21 10:36:50.487] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.160773 ms
[2024-04-21 10:36:50.594] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.166267 ms
[2024-04-21 10:36:50.594] [debug] Auto batchsize cuda:0: 640, time per chunk 0.160773 ms
[2024-04-21 10:36:50.734] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.166291 ms
[2024-04-21 10:36:50.853] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.168383 ms
[2024-04-21 10:36:50.853] [debug] Auto batchsize cuda:0: 704, time per chunk 0.166291 ms
[2024-04-21 10:36:50.999] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.160075 ms
[2024-04-21 10:36:51.126] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.165259 ms
[2024-04-21 10:36:51.126] [debug] Auto batchsize cuda:0: 768, time per chunk 0.160075 ms
[2024-04-21 10:36:51.291] [trace] Auto batchsize cuda:0: iteration:0, ms/chunk 0.166962 ms
[2024-04-21 10:36:51.433] [trace] Auto batchsize cuda:0: iteration:1, ms/chunk 0.170633 ms
[2024-04-21 10:36:51.433] [debug] Auto batchsize cuda:0: 832, time per chunk 0.166962 ms
[2024-04-21 10:36:51.438] [debug] Largest batch size for cuda:0: 768, time per chunk 0.160075 ms
[2024-04-21 10:36:51.438] [info] cuda:0 using chunk size 9996, batch size 384
[2024-04-21 10:36:51.438] [debug] cuda:0 Model memory 1.80GB
[2024-04-21 10:36:51.438] [debug] cuda:0 Decode memory 0.74GB
[2024-04-21 10:36:51.946] [info] cuda:0 using chunk size 4998, batch size 768
[2024-04-21 10:36:51.946] [debug] cuda:0 Model memory 1.80GB
[2024-04-21 10:36:51.946] [debug] cuda:0 Decode memory 0.74GB
[2024-04-21 10:36:52.407] [debug] - adjusted chunk size to match model stride: 10000 -> 9996
[2024-04-21 10:36:52.421] [trace] > Index parameters input by user: kmer size=15 and window size=10.
[2024-04-21 10:36:52.421] [trace] > Index parameters input by user: batch size=16000000000 and mini batch size=16000000000.
[2024-04-21 10:36:52.421] [trace] > Map parameters input by user: bandwidth=500 and bandwidth long=20000.
[2024-04-21 10:36:52.421] [trace] > Map parameters input by user: don't print secondary=false and best n secondary=5.
[2024-04-21 10:36:52.421] [trace] > Map parameters input by user: soft clipping=false and secondary seq=false.
[2024-04-21 10:36:52.421] [debug] > Map parameters input by user: dbg print qname=false and aln seq=false.
[2024-04-21 10:36:52.423] [error] finalise() not called on a HtsFile.

vellamike commented 2 months ago

Can you check if this happens if you don't provide a reference please?
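
One way to run that comparison is to build the same command twice, once with the reference and once without, so that nothing else changes between the two runs. The snippet below is only a sketch: the argument order is copied from the "Running:" line in the logs above, the function name build_basecaller_args and the placeholder paths are made up for illustration, and it only assembles the argument list without launching dorado.

```python
from typing import List, Optional


def build_basecaller_args(
    model: str, pod5_dir: str, reference: Optional[str] = None
) -> List[str]:
    """Assemble a dorado basecaller argument list (hypothetical helper).

    The order mirrors the "Running:" log line from this thread:
    basecaller -vv <model> <pod5 dir> [--reference <fasta>]
    """
    args = ["dorado", "basecaller", "-vv", model, pod5_dir]
    if reference is not None:
        # Alignment is optional; leaving it out isolates reference problems.
        args += ["--reference", reference]
    return args


# Placeholder paths, not the ones from the issue.
with_ref = build_basecaller_args("model_dir", "pod5_pass/", "genome.fna")
without_ref = build_basecaller_args("model_dir", "pod5_pass/")

print(with_ref)
print(without_ref)
```

If the run without the reference succeeds and the run with it fails, the reference argument is the first thing to inspect.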

jacobscgc commented 2 months ago

Damn. I tried what you suggested @vellamike and indeed it worked. It turned out to be an error in the file path to the genome. Stupid of me, and sorry for the bother.

I checked all file paths but that one apparently, sorry about that.
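
For anyone hitting the same error: a small pre-flight check over every path passed to the command would have caught this before the pipeline started. The following is a minimal sketch, not part of dorado; the helper name find_missing_inputs and the demo file names are hypothetical, and the demo uses a temporary directory instead of real data.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def find_missing_inputs(inputs: Dict[str, str]) -> List[str]:
    """Return one message per labelled input whose path does not exist."""
    return [
        f"{label}: {raw!r} does not exist"
        for label, raw in inputs.items()
        if not Path(raw).exists()
    ]


# Demo with throwaway paths: the model and pod5 directories exist,
# the reference file is deliberately never created.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "model"
    pod5_dir = Path(tmp) / "pod5_pass"
    model_dir.mkdir()
    pod5_dir.mkdir()
    reference = Path(tmp) / "genome.fna"

    demo_problems = find_missing_inputs(
        {
            "model": str(model_dir),
            "pod5": str(pod5_dir),
            "reference": str(reference),
        }
    )

print(demo_problems)
```

Checking all three inputs in one pass avoids the trap of verifying the model and pod5 paths by hand while overlooking the reference.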