nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

Invalid signature in file #362

Closed by LauraSkak 11 months ago

LauraSkak commented 1 year ago

I'm working on a set of fast5 files that I would like to basecall.

So far I've converted the fast5 files one by one to pod5 files using the pod5 software. If there is a clever way to merge all the fast5 files, so I only have to do one conversion, let me know. I took out a subset of 10 pod5 files, each containing 4000 reads, and ran the following:

dorado basecaller dna_r10.4.1_e8.2_400bps_hac@v4.1.0 test_pod5s/ > test_calls.bam

This worked and gave me no errors. When I run the same command with all the pod5 files, I get

[2023-09-06 14:37:47.195] [error] Failed to open file /PAM26135_pass_35aa25e5_b919f884_1225.pod5: IOError: Invalid signature in file

and

[2023-09-06 14:49:41.135] [error] Failed to open file /PAM26135_pass_35aa25e5_b919f884_1393.pod5: Invalid: null file passed to C API

What do these mean? How do I make it run?

HalfPhoton commented 1 year ago

Hi @LauraSkak ,

How did you generate the pod5 files which are causing issues?

You shouldn't need to merge the fast5s into one file for conversion to pod5. You can do one conversion for a directory of fast5 files which will create one monolithic pod5 file:

pod5 convert fast5 /directory/of/fast5s/ --output converted.pod5

You can also use the --recursive argument to search all sub-directories. More examples are in the documentation.

Best regards, Rich

LauraSkak commented 12 months ago

Hi Rich,

All fast5 files were converted with the following command:

pod5 convert fast5 {infile} -o {outfile}

so I'm guessing that is exactly what you wrote.

Thank you so much for the fast response!

HalfPhoton commented 11 months ago

Hi @LauraSkak ,

Just to clarify a point here. You asked:

If there is a clever way to merge all the fast5 files, so I only have to do one conversion, let me know

You can convert multiple fast5 files in a directory by passing the directory to pod5 convert fast5 to create one output.

pod5 convert fast5 {directory_of_fast5s} --recursive -o {merged_output.pod5}

Your issue could be a corrupt / unusual fast5 file.

Which version of the pod5 Python package are you using? You can check this with pod5 --version. If it's not 0.2.4, please upgrade if possible by using pip install -U pod5

Can you try the following to resolve the issue:

If there are issues in conversion, please exclude the fast5 files and re-try. If possible, we can arrange for you to send us any problematic fast5 files for us to inspect and resolve.

Kind regards, Rich

LauraSkak commented 11 months ago

My pod5 files were corrupted :) I reran the conversion and then it didn't complain :)

HalfPhoton commented 11 months ago

Thanks for the update @LauraSkak - closing as resolved
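
For anyone hitting the same two errors, a rough way to find the affected files before basecalling is to look for empty pod5 files or files whose signature bytes are missing. The sketch below is not part of the pod5 or dorado tooling. It assumes that a valid pod5 file begins and ends with the 8-byte signature given in the POD5 format description; the helper names check_pod5 and find_suspect_pod5s are made up for illustration. Passing this check does not prove a file is intact, it only catches empty and obviously truncated files.

```python
from pathlib import Path

# Assumption: 8-byte signature that the POD5 format description places at
# both the start and the end of a valid file.
POD5_SIGNATURE = b"\x8bPOD\r\n\x1a\n"


def check_pod5(path: Path) -> str:
    """Return "ok", "empty", or "bad signature" for a single file."""
    size = path.stat().st_size
    if size == 0:
        return "empty"
    if size < 2 * len(POD5_SIGNATURE):
        # Too short to hold both the leading and trailing signature.
        return "bad signature"
    with path.open("rb") as handle:
        head = handle.read(len(POD5_SIGNATURE))
        handle.seek(-len(POD5_SIGNATURE), 2)  # 2 = seek relative to end of file
        tail = handle.read(len(POD5_SIGNATURE))
    if head != POD5_SIGNATURE or tail != POD5_SIGNATURE:
        return "bad signature"
    return "ok"


def find_suspect_pod5s(directory: Path) -> dict:
    """Map file name -> problem for every *.pod5 under directory that fails."""
    suspects = {}
    for path in sorted(directory.rglob("*.pod5")):
        status = check_pod5(path)
        if status != "ok":
            suspects[path.name] = status
    return suspects


if __name__ == "__main__":
    import sys

    # Usage: python check_pod5s.py <directory of pod5 files>
    for name, problem in find_suspect_pod5s(Path(sys.argv[1])).items():
        print(f"{name}\t{problem}")
```

Any file it reports can then be regenerated from its source fast5 with pod5 convert fast5, as was done in this thread.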