jts / nanopolish

Signal-level algorithms for MinION data
MIT License

Bad fast5 error with nanopolish eventalign #805

Open balberti opened 4 years ago

balberti commented 4 years ago

Hello,

I'm running nanopolish index and eventalign on a pUC19 dataset downloaded from SRA (accession SRR5219626). I downloaded the fast5 files and used a fast5-to-fastq converter (https://github.com/rrwick/Fast5-to-Fastq) to get the corresponding fastq files, since I cannot run guppy myself. The reference plasmid comes from New England Biolabs.

I ran nanopolish index -d gdna_fast5_files/ gdna.fastq (without the sequencing summary file, because I don't have it) and got the following warnings: [image: nanopolish_index]. However, the index, index.fai, index.gzi, and index.readdb files are created and are not empty.
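One way to check whether indexing actually linked reads to signal files is to inspect the readdb file. The sketch below assumes the readdb is a tab-separated text file with one read per line (read name, then fast5 path), which is how recent nanopolish versions appear to write it; the function name `summarize_readdb` is hypothetical, and the layout should be confirmed against your own file before relying on the counts.

```python
from pathlib import Path


def summarize_readdb(readdb_path):
    """Count reads with and without a fast5 path in a nanopolish readdb file.

    Assumption: each line is "<read_name> TAB <fast5_path>". A missing or
    empty second column is treated as "not linked to any fast5 file".
    """
    linked = 0
    unlinked = []
    for line in Path(readdb_path).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        read_name = fields[0]
        fast5_path = fields[1].strip() if len(fields) > 1 else ""
        if fast5_path:
            linked += 1
        else:
            unlinked.append(read_name)
    return linked, unlinked
```

Calling `summarize_readdb` on the readdb produced by the index step returns the number of linked reads and the names of any reads without a fast5 path. If most reads are unlinked, the read names in the fastq probably do not match the read IDs stored in the fast5 files.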

Then, when I run eventalign, I get this error: [image: eventalign]. (The BAM file was generated with minimap2 and samtools by mapping gdna.fastq to the reference plasmid.)
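For readers who want to reproduce the mapping and eventalign steps, here is a minimal sketch that prints the usual command sequence. The reference file name `pUC19.fasta`, the BAM name, and the output name `eventalign.tsv` are assumptions for illustration; the exact commands used in this report are not shown in the thread.

```python
import shlex


def eventalign_commands(reads, reference, bam="reads.sorted.bam"):
    """Return shell commands for mapping, BAM indexing and eventalign.

    All file names are placeholders chosen for this example.
    """
    q = shlex.quote
    return [
        # Map reads with the nanopore preset and sort the alignments.
        f"minimap2 -ax map-ont {q(reference)} {q(reads)} "
        f"| samtools sort -o {q(bam)} -",
        # eventalign needs an indexed, coordinate-sorted BAM.
        f"samtools index {q(bam)}",
        # Align raw signal events to the reference.
        f"nanopolish eventalign --reads {q(reads)} --bam {q(bam)} "
        f"--genome {q(reference)} > eventalign.tsv",
    ]


if __name__ == "__main__":
    for cmd in eventalign_commands("gdna.fastq", "pUC19.fasta"):
        print(cmd)
```

The reads file passed to `--reads` must be the same file that was given to nanopolish index, since eventalign looks up the index files next to it.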

Does this error occur because I'm not using a basecaller on the fast5 files? Is the sequencing summary file mandatory for indexing? Is it possible to recreate this file manually, or to get around this error?

Thank you in advance for your time.

Best regards, Baptiste Alberti.

jts commented 4 years ago

Hi,

I think that data is very old so the format is probably no longer supported by nanopolish. If it is R9 data, you may try ONT's tools to reformat it (single-to-multi). If it is R7 data you will have to try to use an older version of nanopolish.
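The single-to-multi conversion mentioned here is presumably the `single_to_multi_fast5` command from ONT's ont_fast5_api package. A minimal sketch of calling it from Python, assuming that package is installed and on the PATH; the helper names and the directory names in the usage note are hypothetical.

```python
import subprocess


def single_to_multi_command(input_dir, output_dir, batch_size=4000):
    """Build the argument list for ont_fast5_api's single_to_multi_fast5."""
    return [
        "single_to_multi_fast5",
        "--input_path", str(input_dir),    # folder of single-read fast5 files
        "--save_path", str(output_dir),    # folder for multi-read output
        "--batch_size", str(batch_size),   # reads per multi-read file
        "--recursive",                     # also search subfolders
    ]


def convert(input_dir, output_dir):
    """Run the conversion; raises CalledProcessError on a non-zero exit."""
    subprocess.run(single_to_multi_command(input_dir, output_dir), check=True)
```

For example, `convert("gdna_fast5_files/", "gdna_multi_fast5/")` would write multi-read files to the second folder, which can then be passed to nanopolish index with `-d`.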

Jared

balberti commented 4 years ago

Thank you for your quick answer. I will try to reformat it as you suggest, and I'll let you know if the error disappears after that.

Best, Baptiste.

balberti commented 4 years ago

Hello Jared,

I tried both suggestions, but sadly the error is still there, even with multi-read fast5 files. I also ran the index and eventalign commands with nanopolish versions from the current one back to v0.5, but nothing worked. [image: eventalign_attempt2]

Is the issue coming from the dataset itself or from the fact that the fastq I have was not created by guppy?

Thanks and have a nice day, Baptiste.