LooseLab / readfish

CLI tool for flexible and fast adaptive sampling on ONT sequencers
https://looselab.github.io/readfish/
GNU General Public License v3.0

Failed to playback own fast5 #182

Closed: Kevinzjy closed this issue 2 years ago

Kevinzjy commented 2 years ago

Hi, I generated some bulk fast5 files during our previous sequencing runs using MinKNOW (20.10.3), but something strange happened.

  1. The bulk fast5 is generated using the "Bulk file" option in MinKNOW, and the file size (~100G) seems fine.
  2. The playback kept failing due to the error "Stopping protocol due to internal error. All data will be saved." when using my fast5 as simulation input. However, I've tried the example data from your manual and it works fine.
  3. I tried to check my fast5 using bulkVis, and bulkVis could not recognize these files as valid bulk fast5.
  4. The h5dump command shows multiple "h5dump error: unable to print data" errors when inspecting my fast5 files, but the bulk fast5 downloaded from nanopore-wgs-consortium can be dumped correctly.
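A quick first check that complements h5dump, sketched here under assumptions: fast5 files are HDF5 containers, and every HDF5 file carries a fixed 8-byte signature at the start of its superblock. The function name `find_hdf5_signature` and the `max_offset` limit are hypothetical choices for this illustration, not part of MinKNOW, bulkVis, or readfish. Finding the signature only shows that the file starts like an HDF5 file; it does not prove that the datasets inside are intact.

```python
from pathlib import Path

# The 8-byte signature that starts every HDF5 superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def find_hdf5_signature(path, max_offset=1 << 20):
    """Return the byte offset of the HDF5 signature, or None if not found.

    The HDF5 format allows the superblock at offset 0, 512, 1024, 2048, ...
    (doubling each time), so those offsets are probed in turn.
    max_offset is an arbitrary cut-off chosen for this sketch.
    """
    size = Path(path).stat().st_size
    with open(path, "rb") as handle:
        offset = 0
        while offset <= max_offset and offset + len(HDF5_SIGNATURE) <= size:
            handle.seek(offset)
            if handle.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return offset
            # Next candidate offset: 512 first, then doubling.
            offset = 512 if offset == 0 else offset * 2
    return None
```

If this returns `None` for a bulk file, the file header itself is damaged or the file is not HDF5 at all. If it returns an offset but h5dump still fails on individual datasets, the damage is more likely inside the file, for example from a truncated or interrupted write.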

So this looks like a bulk fast5 format issue. Maybe the bulk output was corrupted by MinKNOW? I'm aware that this is not a readfish issue, but I'm wondering whether you have encountered the same situation with recent versions of MinKNOW.

Thanks for your help.

Jinyang

mattloose commented 2 years ago

Hi,

There are two possible explanations.

One is that the fast5 file is corrupted. I think this is the most likely explanation, given that neither bulkVis nor h5dump will read the files.

The second possibility is that raw data were not recorded in the bulk file (only event data were). However, I suspect that this is not the case here. Most likely you have a corrupt bulk file.
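For a bulk file that does open, the "events but no raw data" case could be checked from a listing of its dataset paths. The sketch below assumes the per-channel layout commonly seen in bulk fast5 files (`Raw/Channel_N/Signal` for raw signal and `IntermediateData/Channel_N/Events` for events); these group names, and the helper names `channels_with_data` and `events_only_channels`, are assumptions for illustration and may not match every MinKNOW version. The path list itself would come from an HDF5 tool of your choice.

```python
import re

# Assumed bulk fast5 dataset paths (may differ between MinKNOW versions).
RAW_PATTERN = re.compile(r"^/?Raw/Channel_(\d+)/Signal$")
EVENTS_PATTERN = re.compile(r"^/?IntermediateData/Channel_(\d+)/Events$")


def channels_with_data(paths):
    """Return (raw_channels, event_channels) as sets of channel numbers."""
    raw, events = set(), set()
    for path in paths:
        match = RAW_PATTERN.match(path)
        if match:
            raw.add(int(match.group(1)))
            continue
        match = EVENTS_PATTERN.match(path)
        if match:
            events.add(int(match.group(1)))
    return raw, events


def events_only_channels(paths):
    """Return sorted channel numbers that have events but no raw signal."""
    raw, events = channels_with_data(paths)
    return sorted(events - raw)
```

A non-empty result from `events_only_channels` would suggest that raw signal was not recorded for those channels. An empty raw set across the whole file would point to the second explanation rather than corruption.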

To avoid this, we typically record bulk files for only a short period of time. The maximum we would record into a single file is around 4 hours.

Sorry for not being able to rescue your file.

Kevinzjy commented 2 years ago

Thanks for the prompt reply and the useful suggestion. I will try this in our next run.