mateidavid / fast5

A C++ header-only library for reading Oxford Nanopore Fast5 files
MIT License

Bus Error #17

Open hasindu2008 opened 6 years ago

hasindu2008 commented 6 years ago

When I run the tool f5ls on a fast5 file, I get the following error:

    ...
    eventdetection/events/size=12393 (mean=92.6346, stdv=2.68832, start=28195744, length=7)
    basecall(0)/group_list=1D_000
    basecall(0)/seq_size=6997
    Bus error

Similar behaviour is observed with f5-full and with any other tool that uses the fast5 library (such as nanopolish) when accessing events in fast5 files. The system on which I get the error is as follows:

    Processor: ARMv7 Processor rev 3
    Operating system: Ubuntu 16.04.3 LTS

The gdb output and backtrace are attached: gdb_out.txt

The bus error seems to originate inside the HDF functions. However, I do not think this is a bug in the HDF library, because the h5dump tool provided by HDF reads the same file and produces the following output without any issues: h5dump.txt

Can you shed some light on how to fix this issue?

hasindu2008 commented 6 years ago

After 2 days of hectic debugging, I think I found the issue. The data type of the Events dataset, as given by h5dump, is as follows:

         GROUP "BaseCalled_template" {
            DATASET "Events" {
               DATATYPE  H5T_COMPOUND {
                  H5T_IEEE_F64LE "mean";
                  H5T_IEEE_F64LE "start";
                  H5T_IEEE_F64LE "stdv";
                  H5T_IEEE_F64LE "length";
                  H5T_STRING {
                     STRSIZE 5;
                     STRPAD H5T_STR_NULLPAD;
                     CSET H5T_CSET_ASCII;
                     CTYPE H5T_C_S1;
                  } "model_state";
                  H5T_STD_I64LE "move";
                  H5T_IEEE_F32LE "weights";
                  H5T_IEEE_F32LE "p_model_state";

For p_model_state the data type in the file is H5T_IEEE_F32LE. However, at fast5.hpp:393 the corresponding struct member is declared as double rather than float:

    struct Basecall_Event
    {
        double mean;
        double stdv;
        double start;
        double length;
        double p_model_state;
        long long move;

The bus error disappeared after I changed the declaration of p_model_state from double to float.
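To make the mismatch concrete, here is a minimal sketch that compares the width of the in-memory member with the 4-byte width of H5T_IEEE_F32LE. The two structs are hypothetical stand-ins that copy only the fields quoted above; they are not the real declarations in fast5.hpp. The helper name p_model_state_matches_file and the constant file_p_model_state_size are likewise made up for illustration. The sketch only shows that the widths differ. It does not establish that this difference by itself is what triggers the bus error on ARMv7.

```cpp
#include <cstddef>

// Hypothetical stand-in: the fields quoted above, with p_model_state as double.
struct Event_With_Double
{
    double mean;
    double stdv;
    double start;
    double length;
    double p_model_state;
    long long move;
};

// Hypothetical stand-in: the same fields with p_model_state changed to float.
struct Event_With_Float
{
    double mean;
    double stdv;
    double start;
    double length;
    float p_model_state;
    long long move;
};

// H5T_IEEE_F32LE is a 32-bit IEEE float, so it occupies 4 bytes in the file.
constexpr std::size_t file_p_model_state_size = 4;

// True when the in-memory member has the same width as the on-disk field.
template <typename Event>
constexpr bool p_model_state_matches_file()
{
    return sizeof(Event::p_model_state) == file_p_model_state_size;
}
```

With these definitions, p_model_state_matches_file<Event_With_Double>() is false and p_model_state_matches_file<Event_With_Float>() is true on platforms where double is 8 bytes and float is 4 bytes. A check like this could be placed in a static_assert so that a width mismatch is caught at compile time instead of at run time.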

jts commented 6 years ago

Hi @hasindu2008,

Thanks for looking into this. I don't have access to an ARM development machine so I have limited ability to help.

This issue may have been caused by changes to ONT's fast5 file format (the internal data types have changed a few times in the past). Can you let me know which dataset caused the problem?

Thanks, Jared

hasindu2008 commented 6 years ago

Hi @jts

The dataset I used was downloaded from https://github.com/nanopore-wgs-consortium/NA12878/blob/master/Genome.md, under the FAST5 (Signal Level files) section. I am attaching one of those fast5 files here for your convenience. After I changed the data type of p_model_state from double to float, everything works perfectly.

test.zip