nanoporetech / vbz_compression

VBZ compression plugin for nanopore signal data
https://nanoporetech.com/
Mozilla Public License 2.0

Nanopolish error while calling methylation using VBZ-compressed fast5 #5

Closed BigNianNGS closed 4 years ago

BigNianNGS commented 5 years ago

Dear professors:

After compressing the raw fast5 files with ont_fast5_api, I tried to use the VBZ-compressed fast5 to call methylation with the Nanopolish software. The error below occurs:

HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 47866940143360:
  #000: H5Dio.c line 173 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: H5Dio.c line 550 in H5D__read(): can't read data
    major: Dataset
    minor: Read failed
  #002: H5Dchunk.c line 1872 in H5D__chunk_read(): unable to read raw data chunk
    major: Low-level I/O
    minor: Read failed
  #003: H5Dchunk.c line 2902 in H5D__chunk_lock(): data pipeline read failed
    major: Data filters
    minor: Filter operation failed
  #004: H5Z.c line 1357 in H5Z_pipeline(): required filter 'vbz' is not registered
    major: Data filters
    minor: Read failed
  #005: H5PL.c line 298 in H5PL_load(): search in paths failed
    major: Plugin for dynamically loaded library
    minor: Can't get value
  #006: H5PL.c line 402 in H5PL__find(): can't open directory
    major: Plugin for dynamically loaded library
    minor: Can't open directory or file

I guess the vbz library was not installed correctly? How can I use the VBZ-compressed fast5 to call methylation? Should I decompress it first, or is there another way to read the VBZ-compressed fast5 directly?

Best~ dale wong

0x55555555 commented 4 years ago

Hi Dale,

Can I ask what platform/operating system you are running on?

We are aware of some issues on Ubuntu Bionic where HDF5 doesn't load the plugin automatically.

Can you also try adding export HDF5_PLUGIN_PATH=/usr/local/hdf5/lib/plugin to your shell before running the basecaller (in the same shell)?
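For example - a minimal sketch, assuming the plugin really was installed under the default /usr/local/hdf5/lib/plugin directory and using placeholder file names:

# assumes the vbz plugin library sits in /usr/local/hdf5/lib/plugin; adjust if it was installed elsewhere
export HDF5_PLUGIN_PATH=/usr/local/hdf5/lib/plugin
# run nanopolish in the same shell so it inherits the variable
# (reads.fastq, alignments.sorted.bam and reference.fa are placeholders; assumes nanopolish index has already been run)
nanopolish call-methylation -t 8 -r reads.fastq -b alignments.sorted.bam -g reference.fa > methylation_calls.tsv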

BigNianNGS commented 4 years ago

Thanks for your reply. I tried adding the new HDF5_PLUGIN_PATH and it now works with no error messages.

My platform is Linux version 3.10.0-693.11.1.el7.x86_64.

Another question: how can I decompress the VBZ-compressed fast5? Can I get back the same raw fast5 files after decompression?

Best~

0x55555555 commented 4 years ago

You can - we are working on some tools in our fast5 API to do this for you; in the meantime you could use h5repack to remove the vbz compression, something like:

>  h5repack -f GZIP=1 input.fast5 output.fast5

should repack using gzip not vbz.
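(Here -f GZIP=1 pushes every dataset back through the standard gzip filter at compression level 1, so the repacked file can be read by tools that don't have the vbz plugin loaded.)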

0x55555555 commented 4 years ago

https://github.com/nanoporetech/ont_fast5_api has tools to repack fast5 files with or without vbz
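For reference, a rough sketch of using that tool to turn VBZ-compressed files back into gzip-compressed ones - the command name and flags below are recalled from the ont_fast5_api documentation, so please verify them with compress_fast5 --help:

# hedged sketch: repack a directory of VBZ-compressed fast5 files with gzip
# (the input/output directories are placeholders; verify the flags with compress_fast5 --help)
compress_fast5 --input_path /path/to/vbz_fast5s --save_path /path/to/gzip_fast5s --compression gzip --threads 8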

mmullistb commented 2 years ago

Hi there - I've been running into this issue as well using nanopolish polya on an M1 MacBook. I ran export HDF5_PLUGIN_PATH=/usr/local/hdf5/lib/plugin:$HDF_PLUGIN_PATH in a terminal window, followed by nanopolish polya --threads=8 --reads=/path/to/joined.fastq --bam=/path/to/joined.sorted.bam --genome=/path/to/reference.fa > /path/to/polya_results.tsv in the same window, which resulted in the following output:

The fast5 file is compressed with VBZ but the required plugin is not loaded. Please read the instructions here: https://github.com/nanoporetech/vbz_compression/issues/5

Sorry if I'm overlooking something obvious here, but I'd appreciate any input you might have on this issue. Thanks!

0x55555555 commented 2 years ago

It sounds like you may have hit an issue with the vbz plugin on M1 platforms.

Are you running Python in a Rosetta environment?

A short-term solution would be to build the VBZ plugin yourself on M1 - I will look to get M1 support for vbz rolled out in a future release.
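If building from source is the route you take, here is a rough sketch for an arm64 Mac based on the repository's README as I recall it - the exact CMake options (and any extra dependencies such as zstd) may differ, so check the README first:

# hedged sketch: build the vbz HDF5 plugin from source on an M1/arm64 Mac
git clone --recursive https://github.com/nanoporetech/vbz_compression.git
cd vbz_compression
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j
# then point HDF5_PLUGIN_PATH at the directory containing the built plugin library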

iiSeymour commented 2 years ago

@mmullistb can you try with the M1 Mac build ont-vbz-hdf-plugin-1.0.1-Darwin-arm64.tar.gz from the release page here.

mmullistb commented 2 years ago

Thanks @jorj1988 and @iiSeymour - I downloaded the M1 Mac build and added the directory containing the binary to my path: export PATH=/path/to/ont-vbz-hdf-plugin-1.0.1-Darwin/bin:$PATH, and confirmed with echo $PATH that it had been added.

I repeated the nanopolish polya command: nanopolish polya --threads=8 --reads=/path/to/joined.fastq --bam=/path/to/joined.sorted.bam --genome=/path/to/reference.fa > /path/to/polya_results.tsv

The output was the following:

Terminal output:

HDF5-DIAG: Error detected in HDF5 (1.13.0) thread 1:
  #000: H5D.c line 1021 in H5Dread(): can't synchronously read data (major: Dataset, minor: Read failed)
  #001: H5D.c line 970 in H5D__read_api_common(): can't read data (major: Dataset, minor: Read failed)
  #002: H5VLcallback.c line 2079 in H5VL_dataset_read(): dataset read failed (major: Virtual Object Layer, minor: Read failed)
  #003: H5VLcallback.c line 2046 in H5VL__dataset_read(): dataset read failed (major: Virtual Object Layer, minor: Read failed)
  #004: H5VLnative_dataset.c line 294 in H5VL__native_dataset_read(): can't read data (major: Dataset, minor: Read failed)
  #005: H5Dio.c line 262 in H5D__read(): can't read data (major: Dataset, minor: Read failed)
  #006: H5Dchunk.c line 2575 in H5D__chunk_read(): unable to read raw data chunk (major: Low-level I/O, minor: Read failed)
  #007: H5Dchunk.c line 3943 in H5D__chunk_lock(): data pipeline read failed (major: Dataset, minor: Filter operation failed)
  #008: H5Z.c line 1359 in H5Z_pipeline(): required filter 'vbz' is not registered (major: Data filters, minor: Read failed)
  #009: H5PLint.c line 257 in H5PL_load(): search in path table failed (major: Plugin for dynamically loaded library, minor: Can't get value)
  #010: H5PLpath.c line 804 in H5PL__find_plugin_in_path_table(): search in path /usr/local/hdf5/lib/plugin encountered an error (major: Plugin for dynamically loaded library, minor: Can't get value)
  #011: H5PLpath.c line 857 in H5PL__find_plugin_in_path(): can't open directory: /usr/local/hdf5/lib/plugin (major: Plugin for dynamically loaded library, minor: Can't open directory or file)

The fast5 file is compressed with VBZ but the required plugin is not loaded. Please read the instructions here: https://github.com/nanoporetech/vbz_compression/issues/5

HDF5-DIAG: Error detected in HDF5 (1.13.0) thread 2:
  #000: H5A.c line 1044 in H5Aread(): can't synchronously read data (major: Attribute, minor: Read failed)
  #001: H5A.c line 1008 in H5A__read_api_common(): not an attribute (major: Invalid arguments to routine, minor: Inappropriate type)

HDF5-DIAG: Error detected in HDF5 (1.13.0) thread 2:
  #000: H5A.c line 2251 in H5Aclose(): decrementing attribute ID failed (major: Attribute, minor: Unable to decrement reference count)
  #001: H5Iint.c line 1157 in H5I_dec_app_ref(): can't decrement ID ref count (major: Object ID, minor: Unable to decrement reference count)
  #002: H5Iint.c line 1109 in H5I__dec_app_ref(): can't decrement ID ref count (major: Object ID, minor: Unable to decrement reference count)
  #003: H5Iint.c line 1012 in H5I__dec_ref(): can't locate ID (major: Object ID, minor: Unable to find ID information (already closed?))

HDF5-DIAG: Error detected in HDF5 (1.13.0) thread 2:
  #000: H5G.c line 889 in H5Gclose(): decrementing group ID failed (major: Symbol table, minor: Unable to decrement reference count)
  #001: H5Iint.c line 1157 in H5I_dec_app_ref(): can't decrement ID ref count (major: Object ID, minor: Unable to decrement reference count)
  #002: H5Iint.c line 1109 in H5I__dec_app_ref(): can't decrement ID ref count (major: Object ID, minor: Unable to decrement reference count)
  #003: H5Iint.c line 1012 in H5I__dec_ref(): can't locate ID (major: Object ID, minor: Unable to find ID information (already closed?))

HDF5-DIAG: Error detected in HDF5 (1.13.0) thread 2:
  #000: H5D.c line 397 in H5Dopen2(): unable to synchronously open dataset (major: Dataset, minor: Can't open object)
  #001: H5D.c line 353 in H5D__open_api_common(): can't set object access arguments (major: Dataset, minor: Can't set value)
  #002: H5VLint.c line 2669 in H5VL_setup_acc_args(): invalid location identifier (major: Invalid arguments to routine, minor: Inappropriate type)
  #003: H5VLint.c line 1779 in H5VL_vol_object(): invalid identifier (major: Invalid arguments to routine, minor: Inappropriate type)

HDF5-DIAG: Error detected in HDF5 (1.13.0) thread 2:
  #000: H5G.c line 438 in H5Gopen2(): unable to synchronously open group (major: Symbol table, minor: Unable to create file)
  #001: H5G.c line 395 in H5G__open_api_common(): can't set object access arguments (major: Symbol table, minor: Can't set value)
  #002: H5VLint.c line 2669 in H5VL_setup_acc_args(): invalid location identifier (major: Invalid arguments to routine, minor: Inappropriate type)
  #003: H5VLint.c line 1779 in H5VL_vol_object(): invalid identifier (major: Invalid arguments to routine, minor: Inappropriate type)

HDF5-DIAG: Error detected in HDF5 (1.13.0) thread 2:
  #000: H5F.c line 1061 in H5Fclose(): decrementing file ID failed (major: File accessibility, minor: Unable to close file)
  #001: H5Iint.c line 1157 in H5I_dec_app_ref(): can't decrement ID ref count (major: Object ID, minor: Unable to decrement reference count)
  #002: H5Iint.c line 1109 in H5I__dec_app_ref(): can't decrement ID ref count (major: Object ID, minor: Unable to decrement reference count)
  #003: H5Iint.c line 1012 in H5I__dec_ref(): can't locate ID (major: Object ID, minor: Unable to find ID information (already closed?))

HDF5-DIAG: Error detected in HDF5 (1.13.0) thread 3:
  #000: H5D.c line 1021 in H5Dread(): can't synchronously read data (major: Dataset, minor: Read failed)

The line "The fast5 file is compressed with VBZ but the required plugin is not loaded. Please read the instructions here: https://github.com/nanoporetech/vbz_compression/issues/5" suggests that nanopolish is still not able to access the VBZ plugin, right?

iiSeymour commented 2 years ago

@mmullistb did you use PATH not HDF5_PLUGIN_PATH?
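To spell out the difference: PATH only controls where the shell looks for executables, while HDF5 searches HDF5_PLUGIN_PATH for filter plugins. A minimal sketch, with the extracted plugin directory shown as a placeholder:

# /path/to/ont-vbz-hdf-plugin-1.0.1-Darwin/lib is a placeholder - use whichever directory in the
# extracted tarball actually contains the plugin library (.dylib), not the bin/ directory
export HDF5_PLUGIN_PATH=/path/to/ont-vbz-hdf-plugin-1.0.1-Darwin/lib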

mmullistb commented 2 years ago

@iiSeymour haha yes, my mistake. It's working now when using HDF5_PLUGIN_PATH. Thanks!

finn-rpl commented 2 years ago

Hi there,

I'm running nanopolish eventalign on some of my data and the error code I'm getting is directing me to this thread.

The fast5 file is compressed with VBZ but the required plugin is not loaded. Please read the instructions here: https://github.com/nanoporetech/vbz_compression/issues/5

I am suspicious of this error, as eventalign runs smoothly when given the example data found in the guide.

Fixes attempted

I find it difficult to believe that my fast5 files are corrupted, as I am able to complete basecalling and mapping, and FastQC shows no obvious errors. I think the program's first error message is right and that it is failing to decompress the files; hopefully you can help me configure my installation so it can process these files properly.

Thanks,

Finnlay Lambert

0x55555555 commented 2 years ago

Hi @finn-rpl ,

What version of the vbz plugin do you have installed? What OS (and version) are you running?

Thanks,

finn-rpl commented 2 years ago

Hi George,

I'm running Ubuntu 20.04, but the issue is solved and the error was mine.

When I tried the fix detailed here, the link downloaded the Mac version of the plugin rather than directing me to the releases page.

After installing the correct version, the export HDF5_PLUGIN_PATH=/usr/local/hdf5/lib/plugin fix worked for me on the first try.

Thanks for your assistance; I only realised there were multiple distributions after you asked me which version was appropriate.

Finnlay Lambert

0x55555555 commented 2 years ago

Perfect - Thanks for updating me!

vetmohit89 commented 1 year ago

Hello, I am using the nanoseq pipeline to do m6A analysis on an HPC cluster. I installed https://github.com/nanoporetech/vbz_compression/releases/download/v1.0.1/ont-vbz-hdf-plugin-1.0.1-Linux-x86_64.tar.gz in my conda virtual env. I used the following command to run the pipeline:

nextflow run nf-core/nanoseq --input samplesheet.csv --protocol directRNA --skip_demultiplexing -profile singularity -c vbz.config -r 3.1.0 --skip_fusion_analysis --skip_differential_analysis..

My custom config file has:

process {
    withName: NANOPOLISH_INDEX_EVENTALIGN {
        container = 'https://depot.galaxyproject.org/singularity/nanopolish:0.14.0--h773013f_3'
    }
}

env {
    HDF5_PLUGIN_PATH = '/data/user/home/mbansal/.conda/envs/ont-vbz/hdf5/lib/plugin'
}

but I keep getting this error:

[readdb] indexing fast5
[readdb] num reads: 427756, num reads with path to fast5: 427756
The fast5 file is compressed with VBZ but the required plugin is not loaded. Please read the instructions here: https://github.com/nanoporetech/vbz_compression/issues/5
HDF5-DIAG: Error detected in HDF5 (1.12.2) thread 1:
  #000: H5D.c line 276 in H5Dopen2(): invalid location identifier
    major: Invalid arguments to routine
    minor: Inappropriate type
HDF5-DIAG: Error detected in HDF5 (1.12.2) thread 1:
  #000: H5G.c line 502 in H5Gopen2(): invalid location identifier
    major: Invalid arguments to routine
    minor: Inappropriate type
0x55555555 commented 1 year ago

Hi @vetmohit89 ,

Can you confirm the architecture of the system you are running on? What are the contents of the directory /data/user/home/mbansal/.conda/envs/ont-vbz/hdf5/lib/plugin?

Have you tried running the same experiment outside of your container?
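A quick way to gather that information (the plugin library name shown is what I would expect, so treat it as an assumption):

# hedged sketch of the checks above, run on the cluster outside the container
uname -m    # confirm the machine architecture matches the x86_64 plugin you downloaded
ls -l /data/user/home/mbansal/.conda/envs/ont-vbz/hdf5/lib/plugin    # should list the vbz plugin shared library (e.g. libvbz_hdf_plugin.so)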