nanoporetech / vbz_compression

VBZ compression plugin for nanopore signal data
https://nanoporetech.com/
Mozilla Public License 2.0

Bulk fast5 recompression on MacOS #14

Open pre-mRNA opened 3 years ago

pre-mRNA commented 3 years ago

Hello,

I am having several issues working with VBZ-compressed bulk fast5 files on MacOS.

  1. Installing the vbz plugin on MacOS is extremely poorly documented.

  2. The ont_fast5_api is advertised as a tool to manipulate fast5s, but it doesn't work for recompressing bulk fast5 files. This isn't stated anywhere in the documentation, which adds to the confusion.

  3. My specific issue, after managing to install the vbz_compression plugin:

I am working with bulk fast5 files on MacOS, and trying to recompress from VBZ to GZIP so I can analyse the files using my own pipelines.

The command I am using is:

h5repack -f GZIP=1 input.fast5 output.fast5

This returns the following warning:

Warning: dataset <#dataset> cannot be read, user defined filter is not available

Thus, I am unable to recompress the files.

Any help would be appreciated - thanks in advance.

0x55555555 commented 3 years ago

Hello @pre-mRNA ,

Have you installed the osx pkg installer: https://github.com/nanoporetech/vbz_compression/releases/download/v1.0.1/ont-vbz-hdf-plugin-1.0.1-Darwin.pkg ?

Does the HDFView tool open a standard multifast5 file with the above installed?
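
One more thing that may be worth checking: the "user defined filter is not available" warning generally means the HDF5 library could not locate a plugin for the dataset's filter. HDF5 searches the directories listed in the HDF5_PLUGIN_PATH environment variable, so a rough sketch of rerunning the same h5repack command with that variable set could look like the following. The plugin directory below is an assumption, not something confirmed in this thread, and the helper names are hypothetical; substitute wherever the plugin library actually landed on your machine.

```python
import os
import shutil
import subprocess

# ASSUMPTION: the directory holding the VBZ plugin library. This path is a
# guess for illustration; point it at the real install location.
ASSUMED_PLUGIN_DIR = "/usr/local/hdf5/lib/plugin"


def build_repack_command(src, dst, gzip_level=1):
    """Return the h5repack argument list from the original report."""
    return ["h5repack", "-f", f"GZIP={gzip_level}", src, dst]


def plugin_env(plugin_dir=ASSUMED_PLUGIN_DIR, base_env=None):
    """Copy an environment and point HDF5_PLUGIN_PATH at plugin_dir."""
    env = dict(os.environ if base_env is None else base_env)
    env["HDF5_PLUGIN_PATH"] = plugin_dir
    return env


def repack(src, dst, plugin_dir=ASSUMED_PLUGIN_DIR):
    """Run h5repack with the plugin directory visible to the HDF5 library."""
    if shutil.which("h5repack") is None:
        raise RuntimeError("h5repack not found on PATH")
    subprocess.run(
        build_repack_command(src, dst),
        env=plugin_env(plugin_dir),
        check=True,  # raise if h5repack exits with a non-zero status
    )
```

Calling repack("input.fast5", "output.fast5") would then run the command from the report with the plugin path set only for that one process, leaving the shell environment untouched. Whether this resolves the warning depends on the plugin having been built against a compatible HDF5 version.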

fbrennen commented 3 years ago

Hi @pre-mRNA -- bulk fast5 files are primarily intended for developers and for internal use. They aren't covered in ont-fast5-api for precisely that reason, and in a perfect world we would not give bulk files the .fast5 extension at all. I'm sorry this isn't clear from ont-fast5-api's documentation.