replikation / poreCov

SARS-CoV-2 workflow for nanopore sequence data
https://case-group.github.io/
GNU General Public License v3.0

Add vbz compression plugin to nanozoo/artic container #137

Closed iferres closed 3 years ago

iferres commented 3 years ago

Hi, me again :sweat_smile:

The latest version of GridION (21.05.8) includes the following change:

> As pre-notified earlier this year (see here), VBZ compression for .fast5 and gzip for FASTQ are now the default settings

So to read fast5 files, nanopolish needs the vbz plugin, which is not installed in the current artic container. This leads to the following error in the artic process:

HDF5-DIAG: Error detected in HDF5 (1.8.14) thread 140111133751040:
  #000: H5Dio.c line 173 in H5Dread(): can't read data
    major: Dataset
    minor: Read failed
  #001: H5Dio.c line 550 in H5D__read(): can't read data
    major: Dataset
    minor: Read failed
  #002: H5Dchunk.c line 1872 in H5D__chunk_read(): unable to read raw data chunk
    major: Low-level I/O
    minor: Read failed
  #003: H5Dchunk.c line 2902 in H5D__chunk_lock(): data pipeline read failed
    major: Data filters
    minor: Filter operation failed
  #004: H5Z.c line 1357 in H5Z_pipeline(): required filter 'vbz' is not registered
    major: Data filters
    minor: Read failed
  #005: H5PL.c line 298 in H5PL_load(): search in paths failed
    major: Plugin for dynamically loaded library
    minor: Can't get value
  #006: H5PL.c line 402 in H5PL__find(): can't open directory
    major: Plugin for dynamically loaded library
    minor: Can't open directory or file
The fast5 file is compressed with VBZ but the required plugin is not loaded. Please read the instructions here: https://github.com/nanoporetech/vbz_compression/issues/5
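Before patching the container, it can help to confirm whether HDF5 can actually see the plugin. A minimal check (the default plugin directory is an assumption about the build; HDF5 consults `HDF5_PLUGIN_PATH` first):

```sh
# Is the vbz plugin on HDF5's plugin search path?
# If HDF5_PLUGIN_PATH is unset, fall back to a common default location.
PLUGIN_DIR="${HDF5_PLUGIN_PATH:-/usr/local/hdf5/lib/plugin}"
if ls "$PLUGIN_DIR"/libvbz_hdf_plugin*.so >/dev/null 2>&1; then
    echo "vbz plugin found in $PLUGIN_DIR"
else
    echo "vbz plugin NOT found in $PLUGIN_DIR"
fi
```

If this prints "NOT found", nanopolish will hit exactly the `required filter 'vbz' is not registered` error above.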

To solve this, you have to download the plugin and point an environment variable at it. I mostly use Singularity, and I solved it by adding the following lines to the definition file:

%post
(...)
   wget -O /opt/ont-vbz-hdf-plugin-1.0.1-Linux-x86_64.tar.gz https://github.com/nanoporetech/vbz_compression/releases/download/v1.0.1/ont-vbz-hdf-plugin-1.0.1-Linux-x86_64.tar.gz
   tar -xzf /opt/ont-vbz-hdf-plugin-1.0.1-Linux-x86_64.tar.gz -C /opt/

%environment
(...)
   export HDF5_PLUGIN_PATH=/opt/ont-vbz-hdf-plugin-1.0.1-Linux/usr/local/hdf5/lib/plugin/
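The same patch can be parameterised by version, which makes it easier to bump when a newer plugin release comes out. A sketch (nothing is downloaded here; the URL and directory layout match the v1.0.1 release used above):

```sh
# Build the download URL and plugin path for a given vbz release.
VBZ_VERSION=1.0.1
URL="https://github.com/nanoporetech/vbz_compression/releases/download/v${VBZ_VERSION}/ont-vbz-hdf-plugin-${VBZ_VERSION}-Linux-x86_64.tar.gz"
PLUGIN_PATH="/opt/ont-vbz-hdf-plugin-${VBZ_VERSION}-Linux/usr/local/hdf5/lib/plugin/"
echo "download: $URL"
echo "export HDF5_PLUGIN_PATH=$PLUGIN_PATH"
```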

...but it may depend on the base image you use to build the containers. I suppose future releases will include the plugin by default, but for now we can patch it with this change.

By the way, we got it running with a local Singularity container for the artic-labeled processes, and it makes it to the end, where it fails to generate the summary report:

[c3/6214a6] process > create_summary_report_wf:summary_report_default (1) [100%] 1 of 1, failed: 1 ✔                   
[c3/6214a6] NOTE: Process `create_summary_report_wf:summary_report_default (1)` terminated with an error exit status (1) -- Error is ignored

...which I think is because the script tries to parse a version from the path to my local singularity image, which of course doesn't contain one (my config file has the following line: `withLabel: artic { container = '/mnt/ubi/iferres/singularity_images/artic_vbz-plugin.sif' }`). Am I right?
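The failure mode I suspect can be illustrated with a small sketch (the tag format and the parsing behaviour are my assumptions, not poreCov's actual code): extracting a version from the text after the last `:` works for a registry image, but a local `.sif` path has no tag to parse.

```sh
# Hypothetical version extraction from a container string.
for container in 'nanozoo/artic:1.1.0--deadbeef' '/mnt/images/artic_vbz-plugin.sif'; do
    case "$container" in
        *:*) echo "version: ${container##*:}" ;;   # registry image with a tag
        *)   echo "no tag in '$container'" ;;      # local path: nothing to parse
    esac
done
```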

replikation commented 3 years ago

@iferres it's taking a bit longer, I'm having some installation issues with artic (conda dependency resolution got stuck).

iferres commented 3 years ago

No problem!