a-slide / pycoQC

pycoQC computes metrics and generates interactive QC plots from the sequencing summary report generated by the Oxford Nanopore Technologies basecallers (Albacore/Guppy)
https://a-slide.github.io/pycoQC/
GNU General Public License v3.0

Fast5_to_seq_summary error #105

Closed apeltzer closed 4 years ago

apeltzer commented 4 years ago

Describe the bug

Running like this:

Fast5_to_seq_summary -f FAST5 -t 8 -s './sequencing_summary.txt' --verbose_level 2
pycoQC -f "sequencing_summary.txt"  -o pycoQC_BGLUMAE.html -j pycoQC_BGLUMAE.json

Results in:


  [WORKER_02] Start processing fast5 files
  [WORKER_03] Start processing fast5 files
  [WORKER_04] Start processing fast5 files
  [WORKER_05] Start processing fast5 files

  An error occured. All processes were killed

  Traceback (most recent call last):
    File "/opt/conda/envs/nf-core-bacass-1.1.0dev/bin/Fast5_to_seq_summary", line 12, in <module>
      sys.exit(main_Fast5_to_seq_summary())
    File "/opt/conda/envs/nf-core-bacass-1.1.0dev/lib/python3.7/site-packages/pycoQC/__main__.py", line 165, in main_Fast5_to_seq_summary
      verbose_level = args.verbose_level)
    File "/opt/conda/envs/nf-core-bacass-1.1.0dev/lib/python3.7/site-packages/pycoQC/Fast5_to_seq_summary.py", line 162, in __init__
      raise E
    File "/opt/conda/envs/nf-core-bacass-1.1.0dev/lib/python3.7/site-packages/pycoQC/Fast5_to_seq_summary.py", line 152, in __init__
      raise pycoQCError(tb)
  pycoQC.common.pycoQCError: Traceback (most recent call last):
    File "/opt/conda/envs/nf-core-bacass-1.1.0dev/lib/python3.7/site-packages/pycoQC/Fast5_to_seq_summary.py", line 206, in _read_fast5
      "raw_read" : "/Raw/Reads/{}/".format(list(h5_fp["/Raw/Reads"].keys())[0]),
    File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
    File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
    File "/opt/conda/envs/nf-core-bacass-1.1.0dev/lib/python3.7/site-packages/h5py/_hl/group.py", line 262, in __getitem__
      oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
    File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
    File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
    File "h5py/h5o.pyx", line 190, in h5py.h5o.open
  KeyError: 'Unable to open object (component not found)'

moldovannorbert commented 4 years ago

I have the same issue under Ubuntu 16.04.6 LTS, with pycoQC 2.5.0.17 installed using conda.

a-slide commented 4 years ago

Hi both. Are you using multi-fast5 as input for Fast5_to_seq_summary ?

apeltzer commented 4 years ago

I don't have the experience to comment on this properly, so I'm asking here: I have many FAST5 files in a folder. Does that automatically mean this is multi-fast5 input? The help text describes multi-fast5 a bit differently ("many reads per fast5 file..."), which is why I'm not sure.

Maybe I am. I can check once I know what that actually means 😆

a-slide commented 4 years ago

Sorry, no. The multi-fast5 format is a recent version (about 1 year old) of the ONT raw file format. Essentially it means that each file contains multiple reads, as opposed to the previous version where 1 file = 1 read. You can check by opening a file with, for example, HDFView: https://www.hdfgroup.org/downloads/hdfview/

I haven't implemented multi-fast5 support in Fast5_to_seq_summary and I don't have any plans to do it anytime soon. You can generate a new seq summary file by re-basecalling your data with Guppy.
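
As an alternative to opening a file in a viewer, the layout could probably also be checked from Python with h5py. This is only a rough sketch, not part of pycoQC: the function names are made up for illustration, and it assumes that multi-read files store each read under a top-level group whose name starts with "read_", while single-read files have a top-level "Raw" group (the /Raw/Reads path seen in the traceback above).

```python
from typing import Iterable


def classify_fast5_layout(top_level_keys: Iterable[str]) -> str:
    """Guess the fast5 layout from the names of the top-level HDF5 groups.

    Hypothetical helper, not a pycoQC function.
    """
    keys = list(top_level_keys)
    # Assumption: multi-read files keep each read in a top-level "read_<id>" group.
    if any(key.startswith("read_") for key in keys):
        return "multi"
    # Assumption: single-read files keep the raw signal under /Raw/Reads.
    if "Raw" in keys:
        return "single"
    return "unknown"


def fast5_layout(path: str) -> str:
    """Open a fast5 file read-only and classify its layout."""
    import h5py  # imported here so the helper above can be used without h5py

    with h5py.File(path, "r") as h5_fp:
        return classify_fast5_layout(h5_fp.keys())
```

Calling `fast5_layout()` on one file from the input folder should be enough in most cases, since a run is normally written in a single layout. A result of "multi" would be consistent with the KeyError above, because such a file has no /Raw/Reads group at the top level.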

a-slide commented 4 years ago

Can you confirm this is a multi-fast5 issue?

apeltzer commented 4 years ago

Sorry, will check tomorrow at work. I guess that's probably the reason 😓 Demultiplexing was done with a newer Guppy version at least ...

apeltzer commented 4 years ago

Looks like it - thanks!

Gerlex89 commented 2 years ago

Hi. I am having the same issue while providing a list of FAST5 files like the one below, although in my case it does not start the analysis. It jumps straight to the error.

The version is the latest v2.5.2 installed with Conda, which now includes the option.

~/input/fast5$ ls
GXB01322_20181213_FAK35814_GA20000_sequencing_run_Run00012_MIN106_RBK004_17047_0.fast5
GXB01322_20181213_FAK35814_GA20000_sequencing_run_Run00012_MIN106_RBK004_17047_59.fast5
GXB01322_20181213_FAK35814_GA20000_sequencing_run_Run00012_MIN106_RBK004_17047_1.fast5
GXB01322_20181213_FAK35814_GA20000_sequencing_run_Run00012_MIN106_RBK004_17047_6.fast5
GXB01322_20181213_FAK35814_GA20000_sequencing_run_Run00012_MIN106_RBK004_17047_10.fast5
GXB01322_20181213_FAK35814_GA20000_sequencing_run_Run00012_MIN106_RBK004_17047_60.fast5
...