nanoporetech / ont_fast5_api

Oxford Nanopore Technologies fast5 API software

Unable to copy object (file read failed) on parallel FS #84

Open matzmz opened 4 months ago

matzmz commented 4 months ago

I am working in an HPC environment that uses a BeeGFS network storage system mounted at /data/. Each node also has a local disk formatted with XFS.

When I run the compression function on a directory on the local disk, it completes successfully. Running it on a directory under /data/ fails with the error below, even though the input is the same file (copied to /data/, checksum verified). We examined the BeeGFS backend thoroughly and could not identify any issues.

Below is the stack trace detailing the error encountered:

ERROR:root:Unable to copy object (file read failed: time = Mon Apr  8 10:28:53 2024
, filename = '/data/omitted/sample.fast5', file descriptor = 4, errno = 14, error message = 'Bad address', buf = 0x563d06f6e770, total read size = 512, bytes this sub-read = 512, bytes actually read = 18446744073709551615, offset = 0)
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ont_fast5_api/conversion_tools/compress_fast5.py", line 59, in compress_file
    output_f5.add_existing_read(read, target_compression, sanitize=sanitize)
  File "/usr/local/lib/python3.10/dist-packages/ont_fast5_api/multi_fast5.py", line 82, in add_existing_read
    self._add_read_from_multi(read_to_add, target_compression, sanitize=sanitize)
  File "/usr/local/lib/python3.10/dist-packages/ont_fast5_api/multi_fast5.py", line 115, in _add_read_from_multi
    output_group.copy(read_to_add.handle[subgroup], subgroup)
  File "/usr/local/lib/python3.10/dist-packages/h5py/_hl/group.py", line 565, in copy
    h5o.copy(source.id, self._e(source_path), dest.id, self._e(dest_path),
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5o.pyx", line 217, in h5py.h5o.copy
RuntimeError: Unable to copy object (file read failed: time = Mon Apr  8 10:28:53 2024
, filename = '/data/omitted/sample.fast5', file descriptor = 4, errno = 14, error message = 'Bad address', buf = 0x563d06f6e770, total read size = 512, bytes this sub-read = 512, bytes actually read = 18446744073709551615, offset = 0)
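Two details in the message can be checked outside ont_fast5_api. The value 18446744073709551615 is 2^64 - 1, which is what -1 looks like when printed as an unsigned 64-bit integer, so the underlying read call appears to have returned -1 with errno 14 (EFAULT on Linux). The sketch below is only a diagnostic aid, not part of ont_fast5_api or h5py: `to_signed64` and `probe_read` are hypothetical helper names, and it assumes a Unix system where `os.pread` is available. It repeats the same 512-byte read at offset 0 with a plain system call to see whether the failure reproduces without HDF5 involved.

```python
import errno
import os


def to_signed64(value: int) -> int:
    """Reinterpret an unsigned 64-bit integer as a signed 64-bit integer."""
    return value - (1 << 64) if value >= (1 << 63) else value


def probe_read(path: str, size: int = 512, offset: int = 0):
    """Read `size` bytes at `offset` with a plain pread(), bypassing HDF5.

    Returns (data, None) on success or (None, errno_name) if the read fails.
    Errors from opening the file are not caught and propagate to the caller.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, size, offset), None
    except OSError as exc:
        # Map the numeric errno to its symbolic name, e.g. 14 -> "EFAULT" on Linux.
        return None, errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
```

Calling `probe_read` on the same file under /data/ and on its copy on the local disk would show whether the plain read fails only on the network mount. If it succeeds in both places, the problem is more likely tied to how the buffer is passed during the HDF5 copy than to the file contents.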

I would greatly appreciate any assistance in understanding and resolving this error.

Thank you.