Closed: hengjwj closed this issue 7 months ago
Hi @hasindu2008,
I'm having difficulty converting one of my FAST5 files (number 20 of 182) to BLOW5:
I tried converting it to single-read FAST5s to remove the problematic reads, but ONT's multi_to_single_fast5 failed. Strangely, however, Guppy managed to basecall the file without incident, although the FASTQ it generated contained only 73 reads (4,000 were expected). I used Guppy's --fast5_out to try to regenerate an intact FAST5, but I got the same errors (in fact, the above was from the regenerated FAST5).
Do you have other ideas on how to salvage it? I've uploaded the FAST5 from Guppy if you want to try: https://entuedu-my.sharepoint.com/:f:/g/personal/s190075_e_ntu_edu_sg/EumhbBBCZ3JBsMjK_4TiU6IB58mzsZG81Z9t6LV6HKrIzA?e=NfB79x
Joel
Hey,
Did you get an error from multi_to_single_fast5 that you could share? We will have a look.
James
I tried to run the HDF5 utilities on that file and they failed too:
h5dump in/FAN33287_8224016851906804b27023975e7e67f55f73adea_19.fast5
h5dump error: internal error (file ../../../../tools/h5dump/h5dump.c:line 1485)
This looks like a badly corrupted file. Surprisingly, Guppy runs as you mentioned and basecalls only around 70 reads, but it most likely just keeps going and ignores the errors, so it would be hard to trust any basecalls coming out of it either.
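If you want to gauge where the damage starts, a rough h5py loop like the one below (an untested sketch) can count how many read groups are reachable before link iteration fails; I would expect the count to land near the ~70 reads Guppy emits.
import h5py

n = 0
with h5py.File("FAN33287_8224016851906804b27023975e7e67f55f73adea_19.fast5", "r") as f:
    try:
        for _ in f:  # iterating the root group walks the read links in order
            n += 1
    except RuntimeError as err:
        # the corrupt metadata checksum surfaces here as a RuntimeError
        print(f"link iteration failed after {n} reads: {err}")
print(f"{n} read groups reachable")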
Are you facing this on many files or just this one?
I tried h5py on that file and got this:
Python 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import h5py
>>> f = h5py.File("FAN33287_8224016851906804b27023975e7e67f55f73adea_19.fast5", 'r')
>>> f
<HDF5 file "FAN33287_8224016851906804b27023975e7e67f55f73adea_19.fast5" (mode r)>
>>> f.keys()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/jamfer/.local/lib/python3.10/site-packages/h5py/_hl/base.py", line 386, in __str__
return "<KeysViewHDF5 {}>".format(list(self))
File "/usr/lib/python3.10/_collections_abc.py", line 881, in __iter__
yield from self._mapping
File "/home/jamfer/.local/lib/python3.10/site-packages/h5py/_hl/group.py", line 471, in __iter__
for x in self.id.__iter__():
File "h5py/h5g.pyx", line 128, in h5py.h5g.GroupIter.__next__
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5l.pyx", line 316, in h5py.h5l.LinkProxy.iterate
RuntimeError: Link iteration failed (incorrect metadata checksum after all read attempts)
Yeah, something is very wrong with that file.
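If you want to try salvaging it anyway, something like the sketch below (untested; salvaged.fast5 is just a placeholder name) might copy out whatever read groups are still reachable, though there is no guarantee the copied groups are internally intact.
import h5py

src_path = "FAN33287_8224016851906804b27023975e7e67f55f73adea_19.fast5"
dst_path = "salvaged.fast5"  # placeholder output name

copied = 0
with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:
    for key, value in src.attrs.items():
        dst.attrs[key] = value  # carry over file-level attributes
    names = iter(src)
    while True:
        try:
            name = next(names)  # this is the call that hits the checksum error
        except StopIteration:
            break
        except RuntimeError:
            break  # nothing past the corruption point is reachable
        try:
            src.copy(name, dst)  # deep-copy the whole read group
            copied += 1
        except (RuntimeError, OSError):
            pass  # the link resolved but the group itself is unreadable
print(f"copied {copied} read groups to {dst_path}")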
Sorry for the late reply. Here's what I got from multi_to_single_fast5:
[s190075@hpc-amd004 workspace]$ multi_to_single_fast5 -i FAN33287_8224016851906804b27023975e7e67f55f73adea_19.fast5 -s singleread -t 16
ERROR:ont_fast5_api.conversion_tools.multi_to_single_fast5:Link iteration failed (incorrect metadata checksum after all read attempts) | 0% ETA: --:--:--
Failed to copy files from: FAN33287_8224016851906804b27023975e7e67f55f73adea_19.fast5
| 1 of 1|##################################################################################################################################|100% Time: 0:00:00
Just this one file. Ok, I think I'll just drop it then.
Thanks @Psy-Fer and @hasindu2008!
Closing this issue. Feel free to reopen or start a new issue. Glad to help.