Closed by irenenewton 3 years ago
Hi @irenenewton -- thanks for letting us know. Could you try running the single file it complained about (Run_12_6_2020/Run_12_6_2020/Run_12_6_2020/20201206_2104_MN30516_FAO46609_3100efdb/fast5_pass/FAO46609_pass_83a97ca0_95.fast5) to see if that crashes fast5_subset? If it does, and you're ok with giving us the file, we can try and figure out exactly what's going on.
Interesting new error:
root@e9216b63e9b0:/# fast5_subset -i Run_12_6_2020/Run_12_6_2020/Run_12_6_2020/20201206_2104_MN30516_FAO46609_3100efdb/fast5_pass/FAO46609_pass_83a97ca0_95.fast5 -s Run_12_6_2020/Run_12_6_2020/Run_12_6_2020/20201206_2104_MN30516_FAO46609_3100efdb/ -l list_reads_mapped_to_virus.txt
Traceback (most recent call last):
File "/usr/local/bin/fast5_subset", line 8, in
Happy to share the file with you. Email me and I can provide a link.
Hi @irenenewton -- the error you're getting there is because the -i argument to fast5_subset expects a folder, not a file. If you put that file in its own folder it should work. I'll get in touch with you about the file though.
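For anyone following along, putting the file in its own folder could be sketched roughly as below. This is a minimal sketch, not part of ont-fast5-api: `isolate_file` and `subset_command` are hypothetical helper names, and the command uses only the -i, -s and -l flags already shown in this thread.

```python
import shutil
import tempfile
from pathlib import Path


def isolate_file(fast5_path, parent=None):
    """Copy one file into a fresh, otherwise empty directory and return that directory.

    Hypothetical helper for illustration; not part of ont-fast5-api.
    """
    src = Path(fast5_path)
    if not src.is_file():
        raise FileNotFoundError(f"not a file: {src}")
    # mkdtemp creates a new, uniquely named directory (under `parent` if given).
    target_dir = Path(tempfile.mkdtemp(prefix="fast5_single_", dir=parent))
    shutil.copy2(src, target_dir / src.name)
    return target_dir


def subset_command(input_dir, save_dir, read_list):
    """Build a fast5_subset argument list from the flags used in this thread."""
    return ["fast5_subset", "-i", str(input_dir), "-s", str(save_dir), "-l", str(read_list)]
```

The returned list can be handed to `subprocess.run`, with the directory from `isolate_file` as the -i argument.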
I was giving it a folder originally, not a file. The original submission script pointed it to a directory (fast5_pass). When I get a second I can rerun the fast5 file you pointed to in its own dir. Here's the same error, when I've moved that 95.fast5 file to its own dir:
root@e9216b63e9b0:/# fast5_subset -i test/ -s Run_12_6_2020/Run_12_6_2020/Run_12_6_2020/20201206_2104_MN30516_FAO46609_3100efdb/ -l list_reads_mapped_to_virus.txt
Traceback (most recent call last):
File "/usr/local/bin/fast5_subset", line 8, in
Yes, in the original submission script, absolutely! I was only talking about your second attempt. =)
Please email the file to support@nanoporetech.com and ask them to pass it to me.
Hi @irenenewton -- apologies for the delay. I've had a look at the particular file that was in your error messages (FAO46609_pass_83a97ca0_95.fast5) and that file appears to be corrupt. We can definitely improve how we handle these files in ont-fast5-api, but in the meantime you can remove that file from the ones you're using and (assuming there are no other corrupt fast5 files) your call to fast5_subset should then work.
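To check whether other files in the folder have the same problem before rerunning, something like the following might help. It is a rough sketch under stated assumptions, not an ont-fast5-api feature: `h5py_read_error` and `find_unreadable` are hypothetical names, and the check only opens each top-level group with h5py (the same kind of access that raised the KeyError in the traceback), so deeper corruption could go unnoticed.

```python
from pathlib import Path


def h5py_read_error(path):
    """Return None if every top-level group opens, else a short error string.

    Assumption: opening each top-level group is enough to trigger the kind of
    KeyError/OSError reported in this thread.
    """
    import h5py  # imported lazily so the scan logic can be used without h5py

    try:
        with h5py.File(path, "r") as handle:
            for name in handle.keys():
                _ = handle[name]  # raises KeyError if the object cannot be opened
    except (OSError, KeyError, RuntimeError) as err:
        return f"{type(err).__name__}: {err}"
    return None


def find_unreadable(input_dir, check=h5py_read_error):
    """Map each *.fast5 file under input_dir that fails `check` to its error string."""
    problems = {}
    for path in sorted(Path(input_dir).rglob("*.fast5")):
        error = check(path)
        if error is not None:
            problems[path] = error
    return problems
```

Any files it reports could then be moved out of fast5_pass before calling fast5_subset again.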
Running fast5_subset in a Docker container with Python 3 and ont-fast5-api (v3.3.0) installed, plus all dependencies, like so:
root@42aca274f9f3:/data# ./run_fast5.sh
Oddly, the script runs up until 2% of the reads have been extracted using the subset flatfile, then it fails with the following error:
DEBUG:h5py._conv:Creating converter from 5 to 3
| 0% ETA: --:--:--
Traceback (most recent call last):
| 2% ETA: 1:02:08
File "/usr/local/lib/python3.8/dist-packages/ont_fast5_api/conversion_tools/fast5_subset.py", line 261, in extract_selected_reads
output_f5.add_existing_read(read, target_compression=target_compression)
File "/usr/local/lib/python3.8/dist-packages/ont_fast5_api/multi_fast5.py", line 82, in add_existing_read
self._add_read_from_multi(read_to_add, target_compression, sanitize=sanitize)
File "/usr/local/lib/python3.8/dist-packages/ont_fast5_api/multi_fast5.py", line 105, in _add_read_from_multi
if read_to_add.run_id in self.run_id_map:
File "/usr/local/lib/python3.8/dist-packages/ont_fast5_api/multi_fast5.py", line 69, in run_id_map
for read in self.get_reads():
File "/usr/local/lib/python3.8/dist-packages/ont_fast5_api/multi_fast5.py", line 27, in get_reads
yield Fast5Read(self, group_name[5:])
File "/usr/local/lib/python3.8/dist-packages/ont_fast5_api/fast5read.py", line 61, in __init__
self.handle = parent.handle["read" + read_id]
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/usr/local/lib/python3.8/dist-packages/h5py/_hl/group.py", line 288, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 190, in h5py.h5o.open
KeyError: 'Unable to open object (bad object header version number)'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/fast5_subset", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.8/dist-packages/ont_fast5_api/conversion_tools/fast5_subset.py", line 326, in main
multifilter.run_batch()
File "/usr/local/lib/python3.8/dist-packages/ont_fast5_api/conversion_tools/fast5_subset.py", line 103, in run_batch
self._launch_sync_tasks()
File "/usr/local/lib/python3.8/dist-packages/ont_fast5_api/conversion_tools/fast5_subset.py", line 129, in _launch_sync_tasks
reads, out_file, in_file = extract_selected_reads(*args_tuple)
File "/usr/local/lib/python3.8/dist-packages/ont_fast5_api/conversion_tools/fast5_subset.py", line 269, in extract_selected_reads
raise ExtractionException(exception, output_file)
ont_fast5_api.conversion_tools.fast5_subset.ExtractionException: (KeyError("Error processing file Run_12_6_2020/Run_12_6_2020/Run_12_6_2020/20201206_2104_MN30516_FAO46609_3100efdb/fast5_pass/FAO46609_pass_83a97ca0_95.fast5: ('Unable to open object (bad object header version number)',)"), 'Run_12_6_2020/Run_12_6_2020/Run_12_6_2020/20201206_2104_MN30516_FAO46609_3100efdb/batch1.fast5')
root@42aca274f9f3:/data#