Closed SergeWielhouwer closed 6 months ago
We've previously seen errors like this in the medaka stitch process with other users. The cause has always been that the intermediate HDF files produced by medaka consensus had become corrupt. We've never managed to isolate and reproduce the problem ourselves.
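One way to check whether an intermediate HDF file is readable is to try opening every object in it, which is the operation that fails in the traceback. The following is only a minimal sketch, not part of medaka: the function names `find_unreadable_objects` and `check_hdf` are hypothetical, and it assumes `h5py` is installed in the same environment.

```python
def find_unreadable_objects(group, prefix=""):
    """Recursively open every member of a mapping-like group.

    Returns a list of (path, error) tuples for members that cannot be opened.
    Works on any object exposing keys() and item access, such as h5py groups.
    """
    failures = []
    try:
        names = list(group.keys())
    except Exception as exc:  # listing members can itself fail on a damaged file
        return [(prefix or "/", repr(exc))]
    for name in names:
        path = f"{prefix}/{name}"
        try:
            member = group[name]
        except Exception as exc:  # in the traceback this surfaces as a KeyError
            failures.append((path, repr(exc)))
            continue
        if hasattr(member, "keys"):  # groups have keys(); datasets do not
            failures.extend(find_unreadable_objects(member, path))
    return failures


def check_hdf(hdf_path):
    """Open an HDF5 file read-only and report objects that cannot be opened."""
    import h5py  # imported lazily so the walker above has no hard dependency

    with h5py.File(hdf_path, "r") as fh:
        return find_unreadable_objects(fh)
```

Calling `check_hdf("consensus_probs.hdf")` on the file in the medaka output directory would return an empty list if every object opens. A non-empty list suggests the file should be regenerated rather than stitched.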
Unrelated to your errors, the low coverage is likely to mean that the results output from medaka are unstable; I would advise 20X as an absolute minimum and preferably at least 30-40X.
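To compare a sample against these thresholds, mean coverage can be approximated as total read bases divided by genome (or assembly) size. This is a rough sketch under stated assumptions: the helper names are hypothetical, the FASTQ is assumed to use plain four-line records, and the genome size must be supplied by the user.

```python
import gzip


def total_read_bases(fastq_path):
    """Sum the sequence lengths in a FASTQ file (optionally gzip-compressed).

    Assumes strict four-line records, with the sequence on the second line.
    """
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    total = 0
    with opener(fastq_path, "rt") as handle:
        for index, line in enumerate(handle):
            if index % 4 == 1:  # second line of each record holds the bases
                total += len(line.strip())
    return total


def mean_coverage(read_bases, genome_size):
    """Approximate mean depth as read bases divided by genome size in bases."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return read_bases / genome_size
```

For example, `mean_coverage(total_read_bases("reads.fastq.gz"), genome_size)` gives a value to compare against the 20X minimum. Here `reads.fastq.gz` and `genome_size` are placeholders.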
Thank you, I will run it again to see whether the error recurs. Otherwise, I will consider skipping medaka altogether and polishing the assembly directly with short reads.
Describe the bug
I am trying to polish a Spodoptera frugiperda genome assembly from Flye v2.9.2 through medaka with the command
medaka_consensus -i filtered_long_reads/105828-001-002_long.fastq.gz -d assembly/105828-001-002/flye/assembly.fasta \
    -o assembly/105828-001-002/medaka_polished/ -t 32 -m r1041_e82_400bps_sup_v4.2.0 2>logs/medaka.105828-001-002.log
on an HPC cluster with the SLURM job manager (200 GB RAM reserved for the job), but I encounter issues during the final stitching step (see below). I already tried restarting the tool after removing all medaka output files.
Logging
From medaka.105828-001-002.log
```
Cannot import pyabpoa, some features may not be available.
Cannot import pyabpoa, some features may not be available.
Cannot import pyabpoa, some features may not be available.
Cannot import pyabpoa, some features may not be available.
Cannot import pyabpoa, some features may not be available.
[12:10:09 - MdlStrTF] Successfully removed temporary files from /tmp/tmpsqp_qpv5.
Cannot import pyabpoa, some features may not be available.
[12:10:09 - MdlStrTF] Successfully removed temporary files from /tmp/tmpp03qnd0k.
Cannot import pyabpoa, some features may not be available.
[12:10:10 - DataIndx] Loaded 1/1 (100.00%) sample files.
[12:10:11 - DataIndx] Loaded 1/1 (100.00%) sample files.
[... identical DataIndx messages at 12:10:11 and 12:10:12 omitted ...]
[12:10:12 - DataIndx] Loaded 1/1 (100.00%) sample files.
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/epi2melabs/conda/lib/python3.8/concurrent/futures/process.py", line 239, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
File "/home/epi2melabs/conda/lib/python3.8/concurrent/futures/process.py", line 198, in _process_chunk
return [fn(*args) for args in chunk]
File "/home/epi2melabs/conda/lib/python3.8/concurrent/futures/process.py", line 198, in
return [fn(*args) for args in chunk]
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/stitch.py", line 106, in stitch_from_probs
return _stitch_samples(samples, label_scheme, region, min_depth)
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/stitch.py", line 60, in _stitch_samples
for s, is_last_in_contig, heuristic in data_gen:
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/common.py", line 543, in trim_samples_to_region
yield from samples
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/common.py", line 529, in _trim_ends
for sample, last, heuristic in samples:
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/common.py", line 513, in _trim_starts
for sample, last, heuristic in samples:
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/common.py", line 443, in trim_samples
s1 = next(sample_gen)
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/datastore.py", line 557, in yield_from_feature_files
yield self._ds.load_sample(key)
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/datastore.py", line 351, in load_sample
group = self.fh['{}/{}'.format(self._samplepath, key)]
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/home/epi2melabs/conda/lib/python3.8/site-packages/h5py/_hl/group.py", line 357, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 189, in h5py.h5o.open
KeyError: 'Unable to synchronously open object (bad heap free list)'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/epi2melabs/conda/bin/medaka", line 8, in
sys.exit(main())
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/medaka.py", line 814, in main
args.func(args)
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/stitch.py", line 265, in stitch
contigs, gt = fill_gaps(contigs, args.draft, args.fill_char)
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/stitch.py", line 127, in fill_gaps
for info, sequence_parts, qualities in contigs:
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/stitch.py", line 175, in collapse_neighbours
contig = next(contigs)
File "/home/epi2melabs/conda/lib/python3.8/site-packages/medaka/stitch.py", line 243, in stitch_regions_parallel
yield from pieces
File "/home/epi2melabs/conda/lib/python3.8/concurrent/futures/process.py", line 484, in _chain_from_iterable_of_lists
for element in iterable:
File "/home/epi2melabs/conda/lib/python3.8/concurrent/futures/_base.py", line 619, in result_iterator
yield fs.pop().result()
File "/home/epi2melabs/conda/lib/python3.8/concurrent/futures/_base.py", line 444, in result
return self.__get_result()
File "/home/epi2melabs/conda/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
KeyError: 'Unable to synchronously open object (bad heap free list)'
```
From SLURM stdout log
```
TF_CPP_MIN_LOG_LEVEL is set to '3'
Checking program versions
This is medaka 1.11.1
Program    Version    Required   Pass
bcftools   1.18       1.11       True
bgzip      1.18       1.11       True
minimap2   2.26       2.11       True
samtools   1.18       1.11       True
tabix      1.18       1.11       True
WARNING: Output assembly/105828-001-002/medaka_polished/ already exists, may use old results.
Not aligning basecalls to draft, calls_to_draft.bam exists.
Not running medaka consensus, consensus_probs.hdf exists.
Failed to stitch consensus chunks.
```
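The SLURM log above suggests that this run reused an existing `consensus_probs.hdf` instead of regenerating it, so a damaged file would be stitched again. The following is a small sketch, not a medaka feature, for clearing that intermediate before a rerun. The function name and the `dry_run` parameter are hypothetical; the file name is taken from the log.

```python
from pathlib import Path


def remove_stale_intermediates(outdir, names=("consensus_probs.hdf",), dry_run=True):
    """List, and optionally delete, intermediate files in a medaka output directory.

    With dry_run=True (the default) nothing is deleted; the matching paths
    are only returned so they can be inspected first.
    """
    matched = []
    for name in names:
        target = Path(outdir) / name
        if target.is_file():
            if not dry_run:
                target.unlink()  # delete so the next run has to recreate the file
            matched.append(target)
    return matched
```

Leaving `calls_to_draft.bam` in place keeps the alignment step cached, on the assumption that only the HDF file is damaged. Adding it to `names` would make the next run repeat the alignment as well.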
Environment (if you do not have a GPU, write No GPU):
Additional context
Two other samples completed all medaka steps successfully; however, those samples started from 4035 and 3443 contigs, while this sample is quite fragmented, with 30756 contigs, due to low genomic coverage (5-6X). Overall polishing also took considerably longer (>2 days) than for the other two samples.