schmeing / gapless

Gapless provides combined scaffolding, gap-closing and assembly correction with long reads
MIT License

pipeline crashed : scaffold (numpy.core._exceptions._ArrayMemoryError) #14

Open jaehakson opened 11 months ago

jaehakson commented 11 months ago

I ran into a problem at the scaffolding step, probably while allocating memory. I do not understand why, though, because the node I used has 256 GB of memory allocated.

Below is the error message I got from 'gapless_scaffold.log' without the '--largeGenome' option:

0:00:05.183697 Reading in original assembly
0:00:06.559198 Loading repeats
0:00:13.782595 Filtering mappings
0:00:54.308255 Search for possible break points
Traceback (most recent call last):
  File "/home/js3054/.local/bin/gapless.py", line 13362, in <module>
    main(sys.argv[1:])
  File "/home/js3054/.local/bin/gapless.py", line 13189, in main
    GaplessScaffold(args[0], args[1], args[2], min_mapq, min_mapping_length, min_length_contig_break, large_reads, large_contigs, prefix, stats)
  File "/home/js3054/.local/bin/gapless.py", line 9107, in GaplessScaffold
    mappings = UpdateMappingsToContigParts(mappings, contig_parts, min_mapping_length, max_dist_contig_end, min_extension)
  File "/home/js3054/.local/bin/gapless.py", line 1182, in UpdateMappingsToContigParts
    mappings = GetContigPartFromTargetID(mappings, contig_parts, min_mapping_length)
  File "/home/js3054/.local/bin/gapless.py", line 1177, in GetContigPartFromTargetID
    mappings = mappings[(mappings['t_start']+min_mapping_length < mappings['part_end']) & (mappings['t_end']-min_mapping_length > mappings['part_start'])].copy()
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/frame.py", line 3496, in __getitem__
    return self._getitem_bool_array(key)
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/frame.py", line 3551, in _getitem_bool_array
    return self._take_with_is_copy(indexer, axis=0)
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/generic.py", line 3716, in _take_with_is_copy
    result = self.take(indices=indices, axis=axis)
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/generic.py", line 3701, in take
    self._consolidate_inplace()
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/generic.py", line 5653, in _consolidate_inplace
    self._protect_consolidate(f)
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/generic.py", line 5641, in _protect_consolidate
    result = f()
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/generic.py", line 5651, in f
    self._mgr = self._mgr.consolidate()
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 631, in consolidate
    bm._consolidate_inplace()
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 1685, in _consolidate_inplace
    self.blocks = tuple(_consolidate(self.blocks))
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 2084, in _consolidate
    merged_blocks = _merge_blocks(
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 2118, in _merge_blocks
    new_values = new_values[argsort]
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 74.3 GiB for an array with shape (9, 1108714767) and data type int64

Below is the error message I got from 'gapless_scaffold.log' with the '--largeGenome' option:

0:00:04.748569 Reading in original assembly
0:00:06.579180 Loading repeats
0:00:14.159905 Filtering mappings
0:00:55.106829 Search for possible break points
Traceback (most recent call last):
  File "/home/js3054/.local/bin/gapless.py", line 13362, in <module>
    main(sys.argv[1:])
  File "/home/js3054/.local/bin/gapless.py", line 13189, in main
    GaplessScaffold(args[0], args[1], args[2], min_mapq, min_mapping_length, min_length_contig_break, large_reads, large_contigs, prefix, stats)
  File "/home/js3054/.local/bin/gapless.py", line 9098, in GaplessScaffold
    break_groups, repeat_removal, spurious_break_indexes, non_informative_mappings, unconnected_breaks = FindBreakPoints(mappings, contigs, covered_regions, repeats, max_dist_contig_end, min_mapping_length, min_length_contig_break, max_break_point_distance, min_num_reads, min_extension, merge_block_length, org_scaffold_trust, cov_probs, prob_factor, allow_same_contig_breaks, prematurity_threshold, pdf)
  File "/home/js3054/.local/bin/gapless.py", line 975, in FindBreakPoints
    break_points, unconnected_break_points = CountAndApplyBreakVetos(break_points, mappings, pot_breaks, bp_ext_len, cov_probs, covered_regions, repeats, max_dist_contig_end, max_break_point_distance, min_mapping_length, min_num_reads, min_length_contig_break, prob_factor, merge_block_length, prematurity_threshold)
  File "/home/js3054/.local/bin/gapless.py", line 843, in CountAndApplyBreakVetos
    breaks_in_repeats = breaks_in_repeats[ (breaks_in_repeats['lrep_len'] > 0) | (breaks_in_repeats['rrep_len'] > 0) ].copy()
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/frame.py", line 3496, in __getitem__
    return self._getitem_bool_array(key)
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/frame.py", line 3551, in _getitem_bool_array
    return self._take_with_is_copy(indexer, axis=0)
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/generic.py", line 3716, in _take_with_is_copy
    result = self.take(indices=indices, axis=axis)
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/generic.py", line 3701, in take
    self._consolidate_inplace()
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/generic.py", line 5653, in _consolidate_inplace
    self._protect_consolidate(f)
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/generic.py", line 5641, in _protect_consolidate
    result = f()
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/generic.py", line 5651, in f
    self._mgr = self._mgr.consolidate()
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 631, in consolidate
    bm._consolidate_inplace()
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 1685, in _consolidate_inplace
    self.blocks = tuple(_consolidate(self.blocks))
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 2084, in _consolidate
    merged_blocks = _merge_blocks(
  File "/home/js3054/miniconda3/envs/gapless/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 2118, in _merge_blocks
    new_values = new_values[argsort]
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 98.7 GiB for an array with shape (6, 2207565873) and data type int64
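
For reference, the sizes in the two error messages follow directly from the reported shapes and the 8-byte int64 dtype. The sketch below only redoes that arithmetic; the helper name `array_gib` is made up for illustration and is not part of gapless, pandas or NumPy.

```python
def array_gib(shape, itemsize=8):
    """Size in GiB of a dense array with the given shape (int64 = 8 bytes per item)."""
    n_items = 1
    for dim in shape:
        n_items *= dim
    return n_items * itemsize / 2**30

# Shapes copied from the two _ArrayMemoryError messages.
without_flag = array_gib((9, 1108714767))   # run without --largeGenome
with_flag = array_gib((6, 2207565873))      # run with --largeGenome

print(f"{without_flag:.1f} GiB")
print(f"{with_flag:.1f} GiB")
```

Both tracebacks end in `new_values = new_values[argsort]` during pandas block consolidation, which builds a new array alongside the data already held in the DataFrame. The reported figure is therefore presumably only one part of the peak memory the step needs, which may be why it fails even on a 256 GB node.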