MemoryError - Githubissues

vanbie commented 2 years ago

Describe the bug An error occurred while I was trying to assemble corrected data. I wonder whether it is caused by a parameter setting.

Error message
hostname
hostname
cd /root/nd/NextDenovo/Mo/Mo_out/02.cns_align/01.split_seed.sh.work/split_seed0
cd /root/nd/NextDenovo/Mo/Mo_out/02.cns_align/01.split_seed.sh.work/split_seed0
time /usr/bin/python3 /root/nd/NextDenovo/lib/split_cns.py -f /root/nd/NextDenovo/Mo/input.fofn -l 37491 -c 6
time /usr/bin/python3 /root/nd/NextDenovo/lib/split_cns.py -f /root/nd/NextDenovo/Mo/input.fofn -l 37491 -c 6
[INFO] 2021-08-11 22:39:29,574 Split step options:
[INFO] 2021-08-11 22:39:29,574 Namespace(count=6, fofn='/root/nd/NextDenovo/Mo/input.fofn', index=True, min_len=37491, outdir='./', rename=True)
Traceback (most recent call last):
  File "/root/nd/NextDenovo/lib/split_cns.py", line 155, in <module>
    main(args)
  File "/root/nd/NextDenovo/lib/split_cns.py", line 129, in main
    f.cutf(args.count, rn = args.rename, ml = args.min_len, pdir = args.outdir, index = args.index)
  File "/root/nd/NextDenovo/lib/split_cns.py", line 108, in cutf
    print('>%d %d %f pid=%s\n%s' % (t, lens, 1, name, seq), file=fa_files[i])
MemoryError
Command exited with non-zero status 1
90.50user 132.68system 4:47.14elapsed 77%CPU (0avgtext+0avgdata 245949344maxresident)k
48378432inputs+8outputs (119major+61515086minor)pagefaults 0swaps
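For context on the resource line at the end of the log: GNU `time` reports `maxresident` in kilobytes, so the peak memory of `split_cns.py` can be compared with the 256G server noted under Additional context. The snippet below is only a rough sketch of that arithmetic. The variable and function names are made up for illustration, and reading "256G" as 256 GiB is an assumption.

```python
# Peak resident set size from the `time` output in the log, in kilobytes.
MAX_RESIDENT_KB = 245_949_344

# Server memory from the report ("256G"); treating it as 256 GiB is an assumption.
SERVER_RAM_GIB = 256


def kb_to_gib(kilobytes: int) -> float:
    """Convert kilobytes (1024 bytes each) to GiB."""
    return kilobytes / 1024**2


peak_gib = kb_to_gib(MAX_RESIDENT_KB)
fraction = peak_gib / SERVER_RAM_GIB

print(f"peak RSS: {peak_gib:.1f} GiB ({fraction:.0%} of {SERVER_RAM_GIB} GiB)")
```

If this reading is right, the process was already holding most of the machine's memory when the `print` call ran, so the `MemoryError` may simply mark the allocation that happened to fail rather than a problem with that line itself.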

Genome characteristics genome size=490m heterozygous rate=1.3% repeat content=58%

Input data Total base count=62880679007bp sequencing depth=129, average/N50 read length=30172
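As a quick check on the numbers above, sequencing depth is roughly the total base count divided by the genome size. The sketch below runs that division for both genome sizes mentioned in this report (490m here, 485m in the config). It assumes "m" means 10^6 bases, and the names are illustrative only.

```python
# Total base count from the report, in bp.
TOTAL_BASES = 62_880_679_007

# Genome sizes mentioned in the report; "m" is assumed to mean 10**6 bases.
GENOME_SIZES = {
    "genome characteristics (490m)": 490_000_000,
    "config genome_size (485m)": 485_000_000,
}


def depth(total_bases: int, genome_size: int) -> float:
    """Approximate sequencing depth as total bases over genome size."""
    return total_bases / genome_size


for label, size in GENOME_SIZES.items():
    print(f"{label}: {depth(TOTAL_BASES, size):.1f}x")
```

Both values land close to the reported depth of 129.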

Config file
[General]
job_type = local
job_prefix = nextDenovo
task = assemble # 'all', 'correct', 'assemble'
rewrite = yes # yes/no
deltmp = yes
rerun = 3
parallel_jobs = 8
input_type = corrected
read_type = ont
input_fofn = ./input.fofn
workdir = M_out

genome_size = 485m

[assemble_option]
minimap2_options_cns = -t 4
nextgraph_options = -a 1

Operating system Ubuntu 18.04 64bit

GCC gcc version 7.5.0

Python Python 3.6.9

NextDenovo nextDenovo v2.4.0

To Reproduce (Optional) none

Additional context (Optional) 32core 256G server

moold commented 2 years ago

It seems you have too much data. I think you can run NextDenovo with raw (uncorrected) data, which may run faster. As for the error you mentioned, I do not actually know why the print expression causes a MemoryError; I need more time to figure it out.

vanbie commented 2 years ago

Thanks. The data were released only as corrected reads, so I can only download the clean data. The original report also used NextDenovo for the analysis, but did not give many details.

Nextomics / NextDenovo

MemoryError #122