Nextomics / NextDenovo

Fast and accurate de novo assembler for long reads
GNU General Public License v3.0

MemoryError #122

Open vanbie opened 2 years ago

vanbie commented 2 years ago

Describe the bug An error occurred when I was trying to assemble corrected data. I wonder whether it is a parameter-setting issue.

Error message hostname

Genome characteristics genome size=490m, heterozygous rate=1.3%, repeat content=58%

Input data total base count=62880679007 bp, sequencing depth=129, average/N50 read length=30172
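As a sanity check on the figures above, average depth can be estimated as total bases divided by genome size. The following is a minimal sketch, assuming the `k`/`m`/`g` suffixes mean 10^3/10^6/10^9; `parse_size` and `estimate_depth` are hypothetical helper names for illustration and are not part of NextDenovo.

```python
# Hypothetical helpers for illustration; not part of NextDenovo.
# Assumption: size suffixes are decimal (k = 10^3, m = 10^6, g = 10^9).
SUFFIXES = {"k": 10**3, "m": 10**6, "g": 10**9}


def parse_size(text: str) -> int:
    """Convert a size string such as '490m' into a number of bases."""
    text = text.strip().lower()
    if text and text[-1] in SUFFIXES:
        return int(float(text[:-1]) * SUFFIXES[text[-1]])
    return int(text)


def estimate_depth(total_bases: int, genome_size: int) -> float:
    """Average sequencing depth = total bases / genome size."""
    return total_bases / genome_size


total_bases = 62_880_679_007  # total base count from the report

# The report gives 490m as the genome size; the config uses 485m.
for size in ("490m", "485m"):
    depth = estimate_depth(total_bases, parse_size(size))
    print(f"genome_size={size}: ~{depth:.1f}x")
```

Both genome-size values give a depth close to the reported 129.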

Config file
[General]
job_type = local
job_prefix = nextDenovo
task = assemble # 'all', 'correct', 'assemble'
rewrite = yes # yes/no
deltmp = yes
rerun = 3
parallel_jobs = 8
input_type = corrected
read_type = ont
input_fofn = ./input.fofn
workdir = M_out

genome_size = 485m

[assemble_option]
minimap2_options_cns = -t 4
nextgraph_options = -a 1

Operating system Ubuntu 18.04 64bit

GCC gcc version 7.5.0

Python Python 3.6.9

NextDenovo nextDenovo v2.4.0

To Reproduce (Optional) none

Additional context (Optional) 32core 256G server
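For context on the resources involved, the config requests `parallel_jobs = 8` with `minimap2_options_cns = -t 4` on a 32-core, 256G machine. The sketch below is only rough arithmetic: it assumes each parallel job may use up to the `-t` thread count and that memory is shared evenly across jobs, neither of which is confirmed in this thread. `resource_budget` is a hypothetical helper, not a NextDenovo function.

```python
# Hypothetical helper for illustration; not part of NextDenovo.
# Assumptions: every parallel job may use up to `threads_per_job` threads,
# and available memory is divided evenly between the jobs.
def resource_budget(parallel_jobs: int, threads_per_job: int,
                    cores: int, mem_gb: float) -> dict:
    total_threads = parallel_jobs * threads_per_job
    return {
        "total_threads": total_threads,
        "oversubscribed": total_threads > cores,
        "mem_per_job_gb": mem_gb / parallel_jobs,
    }


# Values taken from the config file and the "Additional context" field.
budget = resource_budget(parallel_jobs=8, threads_per_job=4,
                         cores=32, mem_gb=256)
print(budget)
```

Under these assumptions the thread count matches the available cores exactly, and each job would have about 32 GB on average if all eight ran at once.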

moold commented 2 years ago

It seems you have too much data. I think you could run NextDenovo on the raw (uncorrected) data instead, which may be faster. As for the error you mentioned, I do not actually know why the print expression causes a MemoryError; I need more time to figure it out.

vanbie commented 2 years ago

> It seems you have too much data. I think you could run NextDenovo on the raw (uncorrected) data instead, which may be faster. As for the error you mentioned, I do not actually know why the print expression causes a MemoryError; I need more time to figure it out.

Thanks. The data was released only as corrected reads, so I can only download the clean data. The original report also used NextDenovo for the analysis, but did not mention many details.