bonsai-team / matam

Mapping-Assisted Targeted-Assembly for Metagenomics
GNU Affero General Public License v3.0

Memory Error #96

Closed TakacsBertalan closed 4 years ago

TakacsBertalan commented 4 years ago

During a MATAM assembly run, I received a memory error at the scaffolding step:

```
Traceback (most recent call last):
  File "/home/deltagene/anaconda3/opt/matam-v1.5.3/scripts/generate_scaffolding_blast.py", line 18, in read_tab_file_handle_sorted
    tab = l.split()
MemoryError

During handling of the above exception, another exception occurred:

MemoryError
Killed
CRITICAL - The last command returns a non-zero return code: 137
```

My data is a 144 MB fastq file, generated in silico from the 16S rRNA data of 30 species (10,000 reads per species), taken from a database different from the MATAM reference database. The hard drive is a 2 TB HDD. MATAM used 12 CPU threads, and the max memory was set to 30000 (see the appended log file: matam_memory_error.log). It seems that the root of the problem is a ~650 GB .sam file in the workdir.

Is this a bug, and if it is not, is there something I can do to complete the run? Best regards, Bertalan Takács
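A quick way to check whether a single contig dominates an oversized SAM file like this is to stream it and count alignments per reference name. A minimal sketch (not part of MATAM; it assumes standard SAM layout, with RNAME in column 3):

```python
from collections import Counter

def count_alignments_per_ref(sam_path, top_n=10):
    """Stream a SAM file and count alignments per reference (RNAME),
    without ever loading the whole file into memory."""
    counts = Counter()
    with open(sam_path) as handle:
        for line in handle:
            if line.startswith('@'):  # skip header lines
                continue
            fields = line.split('\t', 3)  # only the first columns are needed
            counts[fields[2]] += 1
    return counts.most_common(top_n)
```

If one reference accounts for most of the ~650 GB file, that would point to the pathological case described below in the thread.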

ppericard commented 4 years ago

Hi @TakacsBertalan,

This is not something we want to happen, but it may be expected behavior given your input files and our scaffolding algorithm. When splitting the SAM file, we do not limit Python's memory usage, because we need to load all alignments from the same contig at once.
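That "one contig at a time" pattern can be sketched with `itertools.groupby` over a SAM file sorted by reference: only a single contig's alignments ever reside in memory, but a contig with millions of alignments still inflates that one group. This is an illustrative sketch, not MATAM's actual code:

```python
import itertools

def alignments_by_contig(sam_handle):
    """Yield (contig, alignments) pairs from a SAM stream sorted by
    reference name, holding only one contig's alignments in memory."""
    data_lines = (line.rstrip('\n').split('\t')
                  for line in sam_handle
                  if not line.startswith('@'))  # skip SAM header
    # Column 3 (index 2) of a SAM record is RNAME, the reference name
    for contig, records in itertools.groupby(data_lines, key=lambda f: f[2]):
        yield contig, list(records)  # this list is what can blow up
```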

Are you using a job manager on a cluster (something like SGE or SLURM)? The memory error and subsequent kill probably do not come from MATAM itself but from the job manager, which detected that MATAM used more RAM than you requested.

Could you try asking the job manager for more RAM for the scaffolding step? Given your ~100,000 contigs and 650 GB SAM file, I would start with 100 GB RAM and increase if needed.

In the future, we will try to impose stricter limits on MATAM's RAM usage. Right now, the --max_memory option is not a hard limit but rather a suggestion that we try to honor in every MATAM module.
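For reference, a hard per-process cap (as opposed to a suggestion) can be approximated on Linux with the standard-library `resource` module. This is a hypothetical sketch of what such a limit could look like; MATAM's --max_memory does not currently do this:

```python
import resource

def set_hard_memory_cap(max_mb):
    """Cap this process's address space (Linux) so an over-allocation
    raises MemoryError inside Python instead of the OOM killer sending
    SIGKILL (the exit code 137 seen in the log). Sketch only."""
    limit_bytes = max_mb * 1024 * 1024
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))
```

With such a cap in place, a module could catch MemoryError and fail gracefully rather than being killed by the kernel.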

Pierre

TakacsBertalan commented 4 years ago

Hi Pierre! Thank you for the fast reply! I am using a single PC with 64 GB RAM, so I am restricted to that. After getting this error the first time, I tried doubling max_memory to 60000 and received the same error message again, but without the part about the Python script: "Killed CRITICAL - The last command returns a non-zero return code: 137". Given that this is almost the maximum memory on my computer, is it safe to assume that I cannot finish this assembly with my current resources? Thanks, Bertalan Takács

ppericard commented 4 years ago

I think your assumption is correct, and you might not be able to finish this assembly given your resources. However, this behavior is a bit unusual, because it would mean that one of your contigs aligns against thousands or millions of rRNA reference sequences. You would probably need 100 to 200 GB of RAM to finish this run.

It's a shame that it's not working on your data, because loading that many alignments for a single contig is not critical for MATAM to assemble the final sequences correctly. We could probably reduce the memory usage of this step considerably. We were already aware of this problem, and we're going to make it a priority to develop a fix.

By any chance, could you make your input fastq file available to us so we could use it as a benchmark in the future?

Pierre

TakacsBertalan commented 4 years ago

Certainly. The file is in my public repository "in_silico_data". Thank you for your help! Bertalan Takács

ppericard commented 4 years ago

Closing the issue for now. We have opened issue #97 about improving the memory usage of the scaffolding step.