Kinggerm / GetOrganelle

Organelle Genome Assembly Toolkit (Chloroplast/Mitochondrial/ITS)
GNU General Public License v3.0

MemoryError #60

Closed bioramg closed 3 years ago

bioramg commented 3 years ago

Hello, I am assembling a plant mitochondrial genome with the following command, but I am getting a MemoryError. The same parameters worked for another mitochondrial genome assembly.

get_organelle_from_reads.py -1 forward_paired.fq.gz -2 reverse_paired.fq.gz -o get_organelle/ -R 50 -P 1000000 -F embplant_mt --memory-save -w 95 --reduce-reads-for-coverage inf

log.txt

2020-11-18 09:37:23,901 - INFO: Counting read qualities ...
2020-11-18 09:37:24,110 - INFO: Identified quality encoding format = Illumina 1.8+
2020-11-18 09:37:24,111 - INFO: Trimming bases with qualities (0.00%): 33..33 !
2020-11-18 09:37:24,175 - INFO: Mean error rate = 0.0146
2020-11-18 09:37:24,176 - INFO: Counting read lengths ...
2020-11-18 09:41:21,922 - INFO: Mean = 151.0 bp, maximum = 151 bp.
2020-11-18 09:41:21,922 - INFO: Reads used = 72894963+72894963
2020-11-18 09:41:21,922 - INFO: Pre-reading fastq finished.

2020-11-18 09:41:21,922 - INFO: Making seed reads ...
2020-11-18 09:41:21,952 - INFO: Seed bowtie2 index existed!
2020-11-18 09:41:21,952 - INFO: Mapping reads to seed bowtie2 index ...
2020-11-18 10:14:08,892 - INFO: Mapping finished.
2020-11-18 10:14:08,893 - INFO: Seed reads made: ./../../raman/cma_mt/get_organelle/seed/embplant_mt.initial.fq (13572326 bytes)
2020-11-18 10:14:08,916 - INFO: Making seed reads finished.

2020-11-18 10:14:08,916 - INFO: Checking seed reads and parameters ...
2020-11-18 10:14:11,177 - INFO: Estimated embplant_mt-hitting base-coverage = 133.73
2020-11-18 10:14:11,665 - INFO: Setting '--max-extending-len inf'
2020-11-18 10:14:11,725 - INFO: Checking seed reads and parameters finished.

2020-11-18 10:14:11,725 - INFO: Making read index ...
2020-11-18 10:35:14,221 - INFO: Mem 19.971 G, 145789926 reads
2020-11-18 10:35:23,848 - INFO: Making read index finished.

2020-11-18 10:35:23,849 - INFO: Extending ...
2020-11-18 10:35:23,849 - INFO: Adding initial words ...
2020-11-18 10:35:25,549 - INFO: AW 1414726
2020-11-18 10:56:53,068 - INFO: Round 1: 145789926/145789926 AI 2547987 AW 20138238 Mem 4.131
2020-11-18 11:18:14,011 - INFO: Round 2: 145789926/145789926 AI 4002492 AW 29878554 Mem 5.502
2020-11-18 11:39:52,946 - INFO: Round 3: 145789926/145789926 AI 4844430 AW 49387612 Mem 9.185
2020-11-18 12:02:22,295 - INFO: Round 4: 145789926/145789926 AI 6248245 AW 85261724 Mem 16.365
2020-11-18 12:18:09,770 - ERROR: ound 5: 79960513/145789926 AI 8101461 AW 131499852
Traceback (most recent call last):
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 3879, in main
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 2371, in extending_no_lim
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 2217, in
  File "/home/raman/anaconda3/lib/python3.7/codecs.py", line 322, in decode
MemoryError
--- Logging error ---
Traceback (most recent call last):
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 3879, in main
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 2371, in extending_no_lim
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 2217, in
  File "/home/raman/anaconda3/lib/python3.7/codecs.py", line 322, in decode
MemoryError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/raman/anaconda3/lib/python3.7/logging/__init__.py", line 1025, in emit
  File "/home/raman/anaconda3/lib/python3.7/logging/__init__.py", line 869, in format
  File "/home/raman/anaconda3/lib/python3.7/logging/__init__.py", line 610, in format
  File "/home/raman/anaconda3/lib/python3.7/logging/__init__.py", line 550, in formatTime
MemoryError
Call stack:
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 4045, in
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 4036, in main
Message: ''
Arguments: ()
Traceback (most recent call last):
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 3879, in main
    echo_step=echo_step, log_handler=log_handler)
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 2371, in extending_no_lim
    this_c_seq = next(reads_generator)
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 2217, in
    reads_generator = (this_read.strip() for this_read in
  File "/home/raman/anaconda3/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
MemoryError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 4045, in
    main()
  File "/home/raman/anaconda3/bin/get_organelle_from_reads.py", line 4037, in main
    log_handler = simple_log(log_handler, out_base, prefix=options.prefix + "get_org.")
  File "/home/raman/anaconda3/lib/python3.7/site-packages/GetOrganelleLib/pipe_control_func.py", line 69, in simple_log
    logfile = logging.FileHandler(os.path.join(output_base, prefix + 'log.txt'), mode='a')
  File "/home/raman/anaconda3/lib/python3.7/logging/__init__.py", line 1087, in __init__
    StreamHandler.__init__(self, self._open())
  File "/home/raman/anaconda3/lib/python3.7/logging/__init__.py", line 1116, in _open
    return open(self.baseFilename, self.mode, encoding=self.encoding)
MemoryError

Kinggerm commented 3 years ago

It's hitting the memory limit of your machine or environment, even in --memory-save mode. Please increase the available memory or the word size.

Also keep in mind that, given the relatively low estimated base coverage, too large a word size risks an incomplete assembly.
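To make the trade-off concrete, here is a rough sketch of the arithmetic. It is a toy model, not GetOrganelle's actual code: it assumes a read of length L can contribute at most L - w + 1 words of size w, and it ignores reverse complements, duplicates, and everything else the extension step does. The function names are hypothetical; the read length (151 bp) and the `-w 95` value come from the log and command above.

```python
def words_per_read(read_len: int, word_size: int) -> int:
    """Upper bound on the number of words of length `word_size` in one read."""
    return max(read_len - word_size + 1, 0)


def sliding_words(read: str, word_size: int) -> set:
    """Collect the distinct words of length `word_size` found in `read`."""
    return {read[i:i + word_size] for i in range(len(read) - word_size + 1)}


if __name__ == "__main__":
    # Compare a few word sizes for the 151 bp reads reported in the log.
    for w in (77, 95, 120):
        print(f"-w {w}: at most {words_per_read(151, w)} words per 151 bp read")
```

Under this model a larger word size means fewer words per accepted read, which is why it can reduce memory. The same arithmetic hints at the risk: two reads can only share a word if they overlap by at least w bases, and such long overlaps become rarer when coverage is low.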

bioramg commented 3 years ago

Thank you. It's working fine now.

sabrtoothcat commented 2 years ago

Hi, I have the same problem as above: a MemoryError is raised after Round 3. I have increased my memory to 128 GB, but I still get the same error. Please advise.

get_organelle_from_reads.py -1 forward_paired.fq.gz -2 reverse_paired.fq.gz -o mitochondria_output6 -F embplant_mt -R 50 -k 21,45,65,85,105 --verbose --keep-temp

../

2021-10-21 13:36:09,253 - INFO: Checking seed reads and parameters ...
2021-10-21 13:36:09,253 - INFO: The automatically-estimated parameter(s) do not ensure the best choice(s).
2021-10-21 13:36:09,253 - INFO: If the result graph is not a circular organelle genome,
2021-10-21 13:36:09,253 - INFO: you could adjust the value(s) of '-w'/'-R' for another new run.
2021-10-21 13:36:24,922 - INFO: Pre-assembling mapped reads ...
2021-10-21 13:36:24,960 - INFO: spades.py -t 24 -s mitochondria_output6/seed/embplant_mt.initial.fq -k 45 --only-assembler -o /mnt/lustre/users/clee/outputs/mitochondria_output6/seed/embplant_mt.initial.fq.spad$
2021-10-21 13:36:55,235 - INFO: /apps/chpc/bio/python/3.7.4_gcc610/lib/python3.7/site-packages/GetOrganelle-1.7.5-py3.7.egg/EGG-INFO/scripts/slim_graph.py --verbose --log -t 24 --wrapper /mnt/lustre/users/clee$
2021-10-21 13:41:30,369 - INFO: Pre-assembling mapped reads finished.
2021-10-21 13:41:30,370 - INFO: Estimated embplant_mt-hitting base-coverage = 193.33
2021-10-21 13:41:30,813 - INFO: Estimated word size(s): 77
2021-10-21 13:41:30,813 - INFO: Setting '-w 77'
2021-10-21 13:41:30,814 - INFO: Setting '--max-extending-len inf'
2021-10-21 13:41:31,618 - INFO: Checking seed reads and parameters finished.

2021-10-21 13:41:31,618 - INFO: Making read index ...
2021-10-21 14:04:13,054 - INFO: Mem 21.338 G, 138858829 candidates in all 150000000 reads
2021-10-21 14:04:13,422 - INFO: Pre-grouping reads ...
2021-10-21 14:04:13,423 - INFO: Setting '--pre-w 77'
2021-10-21 14:04:26,812 - INFO: Mem 19.97 G, 200000/1625482 used/duplicated
2021-10-21 14:05:17,510 - INFO: Mem 20.7 G, 6422 groups made.
2021-10-21 14:05:54,666 - INFO: Making read index finished.

2021-10-21 14:05:54,668 - INFO: Extending ...
2021-10-21 14:05:54,668 - INFO: Adding initial words ...
2021-10-21 14:06:08,000 - INFO: AW 12735358
2021-10-21 14:22:00,384 - INFO: Round 1: 138858829/138858829 AI 4184650 AW 78464614 Mem 12.582
2021-10-21 14:42:17,492 - INFO: Round 2: 138858829/138858829 AI 15724477 AW 307824044 Mem 47.905
2021-10-21 15:04:42,641 - INFO: Round 3: 138858829/138858829 AI 27006080 AW 555018070 Mem 87.664
2021-10-21 15:16:40,002 - ERROR: Traceback (most recent call last):

sabrtoothcat commented 2 years ago

Traceback (most recent call last):
  File "/apps/chpc/bio/python/3.7.4_gcc610/lib/python3.7/site-packages/GetOrganelle-1.7.5-py3.7.egg/EGG-INFO/scripts/get_organelle_from_reads.py", line 4016, in main
    echo_step=echo_step, log_handler=log_handler)
  File "/apps/chpc/bio/python/3.7.4_gcc610/lib/python3.7/site-packages/GetOrganelle-1.7.5-py3.7.egg/EGG-INFO/scripts/get_organelle_from_reads.py", line 2474, in extending_no_lim
    accepted_words.add(this_c_seq[temp_length - i:seq_len - i])
MemoryError
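The growth visible in the Round lines of the log can be tabulated with a short script, which may help judge whether the next round is likely to fit in RAM. This is a minimal sketch, not part of GetOrganelle: it assumes the line layout seen in this thread ("Round N: x/y AI a AW b Mem m") and assumes the Mem value is in gigabytes, which the Round lines themselves do not state. The regex and function names are hypothetical.

```python
import re

# Matches e.g. "Round 2: 138858829/138858829 AI 15724477 AW 307824044 Mem 47.905"
ROUND_RE = re.compile(r"Round (\d+): (\d+)/(\d+) AI (\d+) AW (\d+) Mem ([\d.]+)")


def parse_rounds(log_text):
    """Extract round number, AW count, and Mem value from completed rounds."""
    return [
        {
            "round": int(m.group(1)),
            "accepted_words": int(m.group(5)),
            "mem": float(m.group(6)),  # assumed to be gigabytes
        }
        for m in ROUND_RE.finditer(log_text)
    ]


def growth_factors(rows):
    """Ratio of Mem between each pair of consecutive rounds."""
    return [later["mem"] / earlier["mem"] for earlier, later in zip(rows, rows[1:])]


# Round lines copied from the log above.
LOG = """\
2021-10-21 14:22:00,384 - INFO: Round 1: 138858829/138858829 AI 4184650 AW 78464614 Mem 12.582
2021-10-21 14:42:17,492 - INFO: Round 2: 138858829/138858829 AI 15724477 AW 307824044 Mem 47.905
2021-10-21 15:04:42,641 - INFO: Round 3: 138858829/138858829 AI 27006080 AW 555018070 Mem 87.664
"""

rows = parse_rounds(LOG)
factors = growth_factors(rows)

if __name__ == "__main__":
    for row in rows:
        print(row["round"], row["accepted_words"], row["mem"])
    print("Mem growth per round:", [round(f, 2) for f in factors])
```

Only rounds that finished and logged a Mem value are matched, so the interrupted round that triggered the MemoryError is not included. If Mem keeps growing at a similar rate from the last completed round, that gives a rough lower bound on what the failing round needed.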