ruanjue / wtdbg2

Redbean: A fuzzy Bruijn graph approach to long noisy reads assembly
GNU General Public License v3.0

Unexpected out of memory error #198

Closed skagawa2 closed 4 years ago

skagawa2 commented 4 years ago

Hi, I am trying to assemble a fish genome with a size of 0.75 Gb (max) from 160 Gb of raw data generated on a PacBio Sequel II, but wtdbg2 fails with an error asking me to allocate more RAM. I have 1 TB of RAM. On the wtdbg2 GitHub page I found that the Human CHM1 assembly (3 Gb genome, PB x60), run with the parameters -x rs -g3g -t96, took only 225.1 GB of RAM, and that the Axolotl assembly (32 Gb genome, which is really huge) at 32x coverage required 1.78 TB of RAM. So I think having 1 TB of RAM should not be the problem. Is this a bug, or does wtdbg2 actually require more than 1 TB of RAM for my job? I tried to reduce the number of short alignments by increasing -l up to 10000. I also tried reducing the sequencing coverage to 45 (-X 45). Here is the error I get when I use the raw data:

-- 64 cores
-- Starting program: wtdbg2 -x sq -g 700m -i subreads.fq.gz -t 48
-- pid                     
-- date         Fri Mar 27 17:05:27 2020
--
[Fri Mar 27 17:05:27 2020] loading reads
5151581 reads
[Fri Mar 27 17:29:15 2020] filtering from 5151581 reads (>=5000 bp), 125628797952 bp. Try selecting 35000000000 bp
[Fri Mar 27 17:29:19 2020] Done, 437553 reads (>=5000 bp), 35000028672 bp, 136718640 bins
** PROC_STAT(0) **: real 1432.282 sec, user 3023.800 sec, sys 369.710 sec, maxrss 32698008.0 kB, maxvsize 33025348.0 kB
[Fri Mar 27 17:29:19 2020] Set --edge-cov to 3
KEY PARAMETERS: -k 15 -p 0 -K 1000.049988 -A -S 2.000000 -s 0.050000 -g 700000000 -X 50.000000 -e 3 -L 5000
[Fri Mar 27 17:29:19 2020] generating nodes, 48 threads
[Fri Mar 27 17:29:19 2020] indexing bins[(0,136718640)/136718640] (34999971840/124971325184 bp), 48 threads
[Fri Mar 27 17:29:20 2020] - scanning kmers (K15P0S2.00) from 136718640 bins
136718640 bins
********************** Kmer Frequency **********************
   |
   ||
  |||
  ||||
  ||||
  |||||
  ||||||
  ||||||
  |||||||
 |||||||||
 ||||||||||
 |||||||||||
 |||||||||||||
 |||||||||||||||
 ||||||||||||||||||
 |||||||||||||||||||||||
 ||||||||||||||||||||||||||||||
 |||||||||||||||||||||||||||||||||||||||
 |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
**********************     1 - 201    **********************
Quatiles:
   10%   20%   30%   40%   50%   60%   70%   80%   90%   95%
    19    61   176   477  1231  3203  7289 19752 57209 65535
** PROC_STAT(0) **: real 1720.932 sec, user 14174.120 sec, sys 837.200 sec, maxrss 38354888.0 kB, maxvsize 43624396.0 kB
[Fri Mar 27 17:34:08 2020] - high frequency kmer depth is set to 65535
[Fri Mar 27 17:34:08 2020] - Total kmers = 264917873
[Fri Mar 27 17:34:08 2020] - average kmer depth = 55
[Fri Mar 27 17:34:08 2020] - 10122235 low frequency kmers (<2)
[Fri Mar 27 17:34:08 2020] - 0 high frequency kmers (>65535)
[Fri Mar 27 17:34:08 2020] - indexing 254812883 kmers, 14237195250 instances (at most)
136718640 bins
[Fri Mar 27 17:44:05 2020] - indexed  254812883 kmers, 14233134086 instances
[Fri Mar 27 17:44:07 2020] - masked 0 bins as closed
[Fri Mar 27 17:44:07 2020] - sorting
** PROC_STAT(0) **: real 2348.617 sec, user 41551.590 sec, sys 1659.960 sec, maxrss 122269668.0 kB, maxvsize 130155392.0 kB
[Fri Mar 27 17:44:35 2020] Done
0|0 -- Out of memory, try to allocate 34359738368 bytes, old size 17179869184, old addr 0x7e5d002ad010 in encap_list -- mem_share.h:219 --
/usr/local/bin/wtdbg2[0x4028a1]
/usr/local/bin/wtdbg2[0x4384f9]
/usr/local/bin/wtdbg2[0x43b357]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f56f1f276db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f56f1a2988f]

ruanjue commented 4 years ago

No, wtdbg2 tried to allocate 34 GB of RAM, but the allocation failed.
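For reference, the sizes in the error message can be converted with a few lines of Python. This is only a sketch of the arithmetic, using numbers copied from the log above. The constant and function names are made up for illustration, and it assumes the "kB" in the PROC_STAT lines means 1024-byte units.

```python
# Values copied verbatim from the log above.
REQUESTED_BYTES = 34359738368  # "try to allocate ... bytes"
OLD_BYTES = 17179869184        # "old size"
PEAK_RSS_KB = 122269668        # last "maxrss" reported before the failure

GIB = 1024 ** 3


def bytes_to_gib(n_bytes):
    """Convert a byte count to binary gibibytes."""
    return n_bytes / GIB


def kb_to_gib(n_kb):
    """Convert kB to GiB, assuming 1 kB = 1024 bytes (an assumption)."""
    return n_kb * 1024 / GIB


if __name__ == "__main__":
    print(f"requested: {bytes_to_gib(REQUESTED_BYTES):.1f} GiB "
          f"({REQUESTED_BYTES / 1e9:.1f} GB decimal)")
    print(f"old size:  {bytes_to_gib(OLD_BYTES):.1f} GiB")
    print(f"growth:    x{REQUESTED_BYTES / OLD_BYTES:.1f}")
    print(f"peak RSS:  {kb_to_gib(PEAK_RSS_KB):.1f} GiB")
```

The failed request is a doubling of one buffer from 16 GiB to 32 GiB, at a point where the process had used roughly 117 GiB of resident memory, so the numbers in the log are far below 1 TB.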

skagawa2 commented 4 years ago

I have 1 TB of RAM available. Is there something else I am not checking for when looking at available RAM amounts?
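One thing that might be worth ruling out is a per-process limit: an allocation can fail even when the machine has free RAM if the process runs under a restrictive address-space limit (for example one set with ulimit -v, or by a job scheduler or container). The following is a minimal sketch, assuming a Linux host, that prints the limits seen by the current process along with a few fields from /proc/meminfo. The helper names are hypothetical and nothing here is part of wtdbg2.

```python
import resource


def format_limit(value):
    """Render an rlimit value as 'unlimited' or a size in GiB."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return f"{value / 1024**3:.1f} GiB"


def read_meminfo(path="/proc/meminfo"):
    """Parse /proc/meminfo into {field: number}; empty dict if unreadable."""
    info = {}
    try:
        with open(path) as fh:
            for line in fh:
                key, _, rest = line.partition(":")
                fields = rest.split()
                if fields:
                    info[key] = int(fields[0])  # most fields are in kB
    except OSError:
        pass
    return info


def report():
    # Limits inherited by this process (and by anything it launches).
    for name in ("RLIMIT_AS", "RLIMIT_DATA"):
        soft, hard = resource.getrlimit(getattr(resource, name))
        print(f"{name}: soft={format_limit(soft)} hard={format_limit(hard)}")
    # System-wide view of memory and overcommit accounting.
    mem = read_meminfo()
    for key in ("MemTotal", "MemAvailable", "CommitLimit", "Committed_AS"):
        if key in mem:
            print(f"{key}: {mem[key] / 1024**2:.1f} GiB")


if __name__ == "__main__":
    report()
```

Running this in the same shell or job environment that launches wtdbg2 shows whether a limit well below the physical RAM applies there. If both limits are reported as unlimited, this particular explanation can probably be excluded.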

ruanjue commented 4 years ago

There is an older issue about this problem: https://github.com/ruanjue/wtdbg2/issues/87. I am not sure whether he/she solved it, @xiaoyezao.