LohseLab / gIMble

A genome-wide IM blockwise likelihood estimation toolkit
GNU General Public License v3.0

too much memory needed during gimble windows #131

Closed: XieHongX closed this issue 7 months ago

XieHongX commented 7 months ago

Hi,

I am analysing a data set with a sample size of 18+17 and a genome of ~1.5 Gb. I used a block length of 145 bp because the genetic diversity of my species is relatively low. I then wanted to use a window size of 500 as in your original paper, so I ran gimble windows with -w 150000 -s 30000. It fails with an out-of-memory error (the machine has 512 GB of memory in total!). Should I increase or decrease the window size to reduce memory usage?

Alternatively, is there a way to reduce memory usage while keeping the desired sample size?

Best, Hongxin

XieHongX commented 7 months ago

The error message is:

```
... gimble.py", line 1014, in tally_variation
    mutuples = np.concatenate(
               ^^^^^^^^^^^^^^^
numpy.core._exceptions._ArrayMemoryError: Unable to allocate 188. GiB for an array with shape (5038150000, 5) and data type int64
```
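For reference, the reported allocation size follows directly from the array shape and dtype in the traceback. Below is a minimal sketch of that arithmetic, using only the numbers from the error message above (no gimble code is involved); it also notes why peak usage during np.concatenate can exceed this single allocation.

```python
import numpy as np

# Numbers taken from the traceback above: an int64 array of shape
# (5038150000, 5). Each int64 element is 8 bytes.
shape = (5_038_150_000, 5)
itemsize = np.dtype(np.int64).itemsize  # 8 bytes

n_bytes = shape[0] * shape[1] * itemsize
print(f"{n_bytes / 2**30:.1f} GiB")  # ~187.7 GiB, i.e. the reported 188 GiB

# np.concatenate allocates this output array while its input arrays are
# still held in memory, so peak usage during the call is higher than the
# 188 GiB allocation alone.
```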