privefl / bigstatsr

R package for statistical tools with big matrices stored on disk.
https://privefl.github.io/bigstatsr/

Question about memory #189

Open wbooker opened 2 months ago

wbooker commented 2 months ago

Could you give me a sense of whether the memory use of a big_spLinReg run seems appropriate and, if not, whether there is a way to reduce the memory requirements? My dataset is 300 individuals x 23,626,816 sites stored as a double FBM, and to run efficiently with many cores I need ~600 GB of memory. Does this seem correct to you? I am just wondering whether I am doing something wrong, or whether there are ways to reduce memory usage without sacrificing efficiency.

Thanks!

privefl commented 2 months ago

If my calculations are correct, the backingfile should take about 53 GB on disk (300 x 23,626,816 doubles at 8 bytes each), so you should not need much more memory than that. Using less memory should still be fine, since the model should work on only a subset of the data most of the time.
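
The 53 GB estimate can be checked with quick arithmetic. This is only a minimal sketch in Python, assuming 8 bytes per double and no header or padding in the backing file; the variable names are illustrative and not part of bigstatsr.

```python
# Rough on-disk size of a 300 x 23,626,816 matrix of doubles.
# Assumption: 8 bytes per value, no file header or padding.
n_individuals = 300
n_sites = 23_626_816
bytes_per_double = 8

total_bytes = n_individuals * n_sites * bytes_per_double
size_gb = total_bytes / 1e9        # decimal gigabytes
size_gib = total_bytes / 1024**3   # binary gibibytes

print(f"{total_bytes:,} bytes")
print(f"{size_gb:.1f} GB (decimal)")
print(f"{size_gib:.1f} GiB (binary)")
```

Under these assumptions the matrix comes to roughly 52.8 GiB (56.7 GB in decimal units), which matches the figure quoted above and is well below the ~600 GB reported in the question.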

privefl commented 1 day ago

Any update on this?