stephenslab / susieR

R package for "sum of single effects" regression.
https://stephenslab.github.io/susieR

High memory usage in susie, susie_suff_stat and susie_rss #141

Closed: pcarbo closed this issue 2 years ago

pcarbo commented 2 years ago

It appears that susie_rss (which calls susie_suff_stat) is using more memory than it needs to, which will limit fine-mapping of large numbers of SNPs.

pcarbo commented 2 years ago

Anecdotally, it looks like we can cut the memory usage in half by skipping the call to is_symmetric_matrix(XtX).
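One way to keep a symmetry check without the memory cost is to compare the matrix against its transpose one block of columns at a time. The following is a minimal sketch only: it assumes the extra memory comes from a check that materializes a full transposed copy of XtX, which this thread does not confirm. The function name `is_symmetric_blockwise` and its `block_size` and `tol` arguments are hypothetical and not part of susieR. It also assumes the matrix has no missing values and uses an absolute tolerance.

```r
# Hypothetical helper (not part of susieR): test whether a square numeric
# matrix is symmetric by comparing blocks of columns against the matching
# blocks of rows, so only block-sized temporaries are allocated rather
# than a full transposed copy. Assumes A contains no missing values.
is_symmetric_blockwise <- function (A, block_size = 256, tol = 1e-8) {
  if (!is.matrix(A) || nrow(A) != ncol(A))
    return(FALSE)
  n <- nrow(A)
  if (n == 0)
    return(TRUE)
  for (s in seq(1, n, by = block_size)) {
    cols <- s:min(s + block_size - 1, n)
    # A[, cols] and t(A[cols, ]) are both n x length(cols).
    d <- A[, cols, drop = FALSE] - t(A[cols, , drop = FALSE])
    if (any(abs(d) > tol))
      return(FALSE)
  }
  return(TRUE)
}

# Small usage example with simulated data.
set.seed(1)
X   <- matrix(rnorm(200 * 50), 200, 50)
XtX <- crossprod(X)
ok  <- is_symmetric_blockwise(XtX, block_size = 16)
```

A smaller `block_size` lowers the peak size of the temporaries at the cost of more loop iterations.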

pcarbo commented 2 years ago

The calls to get_purity also bump the memory usage back up quite a bit. So I think taking care of both is_symmetric_matrix and get_purity will help a lot.

pcarbo commented 2 years ago

It also seems there is potential to reduce the memory usage in susie itself: I ran susie with a 12,000 x 4,000 matrix X, which is about 0.4 GB in size, yet the total memory used by running susie was 2 GB, roughly five times the size of X.
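The figures above can be checked with a quick calculation. This sketch assumes X is stored as a dense double-precision matrix (8 bytes per entry), which the thread does not state explicitly; the variable names are for illustration only.

```r
# Expected size of a dense double-precision matrix: 8 bytes per entry
# (assumption: X is a standard dense numeric matrix).
n <- 12000
p <- 4000
x_gb <- n * p * 8 / 1e9       # size of X in GB (decimal)

# Total memory reported for the susie run, in GB.
total_gb <- 2

# How many copies of X the reported total corresponds to.
copies <- total_gb / x_gb
```

Under that assumption, X takes about 0.38 GB, so the reported 2 GB corresponds to a little over five copies of X.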

pcarbo commented 2 years ago

I've created a branch reduce-memory-usage to make progress on this.

pcarbo commented 2 years ago

As of version 0.11.63 (see the reduce-memory-usage branch), I have made a few improvements to memory usage in the ELBO calculations and in the susie preprocessing steps; these avoid roughly two duplications of the X matrix in memory. For example, for a 16,000 x 8,000 matrix X, the memory usage goes from 4.7 GB down to 3 GB. A bonus is that the total runtime of susie also dropped considerably, from 207 s to 59 s. (All tests pass.)