bigomics / playbase

Core back-end functionality and logic for OmicsPlayground

Reduce RAM footprint #117

Closed by ESCRI11 3 months ago

ESCRI11 commented 3 months ago

This PR aims to reduce the RAM footprint of PGX computation. It boils down to two changes:

  1. Use matrixStats for rank computation. Using apply was both slower and more memory-hungry. The updated operation still causes a peak in RAM usage, which could be reduced further by ranking the matrix in chunks.
  2. Tighter control over the multi-threading of fgsea and GSVA. Both functions can take up large chunks of memory when run in a multi-threaded configuration (crashing the compute workers). For that reason, I have set them to 1 thread in most scenarios. The exception is pgx.correlateSignatureH5, where 1 thread drastically hurts performance; keeping it at 2 threads gives decent performance and much safer RAM usage.
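
For reference, the chunked ranking mentioned in point 1 could look roughly like the sketch below. This is not the code in this PR: `rank_columns_chunked` and `chunk_size` are hypothetical names, and it assumes ranks are computed per column (per sample) with average ties. Only `matrixStats::colRanks` is a real library call.

```r
library(matrixStats)

# Hypothetical helper (not part of this PR): rank each column of X while
# processing `chunk_size` columns at a time, so the temporaries created by
# the ranking call never cover more than one chunk.
rank_columns_chunked <- function(X, chunk_size = 1000L, ties.method = "average") {
  stopifnot(is.matrix(X), chunk_size >= 1L)
  # Pre-allocate the result once; average-tie ranks are doubles.
  R <- matrix(NA_real_, nrow = nrow(X), ncol = ncol(X), dimnames = dimnames(X))
  if (ncol(X) == 0L) return(R)
  for (s in seq(1L, ncol(X), by = chunk_size)) {
    idx <- s:min(s + chunk_size - 1L, ncol(X))
    # preserveShape = TRUE keeps the rows-by-columns orientation of the input.
    R[, idx] <- colRanks(X[, idx, drop = FALSE],
                         ties.method = ties.method, preserveShape = TRUE)
  }
  R
}

# Sanity check against the apply-based version on a small random matrix.
set.seed(1)
X <- matrix(rnorm(200 * 50), nrow = 200, ncol = 50)
ref <- apply(X, 2, rank)
chunked <- rank_columns_chunked(X, chunk_size = 7L)
stopifnot(isTRUE(all.equal(ref, chunked, check.attributes = FALSE)))
```

The result matrix still has to fit in memory; chunking only bounds the extra working memory of each ranking call, so the right `chunk_size` would need to be tuned on real PGX data.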

These changes do not affect the results. @mauromiguelm, please double-check on your end.

mauromiguelm commented 3 months ago

Hi @ESCRI11, which methods are affected and should be tested?

ESCRI11 commented 3 months ago

@mauromiguelm Going by the changed files: 1) enrichment gsva, 2) extra drugs connectivity, and 3) extra experiment similarity.