CompEpigen / methrix

An R package for fast and flexible DNA methylation analysis
https://www.bioconductor.org/packages/release/bioc/html/methrix.html

Using get_region_summary with very large methrix object #30

Closed richardheery closed 1 month ago

richardheery commented 2 years ago

Hi, I am attempting to use get_region_summary with a very large methrix object (almost 400 human samples) and a GRanges object with about 80,000 regions. The call consistently uses all the RAM on my machine and is then killed. The problem persists even on a workstation with almost 200 GB of RAM. I guess I can reduce the RAM usage with the n_chunks parameter, but is there a rough guide for how many chunks to use depending on the number of samples and the number of regions being evaluated?

tkik commented 1 year ago

I am sorry, I somehow missed this issue. Are you using the HDF5 (DelayedArray) version of the object? If you are keeping the object in memory, it can easily fill that much RAM. Also, please avoid using multiple cores if memory is an issue; that might help too. There is no general rule for the number of chunks, but it should be larger than the number of cores used. With 80,000 regions, you can go up to hundreds of chunks or more. Please let me know if you managed to solve it. I can look into it in more detail if needed.
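
For a sense of scale on the in-memory case mentioned above, here is a rough back-of-the-envelope sketch in Python. It is only an illustration: the CpG count (about 28 million for a human genome) and the bytes per value (8 for a double-precision methylation matrix, 4 for an integer coverage matrix) are assumptions, not figures from this thread or from methrix itself. The real footprint depends on how the object was built and filtered.

```python
# Rough size estimate for dense, in-memory matrices (CpG sites x samples).
# All constants below are ASSUMPTIONS for illustration, not methrix internals.

def dense_matrix_gib(n_rows: int, n_cols: int, bytes_per_value: int) -> float:
    """Size in GiB of a dense n_rows x n_cols matrix with fixed-width values."""
    return n_rows * n_cols * bytes_per_value / 2**30

N_CPGS = 28_000_000   # assumed number of CpG sites in a human genome
N_SAMPLES = 400       # "almost 400 human samples" from the question

beta_gib = dense_matrix_gib(N_CPGS, N_SAMPLES, 8)      # assumed 8-byte doubles
coverage_gib = dense_matrix_gib(N_CPGS, N_SAMPLES, 4)  # assumed 4-byte integers
total_gib = beta_gib + coverage_gib

print(f"methylation matrix: {beta_gib:.1f} GiB")
print(f"coverage matrix:    {coverage_gib:.1f} GiB")
print(f"both matrices:      {total_gib:.1f} GiB")
```

Under these assumptions, the two matrices alone come to well over 100 GiB before any temporary copies are made during the summary, which is consistent with a 200 GB workstation running out of memory.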

tkik commented 1 month ago

Closing due to inactivity.