lissettegomez / coMethDMR


A question about RAM needed for running the CoMethAllRegions() function #8

Open QianhuiWan opened 3 years ago

QianhuiWan commented 3 years ago

Hello, I have tested the CoMethAllRegions() function with a small subset of our EPIC data, and I assume it will need more RAM to process the full EPIC array data. Is it possible to process EPIC array data (gene-related probes) on a machine with 8 cores and 16GB of RAM? Thank you.

gabrielodom commented 3 years ago

Hi Qianhui, Thank you for taking an interest in our package. I believe it may be possible, but with some considerations:

  1. The work should be done using serial computing. If you are worried about memory overflow issues, parallel computing would probably not work because it requires additional memory to communicate between master and worker nodes.
  2. The memory use for EPIC arrays (and 450k for that matter) is entirely dependent on your sample size. How many samples do you have?
  3. It may be worthwhile to break the data up by chromosome and run CoMethAllRegions() independently for each chromosome.

I hope to hear back from you soon :) Cheers, Gabriel
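The by-chromosome approach in point 3 can be sketched in base R as follows. This is a minimal illustration and not the package's own workflow: the `betas` matrix, the `annot` table and its column names are hypothetical toy stand-ins, and the call to CoMethAllRegions() is only marked with a comment because its arguments depend on your data.

```r
# Hypothetical toy inputs: a probes-by-samples beta matrix and a probe
# annotation table with one chromosome label per probe.
probes <- paste0("cg", sprintf("%08d", 1:12))
betas <- matrix(
  seq(0.05, 0.60, length.out = 48), nrow = 12,
  dimnames = list(probes, paste0("sample", 1:4))
)
annot <- data.frame(
  probe = probes,
  chr = rep(c("chr1", "chr21", "chr22"), times = c(5, 3, 4)),
  stringsAsFactors = FALSE
)

# Group probe IDs by chromosome; this stores only character vectors,
# not copies of the methylation data.
probesByChr <- split(annot$probe, annot$chr)

# Visit the chromosomes one at a time (serially), smallest first.
chrOrder <- names(sort(lengths(probesByChr)))
for (chr in chrOrder) {
  betasChr <- betas[probesByChr[[chr]], , drop = FALSE]
  # The per-chromosome analysis (e.g. CoMethAllRegions()) would go here,
  # with results written to disk before moving on.
  message(chr, ": ", nrow(betasChr), " probes x ", ncol(betasChr), " samples")
  rm(betasChr)
  invisible(gc())  # release the subset before taking the next one
}
```

Keeping only one chromosome's subset in memory at a time, and saving each result before continuing, is what limits peak memory use in this sketch.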

QianhuiWan commented 3 years ago

Hi Gabriel,

Thank you so much for your prompt reply and suggestions. We have EPIC array data from 131 samples, so I think I will run the analysis by chromosome.

Best regards, Qianhui
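For a rough sense of scale with 131 samples, the size of a single numeric matrix can be estimated from its dimensions. The sketch below assumes 8 bytes per value (R's double type) and uses about 866,000 probes as an approximate figure for a full EPIC array; the helper name `estimateMatrixGiB` is hypothetical. The estimate covers one copy of the data only, so intermediate copies made during an analysis would add to it.

```r
# Approximate size, in GiB, of a dense numeric matrix of doubles.
estimateMatrixGiB <- function(nProbes, nSamples, bytesPerValue = 8) {
  nProbes * nSamples * bytesPerValue / 1024^3
}

# Assumed inputs: ~866,000 probes (approximate EPIC size) and 131 samples.
fullArrayGiB <- estimateMatrixGiB(nProbes = 866000, nSamples = 131)
message("One copy of the full matrix: about ", round(fullArrayGiB, 2), " GiB")
```

Under these assumptions a single copy of the full matrix comes to a little under 1 GiB, and a gene-related subset of the probes would be smaller.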

gabrielodom commented 3 years ago

Try one of the smaller chromosomes first, and record how long it takes to complete. Please let us know how it goes.
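One simple way to record the run time is base R's `system.time()`. In this sketch, `runOneChromosome()` is a hypothetical placeholder that does some arbitrary work; in practice it would be replaced by the real analysis of a single chromosome.

```r
# Hypothetical placeholder for the real single-chromosome analysis.
runOneChromosome <- function() {
  x <- matrix(rnorm(1e5), ncol = 100)
  cor(x)
}

# Time the run and keep the wall-clock seconds.
timing <- system.time(result <- runOneChromosome())
elapsedSec <- timing[["elapsed"]]
message("Elapsed: ", round(elapsedSec, 2), " s")

# Size of the returned object, as a rough guide to what must fit in RAM.
message("Result size: ", format(object.size(result), units = "MB"))
```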

QianhuiWan commented 3 years ago

Hi Gabriel,

I used 6 cores and 16GB of memory for the gene-related probes (EPIC array) on chr22. I could not get a result from the GetResiduals() function because 16GB of memory is not enough, so I will probably need to use an HPC system to process these data. Thank you.

Best regards, Qianhui