scibiome / boostdiff_inference

BoostDiff - Inference of differential networks from gene expression data
GNU General Public License v3.0

MemoryError #4

Open MusculusMus opened 1 year ago

MusculusMus commented 1 year ago

Hi,

When I test the demo data, everything looks good. But when I run my own data, with ~18000 genes and 4 samples under each condition, I get a MemoryError after 296 minutes of running.

Here are the details of my Windows laptop:

Processor: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz 2.21 GHz
Installed RAM: 32.0 GB (31.7 GB usable)
System type: 64-bit operating system, x64-based processor

The demo data contains only hundreds of genes, so have you tried whole transcriptome data? How can I avoid this kind of error? If 32 GB memory is not large enough, what's the right memory size for my study?

Shuai

MusculusMus commented 1 year ago

I solved the memory error by filtering the counts: a data frame with ~6000 genes and 12 samples per condition takes ~40 minutes to run. However, the output files are all empty. I used the default settings, and I am not sure what the right settings for my data set are:

n_estimators = 100
n_features = 50
n_subsamples = 50

Another possible explanation for the empty output files is that the gene names are mouse gene names. Is it okay to run mouse gene names through your boostdiff_inference module?
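For anyone hitting the same MemoryError, the count filtering mentioned above could look roughly like the sketch below. It is not part of BoostDiff; it only uses pandas, and the function name `filter_low_counts` and the thresholds `min_count` and `min_samples` are hypothetical choices for illustration. It assumes a table with one row per gene, a "Gene" column holding the gene names, and one column per sample.

```python
import pandas as pd


def filter_low_counts(counts: pd.DataFrame,
                      min_count: int = 10,
                      min_samples: int = 3) -> pd.DataFrame:
    """Keep genes with at least `min_count` reads in at least `min_samples` samples.

    Assumes a "Gene" column with gene names; every other column is a sample.
    """
    sample_cols = counts.columns.drop("Gene")
    # For each gene, count how many samples reach the threshold.
    n_passing = (counts[sample_cols] >= min_count).sum(axis=1)
    return counts.loc[n_passing >= min_samples].reset_index(drop=True)


# Toy example with made-up counts for four mouse genes and three samples.
counts = pd.DataFrame({
    "Gene": ["Actb", "Gapdh", "Xist", "Olfr1"],
    "s1": [500, 300, 0, 1],
    "s2": [450, 280, 2, 0],
    "s3": [520, 310, 0, 0],
})
filtered = filter_low_counts(counts, min_count=10, min_samples=2)
print(filtered["Gene"].tolist())
```

The thresholds are arbitrary here; a stricter filter leaves fewer genes and therefore a smaller problem, at the cost of dropping lowly expressed genes from the inferred network.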

Shuai

gihannagalindez commented 1 year ago

Hi Shuai,

Apologies for the late reply. Regarding the dataset, mouse data (or any gene expression dataset) can be used and should work, as long as the gene names/features are under a column named "Gene". Regarding memory usage, I would recommend using a cluster for really large human or mouse datasets. However, since your dataset has only 12 samples per condition, I would not expect BoostDiff to perform very well. Ideally, there would be at least 30 samples per condition because of how the method works. Nevertheless, I would be happy to look at a small sample dataset from you to find out why the output is empty in that case.
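The two requirements above (a "Gene" column and roughly 30 or more samples per condition) can be checked before starting a long run. The following is only a sketch using pandas, not a BoostDiff function: the helper name `check_expression_table` is hypothetical, and it assumes each condition is stored in its own table with genes as rows and every non-"Gene" column being one sample.

```python
import pandas as pd


def check_expression_table(df: pd.DataFrame, min_samples: int = 30) -> list:
    """Return a list of human-readable problems found in one condition's table."""
    problems = []
    if "Gene" not in df.columns:
        problems.append('missing a column named "Gene"')
    # Assumption: every column other than "Gene" is one sample.
    n_samples = sum(1 for col in df.columns if col != "Gene")
    if n_samples < min_samples:
        problems.append(
            f"only {n_samples} samples; at least {min_samples} are recommended"
        )
    return problems


# Toy table with 12 samples, matching the sample count in the question.
demo = pd.DataFrame({
    "Gene": ["Actb", "Gapdh"],
    **{f"s{i}": [1.0, 2.0] for i in range(12)},
})
problems = check_expression_table(demo)
for problem in problems:
    print(problem)
```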

Best, Gihanna