cdt15 / lingam

Python package for causal discovery based on LiNGAM.
https://sites.google.com/view/sshimizu06/lingam
MIT License

LiM uses too much memory. #115

Open EsqYu opened 7 months ago

EsqYu commented 7 months ago

I tried to run the LiM code, but it failed for lack of memory. I set n_features to 10 and it used more than 120GB. Does this happen normally, or did I do something wrong? Are there any limitations on the number of variables?

YanaZeng commented 7 months ago

Hi, EsqYu. Thanks for pointing out this issue.

With 10 features it can indeed use a lot of memory. This is due to the local search procedure, which searches over the skeleton space: for every estimated edge, the code decides whether or not to reverse its causal direction. If there are d estimated edges, 2^d candidate graphs have to be evaluated.
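The blow-up described above can be sketched in a few lines of plain Python (an illustration only, not the package's actual search code): with a dense skeleton over 10 features there can be up to 45 edges, so the local search would face up to 2^45 orientation patterns.

```python
from itertools import product

def n_candidate_graphs(d_edges: int) -> int:
    # Each skeleton edge can keep or reverse its direction, so the
    # local search must score one candidate graph per orientation
    # pattern. Enumerate them explicitly to make the 2^d cost visible.
    return sum(1 for _ in product([0, 1], repeat=d_edges))

for d in (5, 10, 20):
    print(d, n_candidate_graphs(d))  # grows as 2^d
```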

To avoid this problem, you can skip the local search and use the global search only, by setting only_global=True.
Alternatively, you can set a higher w_threshold (e.g., 0.3 or 0.5 instead of the default 0.1) to prune estimated edges whose effects fall below the threshold, which shrinks d before the local search runs.

We will take time to rewrite our code to alleviate this problem. Sorry for the inconvenience. If you run into any other problems, please feel free to let us know.