py-why / EconML

ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.
https://www.microsoft.com/en-us/research/project/alice/

cannot allocate memory with CausalAnalysis().fit() #707

Open krishpn opened 1 year ago

krishpn commented 1 year ago
OS Architecture:

Operating System: Ubuntu 20.04.4 LTS Kernel: Linux 5.15.0-53-generic Architecture: x86-64


With a DataFrame of size `(rows, columns) = (9000, 102)`, the `fit` method often fails with "cannot allocate memory":

from econml.solutions.causal_analysis import CausalAnalysis
import warnings

warnings.simplefilter(action='ignore')
ca = CausalAnalysis(feature_inds=top_features, categorical=categorical,
                    heterogeneity_inds=None, classification=True,
                    nuisance_models="automl", heterogeneity_model="forest",
                    n_jobs=-1, random_state=1234)

ca.fit(X, y)
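One workaround worth trying (not from the original report): `n_jobs=-1` asks joblib to start one worker per core, and loky creates each worker via `os.fork`, so capping the worker count reduces fork-time memory pressure. A minimal sketch, where `capped_n_jobs` is a hypothetical helper, not an econml API:

```python
import os

def capped_n_jobs(max_workers=4):
    """Return a conservative value to pass as n_jobs instead of -1.

    Hypothetical helper: fewer loky workers means fewer os.fork calls,
    each of which must be accounted against the parent's address space.
    """
    return min(os.cpu_count() or 1, max_workers)

print(capped_n_jobs())  # at most 4, fewer on small machines
```

It could then be passed as, e.g., `CausalAnalysis(..., n_jobs=capped_n_jobs(), ...)`.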


As a temporary solution, after some research, I applied a band-aid fix with the following Linux commands, but it did not solve the issue:

swapon --show
sudo swapoff /swapfile
sudo fallocate -l 5G /swapfile
ls -lh /swapfile



Initially I had `1G` of swap allocated. After the commands above the swap file is `5G`, but that did not solve the issue.

My question is: if the fit function fails on a small dataset of 9000 rows while plenty of memory is still unallocated (I have 64 GB of RAM, about 20 GB unused), has a similar issue been reported?
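A note not in the original thread: `[Errno 12]` from `os.fork` is usually governed by the kernel's commit accounting rather than by free RAM, which would explain why 20 GB unused plus extra swap does not help. On Linux (as in the reported Ubuntu setup) the relevant counters can be read from `/proc/meminfo`; a small diagnostic sketch:

```python
def commit_stats(path="/proc/meminfo"):
    """Read CommitLimit and Committed_AS (both in kB) from /proc/meminfo."""
    stats = {}
    with open(path) as f:
        for line in f:
            key, _, rest = line.partition(":")
            if key in ("CommitLimit", "Committed_AS"):
                stats[key] = int(rest.split()[0])  # first field is the kB value
    return stats["CommitLimit"], stats["Committed_AS"]

limit_kb, committed_kb = commit_stats()
print(f"CommitLimit={limit_kb} kB, Committed_AS={committed_kb} kB")
```

If `Committed_AS` is close to `CommitLimit`, the kernel can refuse a fork even though plenty of physical RAM is idle.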

Some references:
https://stackoverflow.com/questions/5306075/python-memory-allocation-error-using-subprocess-popen
https://stackoverflow.com/questions/20111242/how-to-avoid-errno-12-cannot-allocate-memory-errors-caused-by-using-subprocess
krishpn commented 1 year ago

Full stack trace

File "/home/sambashare/newertext/thesis/docs/code/000_pyGraph/gb_rf_xgb.py", line 123, in <module>
    xgn.ift_xgb_hpSearch(df, year, model_path=model_path, shap_plot_path=shap_plot_path,
  File "/home/sambashare/newertext/thesis/docs/code/000_pyGraph/xgb_nov_tg.py", line 986, in ift_xgb_hpSearch
    causalAnl(shap_values=contributions, train=train, test=test, title= rfhpSearch+'_Best_Model',
  File "/home/sambashare/newertext/thesis/docs/code/000_pyGraph/xgb_nov_tg.py", line 214, in causalAnl
    ca.fit(X, y)
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/econml/solutions/causal_analysis/_causal_analysis.py", line 867, in fit
    joblib.Parallel(
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/parallel.py", line 1043, in __call__
    if self.dispatch_one_batch(iterator):
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/parallel.py", line 861, in dispatch_one_batch
    self._dispatch(tasks)
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/parallel.py", line 779, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/_parallel_backends.py", line 531, in apply_async
    future = self._workers.submit(SafeFunction(func))
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/reusable_executor.py", line 177, in submit
    return super(_ReusablePoolExecutor, self).submit(
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/process_executor.py", line 1135, in submit
    self._ensure_executor_running()
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/process_executor.py", line 1109, in _ensure_executor_running
    self._adjust_process_count()
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/process_executor.py", line 1100, in _adjust_process_count
    p.start()
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/backend/process.py", line 39, in _Popen
    return Popen(process_obj)
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/backend/popen_loky_posix.py", line 52, in __init__
    self._launch(process_obj)
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/backend/popen_loky_posix.py", line 157, in _launch
    pid = fork_exec(cmd_python, self._fds, env=process_obj.env)
  File "/home/miniconda3/envs/tf-gpu-mem-day/lib/python3.10/site-packages/joblib/externals/loky/backend/fork_exec.py", line 43, in fork_exec
    pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
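Context on the final frame (not from the original post): the failure is in loky's `fork_exec`, where `os.fork()` duplicates the parent process before exec'ing a fresh interpreter. Under a strict overcommit policy the kernel may refuse that fork even when free RAM exists. A hedged check of the policy on Linux:

```python
def overcommit_policy(path="/proc/sys/vm/overcommit_memory"):
    """Return the Linux overcommit mode: 0=heuristic, 1=always, 2=never."""
    with open(path) as f:
        return int(f.read().strip())

mode = overcommit_policy()
print({0: "heuristic", 1: "always overcommit", 2: "never overcommit"}[mode])
```

Mode 2 ("never overcommit") is a common cause of fork-time ENOMEM in large parent processes.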

causalAnl uses the following CausalAnalysis configuration:

CausalAnalysis(feature_inds=top_features, categorical=categorical,
heterogeneity_inds=None,
classification=True,
nuisance_models="automl",
heterogeneity_model="forest",
n_jobs=-1,
random_state=1234)
kbattocchi commented 1 year ago

Thanks, we'll take a look. How many entries are in your top_features?

krishpn commented 1 year ago

I have tested with 5, 10, 15 and 10.

Best, Krishna
