Open pedrogarciafreitas opened 3 years ago
Hmm, it is strange that such a small dataset caused a memory leak, and it is also unusual that autosklearn needed 18 GB. Could you please provide more information to reproduce this issue?
Hi,
Using the following dataset: reference_APSIPA.zip
The following script:
from tpot import TPOTRegressor
import pandas as pd

# Load the dataset straight from the zip archive.
df = pd.read_csv('reference_APSIPA.zip')

# Feature columns are the ones whose names start with 'd_'; SCORE is the target.
features = [col for col in df.columns if col.startswith('d_')]
X_train, y_train = df[features].values, df['SCORE'].values

tpot = TPOTRegressor(generations=500, population_size=50,
                     verbosity=2, n_jobs=16, use_dask=True)
tpot.fit(X_train, y_train)
The run is killed after 2350 of 25050 pipeline evaluations:
$ python tpot_from_dataset.py
Generation 1 - Current best internal CV score: -0.4186788503982938
Generation 2 - Current best internal CV score: -0.4186788503982938
Generation 3 - Current best internal CV score: -0.4186788503982938
Generation 4 - Current best internal CV score: -0.4186788503982938
Generation 5 - Current best internal CV score: -0.4186788503982938
Generation 6 - Current best internal CV score: -0.40041785046702455
Generation 7 - Current best internal CV score: -0.36806708236239455
Generation 8 - Current best internal CV score: -0.360919010590871
Generation 9 - Current best internal CV score: -0.360919010590871
Generation 10 - Current best internal CV score: -0.360919010590871
Generation 11 - Current best internal CV score: -0.360919010590871
Generation 12 - Current best internal CV score: -0.360919010590871
Generation 13 - Current best internal CV score: -0.360919010590871
Generation 14 - Current best internal CV score: -0.360919010590871
Generation 15 - Current best internal CV score: -0.360919010590871
Generation 16 - Current best internal CV score: -0.360919010590871
Generation 17 - Current best internal CV score: -0.360919010590871
Generation 18 - Current best internal CV score: -0.360919010590871
Generation 19 - Current best internal CV score: -0.35142348326471307
Generation 20 - Current best internal CV score: -0.34095032052401014
Generation 21 - Current best internal CV score: -0.34095032052401014
Generation 22 - Current best internal CV score: -0.34095032052401014
Generation 23 - Current best internal CV score: -0.34095032052401014
Generation 24 - Current best internal CV score: -0.34004340266687294
Generation 25 - Current best internal CV score: -0.3299944198542487
Generation 26 - Current best internal CV score: -0.3299944198542487
Generation 27 - Current best internal CV score: -0.3299944198542487
Generation 28 - Current best internal CV score: -0.3299944198542487
Generation 29 - Current best internal CV score: -0.3299944198542487
Generation 30 - Current best internal CV score: -0.3299944198542487
Generation 31 - Current best internal CV score: -0.32862536127972203
Generation 32 - Current best internal CV score: -0.32862536127972203
Generation 33 - Current best internal CV score: -0.32862536127972203
Generation 34 - Current best internal CV score: -0.32862536127972203
Generation 35 - Current best internal CV score: -0.3242168683350396
Generation 36 - Current best internal CV score: -0.3242168683350396
Generation 37 - Current best internal CV score: -0.3242168683350396
Generation 38 - Current best internal CV score: -0.3242168683350396
Generation 39 - Current best internal CV score: -0.3242168683350396
Generation 40 - Current best internal CV score: -0.3242168683350396
Generation 41 - Current best internal CV score: -0.3242168683350396
Generation 42 - Current best internal CV score: -0.3242168683350396
Generation 43 - Current best internal CV score: -0.3242168683350396
Generation 44 - Current best internal CV score: -0.3242168683350396
Generation 45 - Current best internal CV score: -0.3242168683350396
Generation 46 - Current best internal CV score: -0.3242168683350396
Optimization Progress: 9%|█████████▋ | 2350/25050 [5:28:59<27:53:07, 4.42s/pipeline]
Killed
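For anyone trying to reproduce this, here is a minimal sketch (not something that was run in this thread) for logging the peak resident memory of the Python process, to see whether it climbs steadily before the kill. The helper names `peak_rss_mib` and `log_memory` are hypothetical. It uses only the standard-library `resource` module, so it works on Unix-like systems only, and it measures the calling process alone: memory held by worker processes started through `n_jobs` or Dask is not included.

```python
import resource
import sys


def peak_rss_mib(who=resource.RUSAGE_SELF):
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(who).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_memory(label):
    """Print and return one line with the current peak RSS."""
    line = f"[{label}] peak RSS: {peak_rss_mib():.1f} MiB"
    print(line)
    return line


log_memory("startup")
```

One possible way to use it, again as an assumption rather than a tested recipe, is to fit a few generations at a time with `warm_start=True` and call `log_memory` between the calls to `fit`.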
Thanks @pedrogarciafreitas , I'm going to try to replicate the issue.
Would you mind providing your OS and OS version please?
Hi @JDRomano2
I'm using Ubuntu 20.04.1 LTS (kernel 5.4.45-050445-generic #202006070831) with conda 4.9.2, NumPy 1.19.2, SciPy 1.5.2, pandas 1.2.0, XGBoost 1.3.0, joblib 1.0.0, TPOT 0.11.7, and scikit-learn 0.23.2.
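Since several people in this thread report the problem on different setups, a small sketch for collecting the same details automatically may help when comparing environments. The function name `environment_report` is hypothetical, and the distribution names in the example list are assumptions about how the packages are registered. It relies on `importlib.metadata`, which needs Python 3.8 or later, so it would not run on the Python 3.7.9 mentioned elsewhere in this thread.

```python
import platform
import sys
from importlib import metadata


def environment_report(packages):
    """Collect OS, kernel, Python and package versions into a dict."""
    report = {
        "os": platform.platform(),
        "kernel": platform.release(),
        "python": sys.version.split()[0],
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record missing packages instead of failing the whole report.
            report[name] = "not installed"
    return report


# Distribution names below are assumptions; adjust them to your environment.
packages = ["numpy", "scipy", "pandas", "xgboost", "joblib", "tpot", "scikit-learn"]
for key, value in environment_report(packages).items():
    print(f"{key}: {value}")
```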
I can't replicate the issue on macOS 11, so I suspect it's an OS- or kernel-related memory leak. I'll see whether I encounter the same bug on Ubuntu 20.04, which you are using, and report back.
Hi @JDRomano2, any updates here? I am having the same issue on Ubuntu 20.04.
I have the same problem, and I am on Ubuntu 20.04.3 LTS too. The job was killed after roughly 10 hours of running; I was working as root.
Optimization Progress: 11%|████████████▉ | 336/3000 [9:19:43<94:12:09, 127.30s/pipeline] Killed
Greetz Bjoern
Hi,
I tried to run the following code
but after a few generations the script was killed by the system (literally, "Killed" is the only message shown in the console).
By running dmesg, I got
The dataset is small (232 rows and 108 columns). A similar program in AutoSklearn consumes about 18 GB of RAM. Is it possible that there is a memory leak?
I'm using Python 3.7.9 and TPOT 0.11.7. I also tried n_jobs=1 and use_dask=False, but the error remains the same.
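To narrow down whether memory is being retained on the Python side, one option is to compare two `tracemalloc` snapshots taken before and after a piece of work. The sketch below is only an illustration under stated assumptions: `measure_growth` is a hypothetical helper, and the lambda is a stand-in workload rather than a TPOT run. `tracemalloc` only sees allocations made through Python's allocator in the current process, so growth inside native libraries or in worker processes would not show up here.

```python
import tracemalloc


def measure_growth(workload, limit=5):
    """Run workload() and return the source lines whose memory grew most."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        retained = workload()  # keep the result alive until the second snapshot
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    del retained
    # compare_to returns per-line differences; keep only lines that grew.
    stats = after.compare_to(before, "lineno")
    return [stat for stat in stats if stat.size_diff > 0][:limit]


# Stand-in workload: roughly 1 MB of buffers that stay referenced.
growth = measure_growth(lambda: [bytearray(1024) for _ in range(1000)])
for stat in growth:
    print(stat)
```

If nothing on the Python side grows while the resident memory of the process still climbs, that would point toward native code or child processes rather than Python objects.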