I am using the Pylift module on an AWS EC2 Linux instance with the code below, and I am getting two different errors.
`from pylift import TransformedOutcome
from xgboost import XGBClassifier
up = TransformedOutcome(df_fil, col_treatment='Treatment', col_outcome='Outcome', col_policy='prop_scores', stratify=df_fil['Treatment'], sklearn_model=XGBClassifier)
param_grid = {#'estimator': XGBClassifier(),
'param_grid': {'max_depth': range(1,8,1),
'learning_rate':[x/100 for x in range(1,12,4)],
'colsample_bytree':[x/10 for x in range(3,10,1)],
'min_child_weight':range(1,6,1),
'scale_pos_weight':[x/10 for x in range(12,18,1)],
},
'n_jobs': -1}
up.grid_search(**param_grid, cv=2)`
Running the code above produces the following error:
`Fitting 2 folds for each of 7 candidates, totalling 14 fits
[Parallel(n_jobs=-1)]: Using backend LokyBackend with 16 concurrent workers.
[Parallel(n_jobs=-1)]: Done 3 out of 14 | elapsed: 1.0min remaining: 3.7min
[Parallel(n_jobs=-1)]: Done 8 out of 14 | elapsed: 1.0min remaining: 45.1s
TerminatedWorkerError Traceback (most recent call last)
in
----> 1 up.grid_search(**param_grid,cv=2)
~/anaconda3/lib/python3.7/site-packages/pylift/methods/base.py in grid_search(self, **kwargs)
337 self.grid_search_params.update(kwargs)
338 self.grid_search_ = GridSearchCV(**self.grid_search_params)
--> 339 self.grid_search_.fit(self.x_train, self.transformed_y_train)
340 return self.grid_search_
341
~/anaconda3/lib/python3.7/site-packages/sklearn/model_selection/_search.py in fit(self, X, y, groups, **fit_params)
685 return results
686
--> 687 self._run_search(evaluate_candidates)
688
689 # For multi-metric evaluation, store the best_index_, best_params_ and
~/anaconda3/lib/python3.7/site-packages/sklearn/model_selection/_search.py in _run_search(self, evaluate_candidates)
1146 def _run_search(self, evaluate_candidates):
1147 """Search all candidates in param_grid"""
-> 1148 evaluate_candidates(ParameterGrid(self.param_grid))
1149
1150
~/anaconda3/lib/python3.7/site-packages/sklearn/model_selection/_search.py in evaluate_candidates(candidate_params)
664 for parameters, (train, test)
665 in product(candidate_params,
--> 666 cv.split(X, y, groups)))
667
668 if len(out) < 1:
~/anaconda3/lib/python3.7/site-packages/joblib/parallel.py in __call__(self, iterable)
932
933 with self._backend.retrieval_context():
--> 934 self.retrieve()
935 # Make sure that we get a last message telling us we are done
936 elapsed_time = time.time() - self._start_time
~/anaconda3/lib/python3.7/site-packages/joblib/parallel.py in retrieve(self)
831 try:
832 if getattr(self._backend, 'supports_timeout', False):
--> 833 self._output.extend(job.get(timeout=self.timeout))
834 else:
835 self._output.extend(job.get())
~/anaconda3/lib/python3.7/site-packages/joblib/_parallel_backends.py in wrap_future_result(future, timeout)
519 AsyncResults.get from multiprocessing."""
520 try:
--> 521 return future.result(timeout=timeout)
522 except LokyTimeoutError:
523 raise TimeoutError()
~/anaconda3/lib/python3.7/concurrent/futures/_base.py in result(self, timeout)
430 raise CancelledError()
431 elif self._state == FINISHED:
--> 432 return self.__get_result()
433 else:
434 raise TimeoutError()
~/anaconda3/lib/python3.7/concurrent/futures/_base.py in __get_result(self)
382 def __get_result(self):
383 if self._exception:
--> 384 raise self._exception
385 else:
386 return self._result
TerminatedWorkerError: A worker process managed by the executor was unexpectedly terminated. This could be caused by a segmentation fault while calling the function or by an excessive memory usage causing the Operating System to kill the worker. The exit codes of the workers are {SIGABRT(-6)}`
When I remove `n_jobs=-1` from `param_grid`, i.e. with the code below:
`param_grid = {#'estimator': XGBClassifier(),
'param_grid': {'max_depth': range(1,8,1),
'learning_rate':[x/100 for x in range(1,12,4)],
'colsample_bytree':[x/10 for x in range(3,10,1)],
'min_child_weight':range(1,6,1),
'scale_pos_weight':[x/10 for x in range(12,18,1)],
}}`
`up.grid_search(**param_grid,cv=2)`
I get the following error instead:
`terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc`
I am running the code above in a Jupyter notebook. As far as I can tell it can't be a memory error, because the instance has ample memory (64 GB, with 8 cores). I am using Python 3.7 from the Anaconda distribution.
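To double-check the memory assumption, a minimal sketch like the one below could be run in the same notebook after a failed search. It uses only the standard-library `resource` module and assumes Linux, where `ru_maxrss` is reported in kilobytes. The helper names `peak_rss_mib` and `per_worker_budget_gb` are hypothetical and not part of Pylift, and the even split of RAM across workers is only a rough approximation.

```python
import resource


def peak_rss_mib(who=resource.RUSAGE_SELF):
    """Return the peak resident set size in MiB (assumes Linux, where ru_maxrss is in KiB)."""
    return resource.getrusage(who).ru_maxrss / 1024


def per_worker_budget_gb(total_gb, n_workers):
    """Rough memory available to each worker if RAM were split evenly (an approximation)."""
    return total_gb / n_workers


# Peak memory of the notebook kernel itself.
print(f"kernel peak RSS:   {peak_rss_mib():.0f} MiB")

# Peak memory of the largest child process that has already exited and been waited for.
print(f"children peak RSS: {peak_rss_mib(resource.RUSAGE_CHILDREN):.0f} MiB")

# 64 GB shared by the 16 concurrent workers reported in the log above.
print(f"budget per worker: {per_worker_budget_gb(64, 16):.1f} GB")
```

With 16 workers each potentially holding its own copy of the training data, the per-worker share is far smaller than the 64 GB total, so comparing it against the kernel's peak usage gives a first indication of whether memory pressure is plausible.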