Thanks for catching this. I just posted a fix for this here: be0ae4455e34e1e600ec424596baa86a6a19a07b
The issue was indeed that the XGBoost models ignore the CUDA availability check and use the GPU by default. We'll push the fix to PyPI soon, but in the meantime you can run:
pip install --upgrade "olorenchemengine[full] @ git+https://github.com/Oloren-AI/olorenchemengine.git"
to upgrade to the latest version from master.
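For context, the guard amounts to checking CUDA availability before choosing the XGBoost tree method. A minimal sketch of that kind of check, assuming a torch-based CUDA test (the actual change in the commit above may differ):

```python
import torch
import xgboost as xgb

# Only request the GPU histogram algorithm when a CUDA device is actually
# available; otherwise fall back to the CPU "hist" method so CPU-only
# xgboost builds (e.g. the default macOS wheels) still work.
tree_method = "gpu_hist" if torch.cuda.is_available() else "hist"
model = xgb.XGBRegressor(tree_method=tree_method)
```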
We are implementing a better CI system that will test OCE on many different environments and platforms very soon, which should help catch these issues earlier going forward.
Thanks, that worked. Note that I did have to run "pip uninstall xgboost" followed by "pip install xgboost" to get xgboost working.
Thanks, will add that note to the installation guide
This looks like another case where the code is looking for a GPU.
Lots of output like this:
[CV] END colsample_bytree=1.0, learning_rate=0.01, max_depth=7, min_child_weight=3, n_estimators=200, reg_alpha=20, reg_lambda=0, subsample=0.7; total time= 0.0s
ValueError                                Traceback (most recent call last)
Cell In [3], line 5
      1 # This will now use our model manager to test our top models
      2 # and will take around 1-4 hours to run in total for this dataset,
      3 # though it will take more or less time depending on the machine.
----> 5 mm.run(models)

File ~/opt/anaconda3/envs/oce/lib/python3.8/site-packages/olorenchemengine/manager.py:104, in BaseModelManager.run(self, models, return_models)
    102 print("Fitting model")
    103 start = timer()
--> 104 model.fit(*self.dataset.train_dataset)
    105 end = timer()
    106 print("Evaluating model")

File ~/opt/anaconda3/envs/oce/lib/python3.8/site-packages/olorenchemengine/base_class.py:466, in BaseModel.fit(self, X_train, y_train, valid, error_model)
    463 if not valid is None:
    464     X_valid = self.preprocess(X_valid, y_valid, fit=False)
--> 466 self._fit(X_train, y_train)
    468 # Calibrate model
    469 if not valid is None and self.setting == "classification":

File ~/opt/anaconda3/envs/oce/lib/python3.8/site-packages/olorenchemengine/base_class.py:995, in BaseSKLearnModel._fit(self, X_train, y_train)
    993 else:
    994     self.setting = "regression"
--> 995 self.regression_model.fit(X_train, y_train)
    996 self.model = self.regression_model

File ~/opt/anaconda3/envs/oce/lib/python3.8/site-packages/olorenchemengine/basics.py:49, in RandomizedSearchCVModel.fit(self, *args, **kwargs)
     48 def fit(self, *args, **kwargs):
---> 49     super().fit(*args, **kwargs)
     50     self.obj = self.best_estimator_

File ~/opt/anaconda3/envs/oce/lib/python3.8/site-packages/sklearn/model_selection/_search.py:875, in BaseSearchCV.fit(self, X, y, groups, **fit_params)
    869 results = self._format_results(
    870     all_candidate_params, n_splits, all_out, all_more_results
    871 )
    873 return results
--> 875 self._run_search(evaluate_candidates)
    877 # multimetric is determined here because in the case of a callable
    878 # self.scoring the return type is only known after calling
    879 first_test_score = all_out[0]["test_scores"]

File ~/opt/anaconda3/envs/oce/lib/python3.8/site-packages/sklearn/model_selection/_search.py:1753, in RandomizedSearchCV._run_search(self, evaluate_candidates)
   1751 def _run_search(self, evaluate_candidates):
   1752     """Search n_iter candidates from param_distributions"""
-> 1753     evaluate_candidates(
   1754         ParameterSampler(
   1755             self.param_distributions, self.n_iter, random_state=self.random_state
   1756         )
   1757     )

File ~/opt/anaconda3/envs/oce/lib/python3.8/site-packages/sklearn/model_selection/_search.py:852, in BaseSearchCV.fit.<locals>.evaluate_candidates(candidate_params, cv, more_results)
    845 elif len(out) != n_candidates * n_splits:
    846     raise ValueError(
    847         "cv.split and cv.get_n_splits returned "
    848         "inconsistent results. Expected {} "
    849         "splits, got {}".format(n_splits, len(out) // n_candidates)
    850     )
--> 852 _warn_or_raise_about_fit_failures(out, self.error_score)
    854 # For callable self.scoring, the return type is only know after
    855 # calling. If the return type is a dictionary, the error scores
    856 # can now be inserted with the correct key. The type checking
    857 # of out will be done in `_insert_error_scores`.
    858 if callable(self.scoring):

File ~/opt/anaconda3/envs/oce/lib/python3.8/site-packages/sklearn/model_selection/_validation.py:367, in _warn_or_raise_about_fit_failures(results, error_score)
    360 if num_failed_fits == num_fits:
    361     all_fits_failed_message = (
    362         f"\nAll the {num_fits} fits failed.\n"
    363         "It is very likely that your model is misconfigured.\n"
    364         "You can try to debug the error by setting error_score='raise'.\n\n"
    365         f"Below are more details about the failures:\n{fit_errors_summary}"
    366     )
--> 367     raise ValueError(all_fits_failed_message)
    369 else:
    370     some_fits_failed_message = (
    371         f"\n{num_failed_fits} fits failed out of a total of {num_fits}.\n"
    372         "The score on these train-test partitions for these parameters"
        (...)
    376         f"Below are more details about the failures:\n{fit_errors_summary}"
    377     )
ValueError: All the 500 fits failed. It is very likely that your model is misconfigured. You can try to debug the error by setting error_score='raise'.
Below are more details about the failures:
72 fits failed with the following error:
Traceback (most recent call last):
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/sklearn/model_selection/_validation.py", line 686, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/core.py", line 575, in inner_f
    return f(**kwargs)
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/sklearn.py", line 961, in fit
    self._Booster = train(
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/core.py", line 575, in inner_f
    return f(**kwargs)
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/training.py", line 181, in train
    bst.update(dtrain, i, obj)
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/core.py", line 1778, in update
    _check_call(_LIB.XGBoosterUpdateOneIter(self.handle,
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/core.py", line 246, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: [12:55:15] /Users/runner/work/xgboost/xgboost/python-package/build/temp.macosx-10.9-x86_64-cpython-37/xgboost/src/gbm/../common/common.h:239: XGBoost version not compiled with GPU support.
Stack trace:
  [bt] (0) 1   libxgboost.dylib   0x0000000153136705 dmlc::LogMessageFatal::~LogMessageFatal() + 117
  [bt] (1) 2   libxgboost.dylib   0x00000001531eb550 xgboost::gbm::GBTree::ConfigureUpdaters() + 512
  [bt] (2) 3   libxgboost.dylib   0x00000001531eb0e7 xgboost::gbm::GBTree::Configure(std::__1::vector<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > const&) + 1143
  [bt] (3) 4   libxgboost.dylib   0x000000015320a1e9 xgboost::LearnerConfiguration::Configure() + 1177
  [bt] (4) 5   libxgboost.dylib   0x000000015320a4d9 xgboost::LearnerImpl::UpdateOneIter(int, std::__1::shared_ptr<xgboost::DMatrix>) + 105
  [bt] (5) 6   libxgboost.dylib   0x000000015313a54a XGBoosterUpdateOneIter + 138
  [bt] (6) 7   libffi.7.dylib     0x00000001093e4ead ffi_call_unix64 + 85
  [bt] (7) 8   ???                0x0000000301b70220 0x0 + 12913672736
236 fits failed with the following error:
Traceback (most recent call last):
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/sklearn/model_selection/_validation.py", line 686, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/core.py", line 575, in inner_f
    return f(**kwargs)
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/sklearn.py", line 961, in fit
    self._Booster = train(
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/core.py", line 575, in inner_f
    return f(**kwargs)
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/training.py", line 181, in train
    bst.update(dtrain, i, obj)
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/core.py", line 1778, in update
    _check_call(_LIB.XGBoosterUpdateOneIter(self.handle,
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/core.py", line 246, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: [12:55:16] /Users/runner/work/xgboost/xgboost/python-package/build/temp.macosx-10.9-x86_64-cpython-37/xgboost/src/gbm/../common/common.h:239: XGBoost version not compiled with GPU support.
Stack trace:
  [bt] (0) 1   libxgboost.dylib   0x0000000153136705 dmlc::LogMessageFatal::~LogMessageFatal() + 117
  [bt] (1) 2   libxgboost.dylib   0x00000001531eb550 xgboost::gbm::GBTree::ConfigureUpdaters() + 512
  [bt] (2) 3   libxgboost.dylib   0x00000001531eb0e7 xgboost::gbm::GBTree::Configure(std::__1::vector<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > const&) + 1143
  [bt] (3) 4   libxgboost.dylib   0x000000015320a1e9 xgboost::LearnerConfiguration::Configure() + 1177
  [bt] (4) 5   libxgboost.dylib   0x000000015320a4d9 xgboost::LearnerImpl::UpdateOneIter(int, std::__1::shared_ptr<xgboost::DMatrix>) + 105
  [bt] (5) 6   libxgboost.dylib   0x000000015313a54a XGBoosterUpdateOneIter + 138
  [bt] (6) 7   libffi.7.dylib     0x00000001093e4ead ffi_call_unix64 + 85
  [bt] (7) 8   ???                0x0000000301b70220 0x0 + 12913672736

192 fits failed with the following error:
Traceback (most recent call last):
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/sklearn/model_selection/_validation.py", line 686, in _fit_and_score
    estimator.fit(X_train, y_train, **fit_params)
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/core.py", line 575, in inner_f
    return f(**kwargs)
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/sklearn.py", line 961, in fit
    self._Booster = train(
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/core.py", line 575, in inner_f
    return f(**kwargs)
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/training.py", line 181, in train
    bst.update(dtrain, i, obj)
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/core.py", line 1778, in update
    _check_call(_LIB.XGBoosterUpdateOneIter(self.handle,
  File "/Users/pwalters/opt/anaconda3/envs/oce/lib/python3.8/site-packages/xgboost/core.py", line 246, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: [12:55:17] /Users/runner/work/xgboost/xgboost/python-package/build/temp.macosx-10.9-x86_64-cpython-37/xgboost/src/gbm/../common/common.h:239: XGBoost version not compiled with GPU support.
Stack trace:
  [bt] (0) 1   libxgboost.dylib   0x0000000153136705 dmlc::LogMessageFatal::~LogMessageFatal() + 117
  [bt] (1) 2   libxgboost.dylib   0x00000001531eb550 xgboost::gbm::GBTree::ConfigureUpdaters() + 512
  [bt] (2) 3   libxgboost.dylib   0x00000001531eb0e7 xgboost::gbm::GBTree::Configure(std::__1::vector<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, std::__1::allocator<std::__1::pair<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > > const&) + 1143
  [bt] (3) 4   libxgboost.dylib   0x000000015320a1e9 xgboost::LearnerConfiguration::Configure() + 1177
  [bt] (4) 5   libxgboost.dylib   0x000000015320a4d9 xgboost::LearnerImpl::UpdateOneIter(int, std::__1::shared_ptr<xgboost::DMatrix>) + 105
  [bt] (5) 6   libxgboost.dylib   0x000000015313a54a XGBoosterUpdateOneIter + 138
  [bt] (6) 7   libffi.7.dylib     0x00000001093e4ead ffi_call_unix64 + 85
  [bt] (7) 8   ???                0x0000000301b70220 0x0 + 12913672736
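For what it's worth, the underlying failure can be reproduced outside of OCE with plain xgboost. A minimal sketch with synthetic data, assuming an xgboost 1.x build without GPU support (as in the traceback above); tree_method is the only setting that matters here:

```python
import numpy as np
import xgboost as xgb

X = np.random.rand(100, 5)
y = np.random.rand(100)

# Requesting the GPU algorithm on a CPU-only build raises the same
# "XGBoost version not compiled with GPU support" error seen above.
try:
    xgb.XGBRegressor(tree_method="gpu_hist", n_estimators=10).fit(X, y)
except xgb.core.XGBoostError as err:
    print(err)

# The CPU histogram method trains fine on the same data.
xgb.XGBRegressor(tree_method="hist", n_estimators=10).fit(X, y)
```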
Get the list of models and sort by their RMSE performance metrics:
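A minimal sketch of that step, assuming the model manager exposes its results as a pandas DataFrame; the get_model_database accessor and the metric column name below are assumptions and may be named differently in the installed OCE version:

```python
# mm is the BaseModelManager from the notebook above.
results = mm.get_model_database()  # assumed accessor; check the OCE docs
# Sort ascending so the lowest-RMSE model comes first.
best_first = results.sort_values("Root Mean Squared Error", ascending=True)
print(best_first.head())
```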
We see that the best model now outperforms the published models.