mljar / mljar-supervised

Python package for AutoML on Tabular Data with Feature Engineering, Hyper-Parameters Tuning, Explanations and Automatic Documentation
https://mljar.com
MIT License

Can not load saved model #395

Closed xuzhang5788 closed 3 years ago

xuzhang5788 commented 3 years ago

When I reloaded my model to do prediction, I got the following error:


KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/mljar/lib/python3.9/site-packages/supervised/base_automl.py in load(self, path)
    184                 ):
--> 185                     ens = Ensemble.load(path, model_subpath, models_map)
    186                     self._models += [ens]

~/miniconda3/envs/mljar/lib/python3.9/site-packages/supervised/ensemble.py in load(results_path, model_subpath, models_map)
    436             ensemble.selected_models += [
--> 437                 {"model": models_map[m["model"]], "repeat": m["repeat"]}
    438             ]

KeyError: '15_LightGBM'

During handling of the above exception, another exception occurred:

AutoMLException                           Traceback (most recent call last)
in
----> 1 automl.predict(X_test)

~/miniconda3/envs/mljar/lib/python3.9/site-packages/supervised/automl.py in predict(self, X)
    346             AutoMLException: Model has not yet been fitted.
    347         """
--> 348         return self._predict(X)
    349 
    350     def predict_proba(self, X):

~/miniconda3/envs/mljar/lib/python3.9/site-packages/supervised/base_automl.py in _predict(self, X)
   1298     def _predict(self, X):
   1299 
-> 1300         predictions = self._base_predict(X)
   1301         # Return predictions
   1302         # If classification task the result is in column 'label'

~/miniconda3/envs/mljar/lib/python3.9/site-packages/supervised/base_automl.py in _base_predict(self, X, model)
   1230         if model is None:
   1231             if self._best_model is None:
-> 1232                 self.load(self.results_path)
   1233             model = self._best_model
   1234 

~/miniconda3/envs/mljar/lib/python3.9/site-packages/supervised/base_automl.py in load(self, path)
    211 
    212         except Exception as e:
--> 213             raise AutoMLException(f"Cannot load AutoML directory. {str(e)}")
    214 
    215     def get_leaderboard(

AutoMLException: Cannot load AutoML directory. '15_LightGBM'

I refit it, it said:

This model has already been fitted. You can use predict methods or select a new 'results_path' for a new 'fit()'

I used the same method to train 5 models, the other 3 models are okay, two had this error. I used pip install -q -U git+https://github.com/mljar/mljar-supervised.git@dev to reinstall your package. I think there are bugs when you updated LightGBM.

pplonski commented 3 years ago

@xuzhang5788 thanks for reporting. Could you please provide the code and data to reproduce?

xuzhang5788 commented 3 years ago

I don't think you can reproduce this error even if I give you the code and data. I retrained one of the unloadable models and it is okay now. I didn't change anything, just trained it twice. The first run crashed; the second one is okay. I think your package is still unstable.

pplonski commented 3 years ago

@xuzhang5788 what do you mean by unstable? There is a lot of ongoing development in the package. But if you train and then load models with the same package version, it should work.

BTW, have you checked the latest dev branch? Much more plots, for example models correlation heatmap, and there is an option to set custom eval_metric.

xuzhang5788 commented 3 years ago

I know it is at the development stage, so I pay careful attention to the version that I use to train and load. But unfortunately, it still has unpredictable problems.

I haven't tried the new plots yet even though I am using the latest dev branch. I upgraded it because I thought that you have solved the memory overflow problems. Actually, you have fixed it partially. Anyways, thanks for your great efforts. I am looking forward to your new version.

xuzhang5788 commented 3 years ago

Unfortunately, I still got errors even after I retrained my second unloadable model.

2021-05-04 09:39:33,666 supervised.exceptions ERROR Cannot load AutoML directory. '14_LightGBM'

AutoMLException: Cannot load AutoML directory. '14_LightGBM'


KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/mljar/lib/python3.9/site-packages/supervised/base_automl.py in load(self, path)
    184                 ):
--> 185                     ens = Ensemble.load(path, model_subpath, models_map)
    186                     self._models += [ens]

~/miniconda3/envs/mljar/lib/python3.9/site-packages/supervised/ensemble.py in load(results_path, model_subpath, models_map)
    436             ensemble.selected_models += [
--> 437                 {"model": models_map[m["model"]], "repeat": m["repeat"]}
    438             ]

KeyError: '14_LightGBM'

During handling of the above exception, another exception occurred:

AutoMLException                           Traceback (most recent call last)
in
     12 # evaluate averaging ensemble (equal weights)
     13 weights = [1.0/n_members for _ in range(n_members)]
---> 14 yhats, y_test = calculate_yhat(models, data_input, drug_encoding)
     15 yhat = evaluate_ensemble(models, weights, yhats)
     16 do_testing(yhat, y_test)

in calculate_yhat(models, data_input, drug_encoding)
      8 # model.fit(train_X, train_y)
      9 print(drug_encoding)
---> 10 yhat = model.predict(X_test)
     11 yhats.append(yhat)
     12 y_test = y_test

~/miniconda3/envs/mljar/lib/python3.9/site-packages/supervised/automl.py in predict(self, X)
    346             AutoMLException: Model has not yet been fitted.
    347         """
--> 348         return self._predict(X)
    349 
    350     def predict_proba(self, X):

~/miniconda3/envs/mljar/lib/python3.9/site-packages/supervised/base_automl.py in _predict(self, X)
   1298     def _predict(self, X):
   1299 
-> 1300         predictions = self._base_predict(X)
   1301         # Return predictions
   1302         # If classification task the result is in column 'label'

~/miniconda3/envs/mljar/lib/python3.9/site-packages/supervised/base_automl.py in _base_predict(self, X, model)
   1230         if model is None:
   1231             if self._best_model is None:
-> 1232                 self.load(self.results_path)
   1233             model = self._best_model
   1234 

~/miniconda3/envs/mljar/lib/python3.9/site-packages/supervised/base_automl.py in load(self, path)
    211 
    212         except Exception as e:
--> 213             raise AutoMLException(f"Cannot load AutoML directory. {str(e)}")
    214 
    215     def get_leaderboard(

AutoMLException: Cannot load AutoML directory. '14_LightGBM'

This happened only at the latest dev version.

pplonski commented 3 years ago

@xuzhang5788 please send me code and data to reproduce this issue.

My email: [image]

BeZie commented 3 years ago

Dear @pplonski ,

I have a similar exception but cannot share data or more details. From what I understand, my ensemble model is looking for 14_XGBoost, but this model is not included in the load_models list: in params.json, 14_XGBoost is included in the saved list but not in the load_on_predict list, so it is not loaded.

Commenting out

if load_on_predict is not None and self._fit_level == "finished":
    #load_models = load_on_predict

in the base_automl.py file fixes the issue, but it is only a monkey patch.

Hope that helps a little bit to track down the problem... Best regards,
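
The mismatch described in this comment can be checked without patching the library. Below is a minimal sketch, assuming that params.json holds a load_on_predict list (as reported above) and that the ensemble's saved description lists its members under a selected_models key with model entries. The second assumption is inferred from the m["model"] lookup in the tracebacks and is not confirmed in this thread. The helper name missing_ensemble_members and the sample data are hypothetical.

```python
def missing_ensemble_members(params, ensemble_desc):
    """Return ensemble member names that would not be loaded for prediction.

    `params` is the parsed params.json; `ensemble_desc` is the parsed ensemble
    description. Both key layouts are assumptions based on this thread.
    """
    load_on_predict = params.get("load_on_predict")
    if load_on_predict is None:
        # Without the list, every saved model is assumed to be loaded.
        return []
    loaded = set(load_on_predict)
    missing = []
    for member in ensemble_desc.get("selected_models", []):
        name = member["model"]
        # Keep the original order and skip duplicates.
        if name not in loaded and name not in missing:
            missing.append(name)
    return missing


# Illustrative data only, shaped like the case reported above.
params = {
    "saved": ["1_Baseline", "14_XGBoost", "Ensemble"],
    "load_on_predict": ["1_Baseline", "Ensemble"],
}
ensemble_desc = {
    "selected_models": [
        {"model": "1_Baseline", "repeat": 1},
        {"model": "14_XGBoost", "repeat": 2},
    ]
}
print(missing_ensemble_members(params, ensemble_desc))
```

Any name this returns is a model the ensemble needs but the loader would skip, which is the situation that ends in the KeyError.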

oldrichsmejkal commented 3 years ago

Same for me:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/anaconda3/envs/mljar/lib/python3.8/site-packages/supervised/base_automl.py in load(self, path)
    190                 ):
--> 191                     ens = Ensemble.load(path, model_subpath, models_map)
    192                     self._models += [ens]

~/anaconda3/envs/mljar/lib/python3.8/site-packages/supervised/ensemble.py in load(results_path, model_subpath, models_map)
    427             ensemble.selected_models += [
--> 428                 {"model": models_map[m["model"]], "repeat": m["repeat"]}
    429             ]

KeyError: '19_LightGBM'

During handling of the above exception, another exception occurred:

AutoMLException                           Traceback (most recent call last)
<ipython-input-141-cefb71db1be2> in <module>
     69 
     70 
---> 71 combined["y_hat"] = pd.Series(gbm.predict(X.copy()), index = X.index)
     72 
     73 

~/anaconda3/envs/mljar/lib/python3.8/site-packages/supervised/automl.py in predict(self, X)
    354             AutoMLException: Model has not yet been fitted.
    355         """
--> 356         return self._predict(X)
    357 
    358     def predict_proba(self, X):

~/anaconda3/envs/mljar/lib/python3.8/site-packages/supervised/base_automl.py in _predict(self, X)
   1321     def _predict(self, X):
   1322 
-> 1323         predictions = self._base_predict(X)
   1324         # Return predictions
   1325         # If classification task the result is in column 'label'

~/anaconda3/envs/mljar/lib/python3.8/site-packages/supervised/base_automl.py in _base_predict(self, X, model)
   1253         if model is None:
   1254             if self._best_model is None:
-> 1255                 self.load(self.results_path)
   1256             model = self._best_model
   1257 

~/anaconda3/envs/mljar/lib/python3.8/site-packages/supervised/base_automl.py in load(self, path)
    217 
    218         except Exception as e:
--> 219             raise AutoMLException(f"Cannot load AutoML directory. {str(e)}")
    220 
    221     def get_leaderboard(

AutoMLException: Cannot load AutoML directory. '19_LightGBM'
oldrichsmejkal commented 3 years ago

Configuration:

from supervised.automl import AutoML

gbm = AutoML(
    results_path = "AutoML_2",
    mode = 'Compete',
    algorithms=[
        #"CatBoost", 
        "Xgboost", "LightGBM",
        "Neural Network", "Linear"
    ],
    total_time_limit = int(4.5*60*60),
    model_time_limit=2*60*60,
    start_random_models=10,
    hill_climbing_steps=5,
    top_models_to_improve=5,
    golden_features=True,
    features_selection=True,
    stack_models=True,
    train_ensemble=True,
    explain_level=1,
    validation_strategy={
        "validation_type": "kfold",
        "k_folds": 3,
        "shuffle": False,
        "stratify": False,
        "random_seed": 123
    },
    n_jobs=14,
    kmeans_features = True,

    #optuna_time_budget = 1 * 60 * 60
)
pplonski commented 3 years ago

@oldrichsmejkal thank you for reporting.

I have a hard time reproducing this issue. Could you please share the data to reproduce the bug?

Do you have model files saved in 19_LightGBM directory?

Small tip: if you use markdown and would like to add a code block, please use ``` instead of a single ` character (the single character is used for inline code). Additionally, you can add the language name after the opening characters to get syntax highlighting:

```python python code with color syntax ```

chatchan92 commented 3 years ago

A similar issue report here.

2021-05-28 09:33:54,379 supervised.exceptions ERROR Cannot load AutoML directory. '21_LightGBM'
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
E:\ProgramData\Anaconda3\lib\site-packages\supervised\base_automl.py in load(self, path)
    184                 ):
--> 185                     ens = Ensemble.load(path, model_subpath, models_map)
    186                     self._models += [ens]

E:\ProgramData\Anaconda3\lib\site-packages\supervised\ensemble.py in load(results_path, model_subpath, models_map)
    437             ensemble.selected_models += [
--> 438                 {"model": models_map[m["model"]], "repeat": m["repeat"]}
    439             ]

KeyError: '21_LightGBM'

During handling of the above exception, another exception occurred:

AutoMLException                           Traceback (most recent call last)
<ipython-input-3-bbe3a4d10cf3> in <module>
      5 
      6 # compute the accuracy on test data
----> 7 predictions = automl.predict_all(X)
      8 print(predictions.head())
      9 # print("Test accuracy:", accuracy_score(y_test, predictions["label"].astype(int)))

E:\ProgramData\Anaconda3\lib\site-packages\supervised\automl.py in predict_all(self, X)
    380 
    381         """
--> 382         return self._predict_all(X)
    383 
    384     def score(self, X, y=None, sample_weight=None):

E:\ProgramData\Anaconda3\lib\site-packages\supervised\base_automl.py in _predict_all(self, X)
   1302     def _predict_all(self, X):
   1303         # Make and return predictions
-> 1304         return self._base_predict(X)
   1305 
   1306     def _score(self, X, y=None, sample_weight=None):

E:\ProgramData\Anaconda3\lib\site-packages\supervised\base_automl.py in _base_predict(self, X, model)
   1210         if model is None:
   1211             if self._best_model is None:
-> 1212                 self.load(self.results_path)
   1213             model = self._best_model
   1214 

E:\ProgramData\Anaconda3\lib\site-packages\supervised\base_automl.py in load(self, path)
    211 
    212         except Exception as e:
--> 213             raise AutoMLException(f"Cannot load AutoML directory. {str(e)}")
    214 
    215     def get_leaderboard(

AutoMLException: Cannot load AutoML directory. '21_LightGBM'

I trained the model twice and the error still popped up. It seems that this problem happens with LightGBM. Mode=Compete; golden_features=False

pplonski commented 3 years ago

@chatchan92 thank you for reporting. Please send me complete code plus data to reproduce. I would love to fix this issue!

TerryFoster1721 commented 3 years ago

We are trying to train models using this library, and we are encountering the following error when trying to load data:

OSError: SavedModel file does not exist at: F:\School\W3/models/hellp{saved_model.pbtxt|saved_model.pb}

We are unable to share the code as it is for classified work, but looking through the model folders, there are no files with the names it appears to be looking for. Is there a certain way of doing this, or is this a bug in the current version? We are running v0.10.4, I believe (the latest as of a week ago).

Edit: We are running a dataset with the following settings:

Mode: Explain Algorithms: auto

These are the only changes from defaults besides locations and time.

Note: the auto is not true auto, since it was revealed this was bugged in my last issue, it is instead loading the other algorithms in a list in place of auto, and we have not had issues with this prior even under strenuous load.

pplonski commented 3 years ago

@TerryFoster1721 thank you for reporting the issue. The bug is not solved yet. It's my nightmare, I can't reproduce it.

Your error message looks very strange because MLJAR does not use the .pb or .pbtxt file name extensions. Could you paste the exact error message? It should be in the errors.md file. What code are you using to load the AutoML models?

TerryFoster1721 commented 3 years ago

I believe the mention of those file types is simply a result of the way we are calling it, as this program also uses autokeras and I believe the function for loading was called with the assumption that automl behaved similarly to that library.

I will need to review the code for it, as it is not a file I myself wrote, and see if there is a fix outside of that. I believe our issue is related to something else, though I will provide updates if we discover our error is not our own fault. Apologies.

On a somewhat related note, is the automl library supposed to create one model that is considered the 'best' as determined by the evaluation metric? It would be useful to know which file we are supposed to load, or if we need to find some way to track which model is the best for our use. The only files I see in any folder generated by automl that appear to have anything to do with models are the files for the folds, with no clear file that is supposed to actually represent the model. Is there a proper procedure necessary to save these as a loadable file?

pplonski commented 3 years ago

MLJAR AutoML selects the best model according to the evaluation metric. The best model is marked in the leaderboard table, which you can check in the README.md file in the results_path directory. Only the best model is loaded for computing predictions. If the best model is an Ensemble, then several models will be loaded (the models that make up the ensemble).

MLJAR doesn't use the pb and pbtxt file extensions. They must be from autokeras. MLJAR AutoML can't read autokeras files.

To load models for computing predictions, just create the AutoML object with results_path pointing to the directory with the AutoML results (models). Then you can compute predictions.
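
In code, loading for prediction might look like the sketch below. It assumes mljar-supervised is installed and that results_path points at a directory produced by an earlier fit(). The wrapper name load_and_predict is hypothetical, and the import is deferred so that the helper can be defined without the package present.

```python
def load_and_predict(results_path, X):
    """Create an AutoML object over existing results and compute predictions."""
    # Deferred import: requires the mljar-supervised package at call time.
    from supervised.automl import AutoML

    automl = AutoML(results_path=results_path)
    # Per the tracebacks in this thread, the saved models are loaded from
    # results_path inside predict() when no best model is in memory yet.
    return automl.predict(X)
```

For example, load_and_predict("AutoML_2", X_test) would reuse the results directory from the configuration posted earlier in this thread, with X_test holding the same columns as the training data.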

pplonski commented 3 years ago

@oldrichsmejkal @chatchan92 @xuzhang5788 @BeZie @TerryFoster1721 - may I ask you for data and code to reproduce this issue? I would love to fix it.

Please provide data in CSV format. Please provide code examples without external libraries (just the numpy, pandas, and mljar-supervised packages, so I'm not forced to install 3rd party packages). If I am able to reproduce this bug on my machine, I will fix it. If you don't want to or can't share data and code in public, please send them by e-mail to piotr - at - mljar.com

Thank you!!!

bridgeso commented 3 years ago

Hi Piotr, I'm encountering the same issue with 3+ of my trained models. Thanks


pplonski commented 3 years ago

@bridgeso please provide code and data so I can reproduce it locally. Then I will fix it asap.

pplonski commented 3 years ago

@bridgeso I've found the bug! Thank you for your help (in private emails).

The problem is that not all models are loaded. To speed up the process of loading models from the hard drive, I load only the necessary models, and it looks like I missed some.
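
The failure mode can be reproduced in isolation with plain dictionaries. This is a simplified sketch, not the library's code: models_map and the m["model"] lookup mirror the names in the tracebacks above, while build_models_map, load_ensemble, and the sample model names are made up for illustration.

```python
def build_models_map(names_to_load):
    # Stand-in for loading each model from disk; values are placeholders.
    return {name: f"<model {name}>" for name in names_to_load}


def load_ensemble(selected, models_map):
    # Same lookup pattern as in the traceback: a missing key raises KeyError.
    return [
        {"model": models_map[m["model"]], "repeat": m["repeat"]} for m in selected
    ]


saved = ["1_Linear", "15_LightGBM", "Ensemble"]
load_on_predict = ["1_Linear", "Ensemble"]  # 15_LightGBM was left out
selected = [
    {"model": "1_Linear", "repeat": 1},
    {"model": "15_LightGBM", "repeat": 3},
]

try:
    load_ensemble(selected, build_models_map(load_on_predict))
    error = None
except KeyError as exc:
    error = str(exc)

print(error)

# Loading every saved model avoids the error, at the cost of a slower load.
ensemble = load_ensemble(selected, build_models_map(saved))
```

The string form of the KeyError is just the quoted model name, which is why the reported message reads "Cannot load AutoML directory. '15_LightGBM'" with no further detail.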

I will fix the bug so that AutoML will be able to use past results and compute predictions.

There is also an option to fix the problem manually. You need to add the model name from the error message to the "load_on_predict" list in the params.json file, and then the AutoML models should load.
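
The manual fix could be scripted along these lines. This is a minimal sketch, assuming params.json sits directly in the results directory and already contains a "load_on_predict" list as described above. The helper name add_to_load_on_predict is hypothetical, and backing up the file first is advisable.

```python
import json
from pathlib import Path


def add_to_load_on_predict(results_path, model_name):
    """Append model_name to the load_on_predict list in params.json if absent.

    Returns True if the file was changed, False if the name was already listed.
    """
    params_file = Path(results_path) / "params.json"
    params = json.loads(params_file.read_text())
    names = params.get("load_on_predict")
    if not isinstance(names, list):
        # Do not create the list here: a list holding one model would tell
        # the loader to skip everything else.
        raise ValueError("params.json has no load_on_predict list")
    if model_name in names:
        return False
    names.append(model_name)
    params_file.write_text(json.dumps(params, indent=4))
    return True
```

For the error reported earlier in this thread, the call would be add_to_load_on_predict("AutoML_2", "19_LightGBM"), repeated for each model name that subsequent load attempts report.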

pplonski commented 3 years ago

The bug is fixed. You can try to load old results with the new code and loading should work.

You can try to install latest changes with:

pip install -q -U git+https://github.com/mljar/mljar-supervised.git@master

The fix will go into the next release, 0.11.1.

Thanks @bridgeso for help during debugging!