ensemble.json not found error when training in Compete mode with total_time_limit

Karlheinzniebuhr commented 1 year ago

After training in Compete mode, I get this error when trying to load the model:

automl = AutoML(mode='Compete', results_path=model_path, total_time_limit=24*3600, eval_metric=sign_penalty)
automl_trained = AutoML(results_path=model_path)
automl_predictions = automl_trained.predict(X_test)

FileNotFoundError                         Traceback (most recent call last)
File c:\ProgramData\Anaconda3\lib\site-packages\supervised\base_automl.py:199, in BaseAutoML.load(self, path)
    196 if model_subpath.endswith("Ensemble") or model_subpath.endswith(
    197     "Ensemble_Stacked"
    198 ):
--> 199     ens = Ensemble.load(path, model_subpath, models_map)
    200     self._models += [ens]

File c:\ProgramData\Anaconda3\lib\site-packages\supervised\ensemble.py:435, in Ensemble.load(results_path, model_subpath, models_map)
    433 logger.info(f"Loading ensemble from {model_path}")
--> 435 json_desc = json.load(open(os.path.join(model_path, "ensemble.json")))
    437 ensemble = Ensemble(json_desc.get("optimize_metric"), json_desc.get("ml_task"))

FileNotFoundError: [Errno 2] No such file or directory: 'trained_models/Compete_%_change_close_BTCUSDT_spot_15m_custom_loss+2h\\Ensemble\\ensemble.json'

During handling of the above exception, another exception occurred:

AutoMLException                           Traceback (most recent call last)
c:\dev\Python\Mastermind\mastermind\training\LAB_MLJAR_custom_loss.ipynb Cell 15 in <cell line: 2>()
      1 automl_trained = AutoML(results_path=model_path)
----> 2 automl_predictions = automl_trained.predict(X_test)
      3 pd.Series(automl_predictions).describe()

File c:\ProgramData\Anaconda3\lib\site-packages\supervised\automl.py:387, in AutoML.predict(self, X)
...
    223         self.n_classes = self._data_info["n_classes"]
    225 except Exception as e:
--> 226     raise AutoMLException(f"Cannot load AutoML directory. {str(e)}")

AutoMLException: Cannot load AutoML directory. [Errno 2] No such file or directory: 'trained_models/Compete_%_change_close_BTCUSDT_spot_15m_custom_loss+2h\\Ensemble\\ensemble.json'

The errors.md file:

## Error for Ensemble

The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Traceback (most recent call last):
  File "c:\ProgramData\Anaconda3\lib\site-packages\supervised\base_automl.py", line 1083, in _fit
    trained = self.ensemble_step(
  File "c:\ProgramData\Anaconda3\lib\site-packages\supervised\base_automl.py", line 401, in ensemble_step
    self.ensemble.fit(oofs, target, sample_weight)
  File "c:\ProgramData\Anaconda3\lib\site-packages\supervised\ensemble.py", line 237, in fit
    if self.metric.improvement(previous=min_score, current=score):
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Please set a GitHub issue with above error message at: https://github.com/mljar/mljar-supervised/issues/new

## Error for Ensemble_Stacked

The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Traceback (most recent call last):
  File "c:\ProgramData\Anaconda3\lib\site-packages\supervised\base_automl.py", line 1083, in _fit
    trained = self.ensemble_step(
  File "c:\ProgramData\Anaconda3\lib\site-packages\supervised\base_automl.py", line 401, in ensemble_step
    self.ensemble.fit(oofs, target, sample_weight)
  File "c:\ProgramData\Anaconda3\lib\site-packages\supervised\ensemble.py", line 237, in fit
    if self.metric.improvement(previous=min_score, current=score):
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Please set a GitHub issue with above error message at: https://github.com/mljar/mljar-supervised/issues/new
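
For context, this ValueError is what NumPy raises when an array with more than one element is used as the condition of an `if`. One way that can happen at the line shown in the traceback is a custom metric that returns one value per row instead of a single number. The sketch below only illustrates that mechanism: the body of `sign_penalty` is not shown in this issue, so both metric functions and the `improvement` helper are hypothetical stand-ins, not the library's code.

```python
import numpy as np

def improvement(previous, current):
    # Hypothetical stand-in for the comparison in the traceback (lower is better).
    return current < previous

def sign_penalty_per_row(y_true, y_pred, sample_weight=None):
    # Hypothetical metric: squared error, doubled when the predicted sign is wrong.
    # It returns one value per row, i.e. an array.
    err = (y_true - y_pred) ** 2
    wrong_sign = np.sign(y_true) != np.sign(y_pred)
    return err * np.where(wrong_sign, 2.0, 1.0)

def sign_penalty_scalar(y_true, y_pred, sample_weight=None):
    # Same penalty, reduced to a single float with an (optionally weighted) mean.
    per_row = sign_penalty_per_row(y_true, y_pred)
    return float(np.average(per_row, weights=sample_weight))

y_true = np.array([0.5, -0.2, 0.1])
y_pred = np.array([0.4, 0.3, 0.1])

# Array-valued score: the comparison yields a boolean array, and `if` rejects it.
try:
    if improvement(previous=10.0, current=sign_penalty_per_row(y_true, y_pred)):
        pass
    array_case_raised = False
except ValueError as exc:
    array_case_raised = True
    array_case_message = str(exc)

# Scalar score: the comparison yields a single boolean, so `if` works.
scalar_score = sign_penalty_scalar(y_true, y_pred)
scalar_case_ok = bool(improvement(previous=10.0, current=scalar_score))
```

If the metric used here does return an array in some code path, reducing it to a single float as in `sign_penalty_scalar` would avoid the ambiguous comparison.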

pplonski commented 1 year ago

Are you training and loading the model on the same machine? What version of joblib are you using?

Karlheinzniebuhr commented 1 year ago

Yes, on the same machine, with joblib==1.2.0. When I check the Ensemble and Ensemble_Stacked folders, they are empty, which makes me think that training somehow exited prematurely.
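
A quick way to confirm this before calling `predict` is to look for `ensemble.json` in each ensemble folder. This is a minimal sketch that assumes only the layout visible in the traceback (subfolders whose names end in `Ensemble` or `Ensemble_Stacked`, each expected to hold an `ensemble.json`). The helper name `find_incomplete_ensembles` is made up for illustration, and the demo runs on a throwaway directory rather than the real results folder.

```python
import tempfile
from pathlib import Path

def find_incomplete_ensembles(results_path):
    """Return the names of ensemble subfolders that have no ensemble.json."""
    missing = []
    for sub in sorted(Path(results_path).iterdir()):
        # Same suffix check as the loader code shown in the traceback.
        if sub.is_dir() and sub.name.endswith(("Ensemble", "Ensemble_Stacked")):
            if not (sub / "ensemble.json").exists():
                missing.append(sub.name)
    return missing

# Demo: a temporary directory that mimics the reported state (empty folders).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "Ensemble").mkdir()
    (Path(tmp) / "Ensemble_Stacked").mkdir()
    demo_missing = find_incomplete_ensembles(tmp)

print(demo_missing)
```

Pointing the helper at `model_path` would list which ensemble folders the loader will fail on.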

pplonski commented 1 year ago

@Karlheinzniebuhr, that might be the reason. Do you have any information about the ensemble models in the output logs or in the metrics?
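
One place to look is the errors.md file quoted above, which has a `## Error for <model>` heading for each model that failed. The sketch below lists those model names from the file's text. It assumes that heading format holds throughout the file, and `failed_models` is a hypothetical helper, not part of mljar-supervised.

```python
import re

def failed_models(errors_md_text):
    """Return the model names found in '## Error for <name>' headings."""
    pattern = r"^## Error for (\S+)[ \t]*$"
    return re.findall(pattern, errors_md_text, flags=re.MULTILINE)

# Sample text copied from the errors.md excerpt in this issue (tracebacks omitted).
sample = """## Error for Ensemble

The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

## Error for Ensemble_Stacked

The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
"""

print(failed_models(sample))
```

For a real run, the text would come from reading errors.md in the results folder instead of the inline sample.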

Karlheinzniebuhr commented 1 year ago

Not sure. Here is the output folder if you want to take a look: https://drive.google.com/file/d/1OgSf5sKK22wCtxWo6x1gDIQDGuQTgWrs/view?usp=share_link

I can try to make a reproducible example, but my pipeline is rather long. Could it be that, if the dataset is too big, even 7 days of training is too little for Compete mode to finish?

ensemble.json not found error when training in Compete mode with total_time_limit #587