aimclub / FEDOT

Automated modeling and machine learning framework FEDOT
https://fedot.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Bugs category features error / pipeline fitting #440

Closed by aPovidlo 2 years ago

aPovidlo commented 2 years ago

Branch for reproducing the bugs: raifhack_bug. The input is a tabular dataset of commercial real estate sales from Raifhack DS. The model should predict price changes. The training data was reduced by half so that it could be uploaded to the repository. Problem: running through the API causes the following errors.

FEDOT params:

from fedot.api.main import Fedot

model = Fedot(problem=problem, seed=1518,
              composer_params={'metric': ['mape'], 'pop_size': 6, 'with_tuning': False},
              verbose_level=3, timeout=1)
pipeline = model.fit(features=X_offer, target=y_offer)

1. Categorical features error. After composition finishes, the following error is raised:

Composition time: 19.168 min
Traceback (most recent call last):
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\optimisers\gp_comp\param_free_gp_optimiser.py", line 78, in optimise
    self._evaluate_individuals(self.population, objective_function, timer=t)
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\optimisers\gp_comp\gp_optimiser.py", line 397, in _evaluate_individuals
    evaluate_individuals(individuals_set=individuals_set, objective_function=objective_function,
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\optimisers\gp_comp\gp_operators.py", line 127, in evaluate_individuals
    raise AttributeError('Too much fitness evaluation errors. Composing stopped.')
AttributeError: Too much fitness evaluation errors. Composing stopped.

2. Pipeline fitting. If you run without the categorical features, the following error is raised:

Fit pipeline from scratch
  0%|          | 0/1000 [00:00<?, ?trial/s, best loss=?]Fit pipeline from scratch
  0%|          | 1/1000 [03:06<51:45:03, 186.49s/trial, best loss: 93291.69430599359]
Fit pipeline from scratch
Hyperparameters optimization finished
Return init pipeline due to the fact that obtained metric 93291.694 bigger than initial (+ 5% deviation) 88659.155
Tuning was finished
Model composition finished
Fit pipeline from scratch
Traceback (most recent call last):
  File "C:\Program Files\JetBrains\PyCharm 2019.2.1\helpers\pydev\pydevd.py", line 1415, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2019.2.1\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/andre/Documents/GitHub/FEDOT/raifhack_main.py", line 220, in <module>
    prediction = model.predict(features=test_data)
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\api\main.py", line 131, in predict
    self.prediction = self.helper.define_predictions(task_type=self.composer_dict['task'].task_type,
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\api\api_utils\data.py", line 41, in define_predictions
    prediction = current_pipeline.predict(test_data)
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\pipelines\pipeline.py", line 250, in predict
    result = self.root_node.predict(input_data=copied_input_data, output_mode=output_mode)
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\pipelines\node.py", line 269, in predict
    secondary_input = self._input_from_parents(input_data=input_data,
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\pipelines\node.py", line 283, in _input_from_parents
    parent_results, target = _combine_parents(parent_nodes, input_data,
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\pipelines\node.py", line 317, in _combine_parents
    prediction = parent.predict(input_data=input_data)
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\pipelines\node.py", line 269, in predict
    secondary_input = self._input_from_parents(input_data=input_data,
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\pipelines\node.py", line 283, in _input_from_parents
    parent_results, target = _combine_parents(parent_nodes, input_data,
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\pipelines\node.py", line 317, in _combine_parents
    prediction = parent.predict(input_data=input_data)
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\pipelines\node.py", line 211, in predict
    return super().predict(input_data, output_mode)
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\pipelines\node.py", line 124, in predict
    operation_predict = self.operation.predict(fitted_operation=self.fitted_operation,
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\operations\operation.py", line 109, in predict
    prediction = self._eval_strategy.predict(
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\operations\evaluation\common_preprocessing.py", line 54, in predict
    prediction = trained_operation.transform(predict_data,
  File "C:\Users\andre\Documents\GitHub\FEDOT\fedot\core\operations\evaluation\operation_implementations\data_operations\sklearn_transformations.py", line 338, in transform
    categorical_features = self.imputer_cat.transform(categorical.features)
  File "C:\Anaconda3\envs\fedot\lib\site-packages\sklearn\impute\_base.py", line 430, in transform
    check_is_fitted(self)
  File "C:\Anaconda3\envs\fedot\lib\site-packages\sklearn\utils\validation.py", line 63, in inner_f
    return f(*args, **kwargs)
  File "C:\Anaconda3\envs\fedot\lib\site-packages\sklearn\utils\validation.py", line 1098, in check_is_fitted
    raise NotFittedError(msg % {'name': type(estimator).__name__})
sklearn.exceptions.NotFittedError: This SimpleImputer instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
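
The last frames of the traceback above show `self.imputer_cat.transform(...)` being called on a `SimpleImputer` that was never fitted. The sketch below reproduces that failure with scikit-learn alone and shows one possible guard. It is not FEDOT code: the `GuardedCategoricalImputer` class, its `is_fitted` flag, and the assumption that the imputer is skipped at fit time because the training data has no categorical columns are all hypothetical.

```python
import numpy as np
from sklearn.exceptions import NotFittedError
from sklearn.impute import SimpleImputer


class GuardedCategoricalImputer:
    """Hypothetical wrapper (not FEDOT code) around a categorical SimpleImputer."""

    def __init__(self):
        self.imputer_cat = SimpleImputer(strategy='most_frequent')
        self.is_fitted = False

    def fit(self, categorical_features):
        # Assumption: fitting is skipped when the training data has no categorical columns.
        if categorical_features.shape[1] > 0:
            self.imputer_cat.fit(categorical_features)
            self.is_fitted = True
        return self

    def transform(self, categorical_features):
        # Guard: pass the data through unchanged instead of raising NotFittedError.
        if not self.is_fitted:
            return categorical_features
        return self.imputer_cat.transform(categorical_features)


def reproduce_not_fitted_error():
    """Return True if transform on a never-fitted SimpleImputer raises NotFittedError."""
    imputer = SimpleImputer(strategy='most_frequent')
    test_cat = np.array([['a'], ['b'], [np.nan]], dtype=object)
    try:
        imputer.transform(test_cat)
    except NotFittedError:
        return True
    return False


if __name__ == '__main__':
    print(reproduce_not_fitted_error())
```

Whether passing the data through unchanged is the right behaviour depends on what the downstream nodes expect; raising a clearer error at fit time would be an equally reasonable design.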

nicl-nno commented 2 years ago

@Dreamlone

Has this been fixed in the new preprocessing PR?

Dreamlone commented 2 years ago

I don't know, I haven't checked yet, but I think it can be validated once I fix the tests in the preprocessing PR.

nicl-nno commented 2 years ago

It seems it can be validated now.

Dreamlone commented 2 years ago

Update: I now get the following error: [image]

So it is not fixed yet, although the errors described above seem to be gone.

Dreamlone commented 2 years ago

I applied the changes from #515 to the branch and checked: everything works now.