aimclub / FEDOT

Automated modeling and machine learning framework FEDOT
https://fedot.readthedocs.io
BSD 3-Clause "New" or "Revised" License

[Bug]: Invalid fitness after objective evaluation. Skipping the graph: (/n_scaling;)/n_rf_{'n_jobs':32} #1273

Open DRMPN opened 3 months ago

DRMPN commented 3 months ago

Expected Behavior

The `get_metrics()` method calculates the ROC AUC score for a target column of type bool in a tabular data classification problem.

Current Behavior

For some reason, the `get_metrics()` method fails after the data has been fitted successfully. (screenshot)

There seems to be a problem with the tabular data preprocessing of the target: surprisingly, it is an empty array. (screenshot)

This is likely just the tip of the iceberg, as the `num_classes` method is already called with an empty array. (screenshot)

Possible Solution

Steps to Reproduce

  1. Download the data from https://www.kaggle.com/competitions/spaceship-titanic
  2. Create and run a Jupyter Notebook using the following snippet (paths are omitted for simplicity):
    import pandas as pd
    from fedot.api.main import Fedot

    train = pd.read_csv("train.csv")
    test = pd.read_csv("test.csv")
    model = Fedot(problem='classification', metric='roc_auc', preset="best_quality")
    best_pipeline = model.fit(features=train, target='Transported')
    prediction = model.predict(features=test)
    model.plot_prediction()
    model.get_metrics()  # fails at this call (see Current Behavior)
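
One thing worth ruling out while reproducing: Kaggle's `test.csv` for this competition ships without the `Transported` column, so there are no true labels to score the `test` predictions against. The sketch below uses small in-memory stand-ins for the two files (the row values are made up) and a hypothetical helper, `split_holdout`, to check for the target and carve a labeled hold-out out of the training data. It is only a diagnostic sketch under those assumptions, not a confirmed cause of the error.

```python
import io

import pandas as pd

# Made-up stand-ins for train.csv / test.csv; only the column layout matters.
train = pd.read_csv(io.StringIO(
    "PassengerId,Age,Transported\n"
    "0001_01,39,False\n"
    "0002_01,24,True\n"
    "0003_01,58,False\n"
    "0004_01,33,True\n"
))
test = pd.read_csv(io.StringIO(
    "PassengerId,Age\n"
    "0013_01,27\n"
    "0018_01,19\n"
))

TARGET = "Transported"


def split_holdout(df, frac=0.25, seed=0):
    """Hypothetical helper: return (fit_part, holdout_part) of a labeled frame."""
    holdout = df.sample(frac=frac, random_state=seed)
    return df.drop(holdout.index), holdout


# The stand-in test frame has no target column, so no labels are available for it.
has_labels = TARGET in test.columns

# A hold-out taken from the training data keeps its true labels.
fit_part, holdout = split_holdout(train)
print(has_labels, len(fit_part), len(holdout))
```

With a labeled hold-out, true labels exist to compare predictions against, for example by predicting on `holdout` instead of `test` before requesting metrics.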

Context [OPTIONAL]

The target column is loaded as a bool type. (screenshot)
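
To confirm the dtype without the full dataset, here is a minimal sketch with made-up rows: pandas parses a column of `True`/`False` literals as `bool`. Casting the column to integers is one possible workaround to try; whether it avoids the error is an assumption and has not been verified here.

```python
import io

import pandas as pd

# Made-up rows that mimic the layout of the target column in train.csv.
csv_text = "PassengerId,Transported\n0001_01,False\n0002_01,True\n"
df = pd.read_csv(io.StringIO(csv_text))

# pandas infers bool for a column that contains only True/False literals.
target_dtype = df["Transported"].dtype

# Possible workaround (assumption): encode the target as 0/1 before fitting.
df["Transported"] = df["Transported"].astype(int)
encoded = df["Transported"].tolist()
print(target_dtype, encoded)
```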

Documentation References:
https://fedot.readthedocs.io/en/latest/introduction/fedot_features/main_features.html
https://fedot.readthedocs.io/en/latest/advanced/data_preprocessing.html

nicl-nno commented 3 months ago

Related problem: https://github.com/aimclub/FEDOT/pull/1274