[Bug]: Invalid fitness after objective evaluation. Skipping the graph: (/n_scaling;)/n_rf_{'n_jobs':32}

aimclub / FEDOT

Automated modeling and machine learning framework FEDOT

https://fedot.readthedocs.io

BSD 3-Clause "New" or "Revised" License

619 stars, 84 forks

[Bug]: Invalid fitness after objective evaluation. Skipping the graph: (/n_scaling;)/n_rf_{'n_jobs':32} #1273

Open DRMPN opened 3 months ago

DRMPN commented 3 months ago

Expected Behavior

The `get_metrics()` method calculates the ROC AUC score for a target column of type bool in a tabular data classification problem.
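For reference, ROC AUC on a bool target can be read as the probability that a randomly chosen positive sample is scored above a randomly chosen negative one. The sketch below is a plain-Python illustration of that definition, not FEDOT's implementation; the function name `roc_auc` and the sample values are made up for the example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[bool], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a positive outranks a negative."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    positives = [s for y, s in zip(labels, scores) if y]
    negatives = [s for y, s in zip(labels, scores) if not y]
    if not positives or not negatives:
        # Covers an empty target as well as a single-class target.
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    # Count correctly ordered positive/negative pairs; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


labels = [True, False, True, False]
scores = [0.9, 0.2, 0.4, 0.6]
print(roc_auc(labels, scores))  # 0.75
```

Note that the metric is undefined for an empty target, which is why this sketch raises an error in that case.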

Current Behavior

The `get_metrics()` method fails even though the model was fitted successfully.

The problem seems to be in the preprocessing of the target for tabular data: surprisingly, the target ends up as an empty array.

This is likely just the tip of the iceberg, as the `num_classes` method is already called with an empty array.

Possible Solution

Use a debugger to find the place in the source code where the target transformation does not take place.
Make appropriate changes to the code and check that the proposed solution works correctly.
Prepare unit tests.
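The unit tests mentioned above could take roughly the following shape. This is a minimal sketch under stated assumptions: `encode_bool_target` and `count_classes` are hypothetical stand-ins written for this example, not FEDOT functions, and the real tests would have to target whichever preprocessing step turns out to be responsible.

```python
from typing import List, Sequence


def encode_bool_target(target: Sequence[bool]) -> List[int]:
    """Hypothetical helper: map a bool target to 0/1 integer labels."""
    if len(target) == 0:
        raise ValueError("target is empty; nothing to evaluate")
    return [int(value) for value in target]


def count_classes(target: Sequence[int]) -> int:
    """Hypothetical stand-in for a class counter."""
    return len(set(target))


def test_bool_target_is_encoded() -> None:
    encoded = encode_bool_target([True, False, True])
    assert encoded == [1, 0, 1]
    assert count_classes(encoded) == 2


def test_empty_target_is_rejected() -> None:
    try:
        encode_bool_target([])
    except ValueError:
        return
    raise AssertionError("an empty target should raise ValueError")


# Run the checks directly; a test runner such as pytest would also collect them.
test_bool_target_is_encoded()
test_empty_target_is_rejected()
```

Failing early on an empty target gives a clearer error than letting the empty array reach the metric computation.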

Steps to Reproduce

Download the data from https://www.kaggle.com/competitions/spaceship-titanic

Create and run a Jupyter Notebook with the following snippet (paths are omitted for simplicity):

import pandas as pd
from fedot.api.main import Fedot

train = pd.read_csv("train.csv")
test = pd.read_csv("test.csv")
# Optimize pipelines for ROC AUC with the "best_quality" preset
model = Fedot(problem='classification', metric='roc_auc', preset="best_quality")
best_pipeline = model.fit(features=train, target='Transported')
prediction = model.predict(features=test)
model.plot_prediction()
# Fails here: the target used for metric evaluation is an empty array
model.get_metrics()

Context

The target column is loaded as a bool type.
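The bool dtype can be confirmed with pandas alone. The sketch below uses a tiny inline CSV as a stand-in for `train.csv` (the rows are made up) and shows a cast to integers as one possible workaround; whether that cast avoids the failure in FEDOT is an assumption, not something confirmed in this issue.

```python
import io

import pandas as pd

# Toy stand-in for train.csv; the real file has many more columns.
csv_text = "PassengerId,Transported\n0001_01,False\n0002_01,True\n0003_01,True\n"
train = pd.read_csv(io.StringIO(csv_text))

# pandas infers bool for a column that contains only True/False literals.
original_dtype = train["Transported"].dtype
print(original_dtype)

# Possible workaround (an assumption, not a confirmed fix): cast to 0/1 integers.
train["Transported"] = train["Transported"].astype(int)
print(train["Transported"].tolist())
```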

Documentation References:
https://fedot.readthedocs.io/en/latest/introduction/fedot_features/main_features.html
https://fedot.readthedocs.io/en/latest/advanced/data_preprocessing.html

nicl-nno commented 3 months ago

Related problem: https://github.com/aimclub/FEDOT/pull/1274