mlflow / mlflow

Open source platform for the machine learning lifecycle
https://mlflow.org
Apache License 2.0

[FR] Add class_probability to the input "eval_df" of "custom_metrics" function in recipes #10323

Open e-taghizadeh opened 1 year ago

e-taghizadeh commented 1 year ago

Willingness to contribute

Yes. I would be willing to contribute this feature with guidance from the MLflow community.

Proposal Summary

Some metrics can only be computed from the model's predicted class probabilities. I propose adding a class_probability column to the "eval_df" input of the "custom_metrics" function.

Motivation

What is the use case for this feature?

For some classification algorithms, I'd like to use precision_recall_auc as the primary metric. However, I cannot implement it as a custom metric, because it requires the output probability.
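To illustrate the use case, here is a rough sketch of what such a custom metric could look like if the proposed column existed. The "class_probability" column name is the one proposed in this issue and is not available in "eval_df" today. The (eval_df, builtin_metrics) signature and the "target" column are my understanding of the current recipes custom metric interface. The sketch uses scikit-learn's average_precision_score, which summarizes the precision-recall curve as a step-wise sum rather than a trapezoidal area.

```python
from typing import Dict

import pandas as pd
from sklearn.metrics import average_precision_score


def precision_recall_auc(eval_df: pd.DataFrame, builtin_metrics: Dict[str, float]) -> float:
    """Sketch of a custom metric that needs predicted probabilities.

    Assumption: eval_df carries a "class_probability" column holding the
    positive-class probability. This column is the one proposed in this
    issue; it does not exist in eval_df today.
    """
    if "class_probability" not in eval_df.columns:
        # No probabilities available, so the metric cannot be computed.
        return float("nan")
    if eval_df["class_probability"].isna().all():
        # Column present but empty (estimator without probability output).
        return float("nan")
    return float(
        average_precision_score(eval_df["target"], eval_df["class_probability"])
    )


# Toy binary-classification data for illustration only.
example_df = pd.DataFrame(
    {
        "target": [0, 0, 1, 1],
        "prediction": [0, 0, 1, 1],
        "class_probability": [0.1, 0.4, 0.35, 0.8],
    }
)
pr_auc = precision_recall_auc(example_df, builtin_metrics={})
```

With only "prediction" and "target" available, the ranking information that this metric depends on is lost, which is why it cannot be written as a custom metric today.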

Why is this use case valuable to support for MLflow users in general?

For imbalanced datasets, probability-based metrics such as precision_recall_auc are commonly recommended in the community.

Some Thoughts

For estimators that do not support "predict_proba", these columns could be removed from the "eval_df" DataFrame or set to None. Currently, the precision_recall_auc and roc_auc values are computed automatically in the "evaluation" step if the estimator meets certain requirements; a similar approach could be used in recipes.
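A minimal sketch of the fallback idea, assuming a binary classifier and a single "class_probability" column. build_eval_df and the two toy model classes are hypothetical names used only for illustration; they are not part of MLflow.

```python
import numpy as np
import pandas as pd


def build_eval_df(model, features: pd.DataFrame, target: pd.Series) -> pd.DataFrame:
    """Hypothetical helper (not an MLflow API) that assembles eval_df."""
    eval_df = pd.DataFrame(
        {"target": target.to_numpy(), "prediction": model.predict(features)}
    )
    if hasattr(model, "predict_proba"):
        proba = np.asarray(model.predict_proba(features))
        # Binary case assumed: keep the positive-class probability.
        eval_df["class_probability"] = proba[:, 1]
    else:
        # Estimator has no probability output: fill the column with None.
        eval_df["class_probability"] = None
    return eval_df


class ThresholdModel:
    """Toy estimator that only supports predict()."""

    def predict(self, X: pd.DataFrame) -> np.ndarray:
        return (X["x"] > 0.5).astype(int).to_numpy()


class ProbabilisticModel(ThresholdModel):
    """Toy estimator that also supports predict_proba()."""

    def predict_proba(self, X: pd.DataFrame) -> np.ndarray:
        p = X["x"].to_numpy()
        return np.column_stack([1.0 - p, p])


features = pd.DataFrame({"x": [0.2, 0.9]})
target = pd.Series([0, 1])

with_proba = build_eval_df(ProbabilisticModel(), features, target)
without_proba = build_eval_df(ThresholdModel(), features, target)
```

Setting the column to None instead of dropping it keeps the "eval_df" schema stable, so a custom metric can check for missing values rather than for a missing column.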

Details

No response

github-actions[bot] commented 1 year ago

@mlflow/mlflow-team Please assign a maintainer and start triaging this issue.