alteryx / evalml

EvalML is an AutoML library written in Python.
https://evalml.alteryx.com
BSD 3-Clause "New" or "Revised" License

Invalid target data check: recommended action should be to drop rows with nans #3247

Open dsherry opened 2 years ago

dsherry commented 2 years ago

It looks like the default action for regression is currently "impute": https://github.com/alteryx/evalml/blob/main/evalml/data_checks/invalid_target_data_check.py#L247

I think a) one of the actions should be to drop rows with missing target values and b) this should be the recommended action.
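To make the proposed "drop rows" action concrete, here is a minimal, dependency-free sketch of what it would do to a dataset. This is not EvalML's implementation or API: the function name `drop_rows_with_missing_target` and the list-based `X` and `y` are hypothetical, chosen only for illustration. With pandas data, the same effect would typically come from `y.dropna()` followed by selecting the matching rows of `X`.

```python
import math


def drop_rows_with_missing_target(X, y):
    """Return copies of X and y without the rows whose target is None or NaN.

    Hypothetical helper for illustration only; not part of EvalML.
    """

    def is_missing(value):
        # Treat both None and float NaN as a missing target value.
        return value is None or (isinstance(value, float) and math.isnan(value))

    # Indices of the rows whose target value is present.
    keep = [i for i, target in enumerate(y) if not is_missing(target)]
    return [X[i] for i in keep], [y[i] for i in keep]


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
y = [10.0, float("nan"), 30.0, None]

X_clean, y_clean = drop_rows_with_missing_target(X, y)
print(X_clean)  # [[1.0, 2.0], [5.0, 6.0]]
print(y_clean)  # [10.0, 30.0]
```

The features and the target are filtered with the same index list, so the two stay aligned after the rows are dropped.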

Why: imputing the target is cool and can apply interesting modeling pressure in some cases. But rewriting the target is also dangerous!
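One way to see the danger is a small sketch using only the standard library. It assumes mean imputation, which may not be the strategy EvalML actually applies, and the numbers are made up for illustration. Filling missing targets with the mean adds fabricated labels and shrinks the variance of the target the model is trained on.

```python
import math
import statistics

# Made-up regression target with two missing values.
y = [10.0, float("nan"), 20.0, float("nan"), 30.0]

# Option 1: drop the missing targets and keep only observed values.
observed = [value for value in y if not math.isnan(value)]

# Option 2: mean-impute the missing targets (assumed strategy, for illustration).
fill_value = statistics.mean(observed)
imputed = [fill_value if math.isnan(value) else value for value in y]

# The imputed target has the same mean but a smaller spread, and two of its
# five labels were never observed.
print(statistics.pvariance(observed))
print(statistics.pvariance(imputed))
```

Dropping the rows loses two training examples but leaves every remaining label genuine, whereas imputing keeps the rows at the cost of training on invented targets.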

chukarsten commented 2 years ago

For clarity's sake, the purpose of this issue is to simply change the default action to dropping rows with null target values.