alteryx / evalml

EvalML is an AutoML library written in Python.
https://evalml.alteryx.com
BSD 3-Clause "New" or "Revised" License
735 stars, 83 forks

Create DataCheck for Unknown types #2478

Open bchen1116 opened 3 years ago

bchen1116 commented 3 years ago

This issue was brought up here as we integrate the new WW update into EvalML. Primarily, we want to raise a data check warning/error when the dataset a user passes in has a large number of Unknown-type columns, since we drop these columns in AutoMLSearch.
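
To make the proposal concrete, here is a rough sketch of what such a check might look like. It is not EvalML's or Woodwork's actual API: the function name `check_unknown_columns`, the `max_unknown_fraction` parameter, and the shape of the returned warning dict are all hypothetical. The column types are passed in as a plain dict of column name to logical type name so the example stays self-contained.

```python
from typing import Dict, List


def check_unknown_columns(
    logical_types: Dict[str, str],
    max_unknown_fraction: float = 0.0,
) -> List[dict]:
    """Hypothetical sketch: warn when too many columns have the 'Unknown' type.

    logical_types maps column name -> logical type name (assumed input format).
    max_unknown_fraction is an assumed threshold; the default of 0.0 means
    any Unknown-typed column triggers a warning.
    """
    if not logical_types:
        return []

    # Collect the columns whose inferred type is "Unknown", in input order.
    unknown = [name for name, ltype in logical_types.items() if ltype == "Unknown"]
    fraction = len(unknown) / len(logical_types)

    if unknown and fraction > max_unknown_fraction:
        return [
            {
                "level": "warning",
                "message": (
                    f"{len(unknown)} of {len(logical_types)} columns have the "
                    "Unknown type and would be dropped before modeling."
                ),
                "columns": unknown,
            }
        ]
    return []
```

Called with something like `{"a": "Integer", "b": "Unknown"}`, this would return one warning listing column `b`; raising the threshold lets a few Unknown columns pass silently. Whether the result should be a warning or an error is left open here.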

dsherry commented 2 years ago

@bchen1116 got it, agreed that having a data check which raises a warning if there's a ton of unknown-typed features would be helpful. Not required to support unknown types in evalml though, right?

dsherry commented 2 years ago

We agree it would be helpful to have a data check which alerts users if they're trying to model with any unknown-typed columns. The "unknown" type is intended to designate the case where type inference was unable to determine the most likely type of the column and the user must tell our code what type that column should have.

So, two thoughts