When a dataset has too many columns and a significant number of rows, set disable_column_importance to True instead of turning on quick_learn. This should remove one of the costliest parts of the model analysis step for this type of dataset.
A column can now have a broken flag (None by default, otherwise containing 2 keys: failed_at and reason). A "broken" column is one for which we fail to detect a type or subtype; this should usually only happen because the sampled values were all NaN or None. These "broken" columns are then ignored in the same way we ignore columns that are identifiers or that the user asks us to ignore.
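As a rough illustration of the broken flag described above, here is a minimal sketch in plain Python. It is not the library's implementation: the helper names (detect_type, analyze_column, columns_to_ignore), the layout of the stats dictionary, and the example values stored under failed_at and reason are all assumptions made for this example. Only the flag's shape (None by default, otherwise the two keys failed_at and reason) comes from the notes above.

```python
import math


def is_missing(value):
    """True for None and for float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def detect_type(sample):
    """Hypothetical stand-in for type detection.

    Returns None when no type can be inferred from the sampled values.
    """
    present = [v for v in sample if not is_missing(v)]
    if not present:
        return None
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in present):
        return "numeric"
    return "text"


def analyze_column(sample):
    """Build per-column stats; 'broken' stays None unless detection fails."""
    col_type = detect_type(sample)
    broken = None
    if col_type is None:
        # The two keys come from the release notes; the values are assumed.
        broken = {
            "failed_at": "type_detection",
            "reason": "all sampled values were NaN or None",
        }
    return {"type": col_type, "broken": broken}


def columns_to_ignore(stats, identifiers=(), user_ignored=()):
    """Broken columns are ignored alongside identifiers and user-ignored columns."""
    ignored = set(identifiers) | set(user_ignored)
    ignored |= {name for name, s in stats.items() if s["broken"] is not None}
    return sorted(ignored)


samples = {
    "id": [1, 2, 3],
    "price": [1.5, None, 2.0],
    "notes": [None, float("nan"), None],
}
stats = {name: analyze_column(values) for name, values in samples.items()}
ignored = columns_to_ignore(stats, identifiers=["id"])
```

In this sketch, notes ends up with a broken flag because every sampled value is missing, while price keeps broken set to None because at least one usable value was sampled.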