Open angela97lin opened 3 years ago
One pro of this approach is that we don't need to store the indices to remove in the output of the data check, since we recalculate them here. This means that it is incredibly important that the two methods of detecting NaN rows stay at parity at all times.
https://github.com/alteryx/evalml/pull/2692 introduced a generic `DropRowsTransformer`, but based on the thread here, it'd be a good idea to introduce a data-agnostic component which detects and drops NaN rows. This could probably be done by subclassing `DropRowsTransformer` and adding NaN detection logic!
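A rough sketch of what the subclassing idea might look like, using plain pandas so it runs on its own. `RowDropper` is a hypothetical stand-in for the generic row-dropping component, not evalml's actual `DropRowsTransformer`, and `NaNRowDropper`, the `indices_to_drop` attribute, and the "any NaN in the row or target" rule are all assumptions made for illustration:

```python
import numpy as np
import pandas as pd


class RowDropper:
    """Hypothetical stand-in for a generic row-dropping component."""

    def __init__(self, indices_to_drop=None):
        self.indices_to_drop = list(indices_to_drop) if indices_to_drop is not None else []

    def transform(self, X, y=None):
        # Drop the stored index labels from the features and, if given, the target.
        X_t = X.drop(index=self.indices_to_drop)
        y_t = y.drop(index=self.indices_to_drop) if y is not None else None
        return X_t, y_t


class NaNRowDropper(RowDropper):
    """Hypothetical subclass that recomputes the NaN rows from the data itself."""

    def transform(self, X, y=None):
        # Assumption: a row counts as a "NaN row" if any feature value is missing.
        nan_rows = X.isna().any(axis=1)
        if y is not None:
            # Assumption: a missing target also marks the row for removal.
            nan_rows = nan_rows | y.isna()
        self.indices_to_drop = list(X.index[nan_rows])
        return super().transform(X, y)


X = pd.DataFrame({"a": [1.0, np.nan, 3.0, 4.0], "b": [1.0, 2.0, 3.0, 4.0]})
y = pd.Series([0.0, 1.0, np.nan, 1.0])
X_t, y_t = NaNRowDropper().transform(X, y)
```

Because the indices are recomputed inside `transform`, nothing has to be passed in from the data check, but the detection rule here would need to match whatever the data check uses.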