SeldonIO / alibi-detect

Algorithms for outlier, adversarial and drift detection
https://docs.seldon.io/projects/alibi-detect/en/stable/

Review data_type in meta #807

Open mauicv opened 1 year ago

mauicv commented 1 year ago

Not isolated to this PR, but noting that we seem to be a little inconsistent across the new and old outlier detectors with respect to when data_type is hard-coded and when it is optionally set via a kwarg. For some, it is hard-coded to time-series (which makes sense); for some (e.g. the old Mahalanobis), it is set via a kwarg; and for some, it is hard-coded to numeric. Maybe worth opening an issue to review this more generally?

Already mentioned in https://github.com/SeldonIO/alibi-detect/issues/567#issuecomment-1193853273, but highlighting here since we are setting data_type in new detectors too...

_Originally posted by @ascillitoe in https://github.com/SeldonIO/alibi-detect/pull/746#discussion_r1210093971_

mauicv commented 1 year ago

In the case of the new outlier detectors, the expectation is that they're all tabular-numeric. If the user has image or text data, they need to do some preprocessing first. This assumption doesn't hold for detectors like the old Mahalanobis outlier detector, which, for instance, can take categorical or numeric data.
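
To make the inconsistency concrete, here is a minimal sketch of the three patterns described in this thread. It is not the alibi-detect source: the class names (`BaseDetectorSketch`, `NumericDetectorSketch`, `TimeSeriesDetectorSketch`, `KwargDetectorSketch`) and the helper `is_data_type_configurable` are hypothetical, and the only thing taken from the discussion is that each detector records a `data_type` entry in its `meta` dict.

```python
import inspect
from typing import Any, Dict, Optional


class BaseDetectorSketch:
    """Hypothetical base class holding detector metadata (an assumption, not the library's code)."""

    def __init__(self) -> None:
        self.meta: Dict[str, Any] = {"detector_type": "outlier", "data_type": None}


class NumericDetectorSketch(BaseDetectorSketch):
    """Pattern 1: data_type hard-coded to numeric, as assumed for the new tabular-numeric detectors."""

    def __init__(self) -> None:
        super().__init__()
        self.meta["data_type"] = "numeric"


class TimeSeriesDetectorSketch(BaseDetectorSketch):
    """Pattern 2: data_type hard-coded to time-series."""

    def __init__(self) -> None:
        super().__init__()
        self.meta["data_type"] = "time-series"


class KwargDetectorSketch(BaseDetectorSketch):
    """Pattern 3: data_type optionally supplied by the caller via a kwarg."""

    def __init__(self, data_type: Optional[str] = None) -> None:
        super().__init__()
        self.meta["data_type"] = data_type


def is_data_type_configurable(detector_cls: type) -> bool:
    """Return True if the constructor exposes a `data_type` parameter."""
    return "data_type" in inspect.signature(detector_cls.__init__).parameters


if __name__ == "__main__":
    for cls in (NumericDetectorSketch, TimeSeriesDetectorSketch, KwargDetectorSketch):
        print(cls.__name__, "configurable:", is_data_type_configurable(cls))
```

A check like `is_data_type_configurable` could be one way to audit which detectors fall into which group before deciding on a single convention.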