OutlierDetectionJL / OutlierDetection.jl

Fast, scalable and flexible Outlier Detection with Julia
https://outlierdetectionjl.github.io/OutlierDetection.jl/dev/
MIT License
79 stars 8 forks source link

Unsupervised detectors and missing values in provided target #34

Closed ablaom closed 1 year ago

ablaom commented 1 year ago

As I recall, we allow the specification of a target y when training an unsupervised detector, even though it is ignored, so that we may buy into MLJ's evaluate apparatus: If you have test labels, but no train labels, you just pad y with missings in the training indices.

It seems to me that the target_scitype should therefore be AbstractVector{<:Union{OrderedFactor{2},Missing}} in those cases, but I see that it is actually AbstractVector{OrderedFactor{2}} (at least it is for ABODDetector that I have been playing with). The bottom line is that MLJ is complaining when checking scitypes for these models, if the target presented has missings. (Actually, MLJBase <= 0.19.8 is just throwing an error, because of a bug; MLJBase >=0.20 would throw a warning, I guess, if it was compatible with OutlierDetection, which is isn't currently.)

@davnn Do you see any reason not to expand the scitype?

@JosephsDavid

davnn commented 1 year ago

Hi, I recall it as a temporary fix, because missing values were causing some problems when using other MLJ features, e.g. ensemble models, but I think you have addressed that in https://github.com/JuliaAI/MLJEnsembles.jl/commit/3e55cb6e85ff2dfb3fbfc3fb6f86b62ee4fea138. Otherwise, I think we can expand the scitype and fix potential problems in MLJ down the line.

ablaom commented 1 year ago

but I think you have addressed that in https://github.com/JuliaAI/MLJEnsembles.jl/commit/3e55cb6e85ff2dfb3fbfc3fb6f86b62ee4fea138.

Yes, I see. That should be fixed as you say. And if for some reason it's not, then that should be regarded as a bug to be promptly addressed.

Are you able to make the necessary PR to expand the scitypes?

davnn commented 1 year ago

I will expand the scitypes for the next release (which should be 0.20 compatible).

davnn commented 1 year ago

This point is also released now with https://github.com/OutlierDetectionJL/OutlierDetectionInterface.jl/releases/tag/v0.1.9