Open hpaldan opened 1 year ago
I totally didn't understand that the arrows were for comments. Rookie mistake.
I still had to make some minor fixes to my bad description.
Hey! The reason might be that pipelines only support
const SUPPORTED_TYPES_FOR_PIPELINES = [
    :Deterministic,
    :Probabilistic,
    :Interval,
    :Unsupervised,
    :Static]
models, but outlier detection algorithms are currently modeled as a separate entity (Annotator <: Model) in MLJ.
[...] Detector directly from Unsupervised or Supervised, but that too would require some major changes.
In the meantime, however, you could use the learning networks API directly to achieve your desired pipeline:
using MLJ
using OutlierDetection
using DataFrames
fake_dataframe = DataFrame(A=rand(100) .- 10 .* 10, B=rand(100) .+ 10 .* 10)
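# Load the detector models
LOF = @iload LOFDetector() pkg = OutlierDetectionNeighbors
IForest = @iload IForestDetector() pkg = OutlierDetectionPython
# Learning network: standardize first, then apply each detector
Xs = source(fake_dataframe)
Xstd = MLJ.transform(machine(Standardizer(), Xs), Xs)
# Note: despite their names, lof_mach and forest_mach are network nodes, not machines
lof_mach = MLJ.transform(machine(LOF(), Xstd), Xstd)
forest_mach = MLJ.transform(machine(IForest(), Xstd), Xstd)
# Fitting a node trains every machine upstream of it; calling the node on data runs the whole chain
fit!(lof_mach)
lof_mach(fake_dataframe)
fit!(forest_mach)
forest_mach(fake_dataframe)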
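To illustrate why the detectors fall outside the supported pipeline types, here is a toy sketch in plain Julia. Every name in it (`Model`, `Unsupervised`, `Annotator`, `ToyStandardizer`, `ToyDetector`, `TOY_SUPPORTED`, `is_supported`) is a hypothetical stand-in defined in the snippet itself. None of it is MLJ's actual code, and the real check may well be implemented differently. It only shows the general idea: a type on a separate branch of the hierarchy does not pass a whitelist built from the other branches.

```julia
# Toy stand-ins only; these are NOT the real MLJ definitions.
abstract type Model end
abstract type Unsupervised <: Model end
abstract type Annotator <: Model end   # separate branch, as described above

struct ToyStandardizer <: Unsupervised end
struct ToyDetector <: Annotator end

# Hypothetical whitelist, mirroring the idea behind SUPPORTED_TYPES_FOR_PIPELINES.
const TOY_SUPPORTED = (Unsupervised,)

# A model is "supported" if it is an instance of any whitelisted abstract type.
is_supported(model) = any(T -> model isa T, TOY_SUPPORTED)

is_supported(ToyStandardizer())  # true
is_supported(ToyDetector())      # false
```

In this toy setup, making `ToyDetector` a subtype of `Unsupervised` would make the check pass, which is roughly the kind of change to the type hierarchy discussed above.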
All right, too bad that the fix would require that much work. Thank you for the fast reply and good guidance!
Describe the bug
I have a problem using the UnsupervisedDetector models in a pipeline. I have tried two different simple linear pipelines: one with a standardizer and LOFDetector, and one with a standardizer and IForestDetector. It seems that the fit! function doesn't work properly on the detector models when they are in a pipeline: no training seems to take place, and when I try to transform new data with the machine I get the error message "ERROR: MethodError: objects of type OutlierDetectionPython.IForestDetector are not callable".
To Reproduce
Hopefully the code example isn't too long.
Expected behavior
I expect the transform function to output an anomaly score from a machine that first standardizes the data and then applies some kind of detector model to it.
Additional context
I have tried the same thing (as in the code above) with other unsupervised models and it seems to work fine with them, so the problem is probably isolated to the OutlierDetection package. I've also tried a PCA model instead of a standardizer, together with an outlier detection model in a pipeline, and ran into the same problem.
Versions