OutlierDetectionJL / OutlierDetection.jl

Fast, scalable and flexible Outlier Detection with Julia
https://outlierdetectionjl.github.io/OutlierDetection.jl/dev/
MIT License

How to use the `ProbabilisticDetector` wrapper with MLJ `UnsupervisedDetector`s not defined in OutlierDetection #24

Closed: ablaom closed this issue 2 years ago

ablaom commented 2 years ago

In our design discussions, I understood that the ProbabilisticDetector and DeterministicDetector wrappers we developed could be used with any MLJ UnsupervisedDetector satisfying the API set out here. But I am struggling to do that in one particular case (the one-class SVM provided by LIBSVM.jl).

Before presenting an MWE, I wonder if I am missing something obvious. My model is a subtype of MLJModelInterface.UnsupervisedDetector, it has an MLJModelInterface.transform method that returns raw scores, and its report has a field scores for the training scores. I thought this would be sufficient to wrap the model with ProbabilisticDetector. But when I try to fit the wrapped model, I run into "Failed to apply the operation augmented_transform to the machine Machine{OneClassSVM,…}", which suggests I need to implement augmented_transform (which we removed from MLJModelInterface)?
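
For readers following along, the contract described above can be sketched roughly as follows. This is a minimal illustration, not the OneClassSVM interface code discussed in this issue: MeanDistanceDetector, the distances helper and the scoring rule are hypothetical, and the input is assumed to be a plain matrix with one observation per row. Only the UnsupervisedDetector type and the fit/transform signatures come from MLJModelInterface.

```julia
import MLJModelInterface
const MMI = MLJModelInterface

# Hypothetical toy detector: scores a point by its distance to the training mean.
mutable struct MeanDistanceDetector <: MMI.UnsupervisedDetector end

# Assumption: X is an AbstractMatrix with one observation per row.
distances(X::AbstractMatrix, center) =
    [sqrt(sum(abs2, row .- center)) for row in eachrow(X)]

function MMI.fit(::MeanDistanceDetector, verbosity, X)
    center = vec(sum(X; dims=1)) ./ size(X, 1)
    fitresult = center                          # an arbitrary object, no special supertype
    report = (scores = distances(X, center),)   # training scores exposed in the report
    return fitresult, nothing, report           # (fitresult, cache, report)
end

# transform returns raw outlier scores for new data.
MMI.transform(::MeanDistanceDetector, fitresult, Xnew) = distances(Xnew, fitresult)
```

Note that the fit result here is just a vector; nothing in this sketch makes it a subtype of a detector-specific model type.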

Here's the full stack trace. Let me know if I need to post an MWE.

julia> pmach = machine(pmodel, X) |> fit!
[ Info: Training Machine{ProbabilisticUnsupervisedCompositeDetector{,…},…}.
[ Info: Training Machine{OneClassSVM,…}.
┌ Error: Failed to apply the operation `augmented_transform` to the machine Machine{OneClassSVM,…}, which receives it's data arguments from one or more nodes in a learning network. Possibly, one of these nodes is delivering data that is incompatible with the machine's model.
│ Model (OneClassSVM):
│ input_scitype = Table{var"#s28"} where var"#s28"<:(AbstractVector{var"#s29"} where var"#s29"<:Continuous)
│ target_scitype =AbstractVector{var"#s3"} where var"#s3"<:OrderedFactor{2}
│ output_scitype =AbstractVector{var"#s63"} where var"#s63"<:Binary
│ 
│ Incoming data:
│ arg of augmented_transform    scitype
│ -------------------------------------------
│ Node{Nothing} AbstractMatrix{Continuous}
│ 
│ Learning network sources:
│ source     scitype
│ -------------------------------------------
│ Source @086   Nothing
└ @ MLJBase ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:154
┌ Error: Problem fitting the machine Machine{ProbabilisticUnsupervisedCompositeDetector{,…},…}.      
└ @ MLJBase ~/.julia/packages/MLJBase/CMT6L/src/machines.jl:594
[ Info: Running type checks... 
[ Info: Type checks okay. 
ERROR: MethodError: Cannot `convert` an object of type Tuple{LIBSVM.SVM{Bool, LIBSVM.Kernel.KERNEL}, Int64} to an object of type OutlierDetectionInterface.DetectorModel
Closest candidates are:
  convert(::Type{T}, ::T) where T at essentials.jl:205
Stacktrace:
  [1] _apply(y_plus::Tuple{Node{Machine{MLJLIBSVMInterface.OneClassSVM, true}}, Machine{MLJLIBSVMInterface.OneClassSVM, true}}, input::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})                   
    @ MLJBase ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:160
  [2] _apply
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:146 [inlined]                                                                                  
  [3] Node
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:141 [inlined]                                                                                  
  [4] #61
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:149 [inlined]                                                                                  
  [5] map(f::MLJBase.var"#61#62"{Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Tuple{SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true}}}, t::Tuple{Node{Machine{MLJLIBSVMInterface.OneClassSVM, true}}})                                      
    @ Base ./tuple.jl:213
  [6] _apply(y_plus::Tuple{Node{Nothing}}, input::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})                   
    @ MLJBase ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:148
  [7] _apply
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:146 [inlined]                                                                                  
  [8] Node
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:143 [inlined]                                                                                  
  [9] #61
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:149 [inlined]                                                                                  
 [10] map(f::MLJBase.var"#61#62"{Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Tuple{SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true}}}, t::Tuple{Node{Nothing}})
    @ Base ./tuple.jl:213
 [11] _apply(y_plus::Tuple{Node{Nothing}}, input::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})                   
    @ MLJBase ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:148
 [12] _apply
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:146 [inlined]                                                                                  
 [13] Node
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:143 [inlined]                                                                                  
 [14] return_with_scores!(network_mach::Machine{ProbabilisticUnsupervisedDetectorSurrogate, false}, model::OutlierDetection.ProbabilisticUnsupervisedCompositeDetector{(:detector,), Table{var"#s28"} where var"#s28"<:(AbstractVector{var"#s29"} where var"#s29"<:Continuous)}, verbosity::Int64, scores_train::Node{Nothing}, X::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true})
    @ OutlierDetection ~/.julia/packages/OutlierDetection/PyXhg/src/mlj_wrappers.jl:156
 [15] fit(model::OutlierDetection.ProbabilisticUnsupervisedCompositeDetector{(:detector,), Table{var"#s28"} where var"#s28"<:(AbstractVector{var"#s29"} where var"#s29"<:Continuous)}, verbosity::Int64, X::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true})
    @ OutlierDetection ~/.julia/packages/OutlierDetection/PyXhg/src/mlj_wrappers.jl:201
 [16] fit_only!(mach::Machine{OutlierDetection.ProbabilisticUnsupervisedCompositeDetector{(:detector,), Table{var"#s28"} where var"#s28"<:(AbstractVector{var"#s29"} where var"#s29"<:Continuous)}, true}; rows::Nothing, verbosity::Int64, force::Bool)
    @ MLJBase ~/.julia/packages/MLJBase/CMT6L/src/machines.jl:592
 [17] fit_only!
    @ ~/.julia/packages/MLJBase/CMT6L/src/machines.jl:545 [inlined]
 [18] #fit!#56
    @ ~/.julia/packages/MLJBase/CMT6L/src/machines.jl:659 [inlined]
 [19] fit!
    @ ~/.julia/packages/MLJBase/CMT6L/src/machines.jl:657 [inlined]
 [20] |>(x::Machine{OutlierDetection.ProbabilisticUnsupervisedCompositeDetector{(:detector,), Table{var"#s28"} where var"#s28"<:(AbstractVector{var"#s29"} where var"#s29"<:Continuous)}, true}, f::typeof(fit!))  
    @ Base ./operators.jl:858
 [21] top-level scope
    @ REPL[151]:1

caused by: MethodError: Cannot `convert` an object of type Tuple{LIBSVM.SVM{Bool, LIBSVM.Kernel.KERNEL}, Int64} to an object of type OutlierDetectionInterface.DetectorModel
Closest candidates are:
  convert(::Type{T}, ::T) where T at essentials.jl:205
Stacktrace:
  [1] cvt1
    @ ./essentials.jl:322 [inlined]
  [2] ntuple
    @ ./ntuple.jl:49 [inlined]
  [3] convert(#unused#::Type{Tuple{OutlierDetectionInterface.DetectorModel, AbstractVector{var"#s2"} where var"#s2"<:Real}}, x::Tuple{Tuple{LIBSVM.SVM{Bool, LIBSVM.Kernel.KERNEL}, Int64}, Vector{Float64}})                                                      
    @ Base ./essentials.jl:323
  [4] to_fitresult(mach::Machine{MLJLIBSVMInterface.OneClassSVM, true})
    @ OutlierDetection ~/.julia/packages/OutlierDetection/PyXhg/src/mlj_helpers.jl:87
  [5] augmented_transform(mach::Machine{MLJLIBSVMInterface.OneClassSVM, true}, X::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true})
    @ OutlierDetection ~/.julia/packages/OutlierDetection/PyXhg/src/mlj_helpers.jl:146
  [6] _apply(y_plus::Tuple{Node{Machine{MLJLIBSVMInterface.OneClassSVM, true}}, Machine{MLJLIBSVMInterface.OneClassSVM, true}}, input::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})                   
    @ MLJBase ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:152
  [7] _apply
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:146 [inlined]                                                                                  
  [8] Node
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:141 [inlined]                                                                                  
  [9] #61
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:149 [inlined]                                                                                  
 [10] map(f::MLJBase.var"#61#62"{Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Tuple{SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true}}}, t::Tuple{Node{Machine{MLJLIBSVMInterface.OneClassSVM, true}}})                                      
    @ Base ./tuple.jl:213
 [11] _apply(y_plus::Tuple{Node{Nothing}}, input::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})                   
    @ MLJBase ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:148
 [12] _apply
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:146 [inlined]                                                                                  
 [13] Node
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:143 [inlined]                                                                                  
 [14] #61
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:149 [inlined]                                                                                  
 [15] map(f::MLJBase.var"#61#62"{Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}}, Tuple{SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true}}}, t::Tuple{Node{Nothing}})
    @ Base ./tuple.jl:213
 [16] _apply(y_plus::Tuple{Node{Nothing}}, input::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true}; kwargs::Base.Iterators.Pairs{Union{}, Union{}, Tuple{}, NamedTuple{(), Tuple{}}})                   
    @ MLJBase ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:148
 [17] _apply
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:146 [inlined]                                                                                  
 [18] Node
    @ ~/.julia/packages/MLJBase/CMT6L/src/composition/learning_networks/nodes.jl:143 [inlined]                                                                                  
 [19] return_with_scores!(network_mach::Machine{ProbabilisticUnsupervisedDetectorSurrogate, false}, model::OutlierDetection.ProbabilisticUnsupervisedCompositeDetector{(:detector,), Table{var"#s28"} where var"#s28"<:(AbstractVector{var"#s29"} where var"#s29"<:Continuous)}, verbosity::Int64, scores_train::Node{Nothing}, X::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true})
    @ OutlierDetection ~/.julia/packages/OutlierDetection/PyXhg/src/mlj_wrappers.jl:156
 [20] fit(model::OutlierDetection.ProbabilisticUnsupervisedCompositeDetector{(:detector,), Table{var"#s28"} where var"#s28"<:(AbstractVector{var"#s29"} where var"#s29"<:Continuous)}, verbosity::Int64, X::SubArray{Float64, 2, Matrix{Float64}, Tuple{Base.Slice{Base.OneTo{Int64}}, Base.Slice{Base.OneTo{Int64}}}, true})
    @ OutlierDetection ~/.julia/packages/OutlierDetection/PyXhg/src/mlj_wrappers.jl:201
 [21] fit_only!(mach::Machine{OutlierDetection.ProbabilisticUnsupervisedCompositeDetector{(:detector,), Table{var"#s28"} where var"#s28"<:(AbstractVector{var"#s29"} where var"#s29"<:Continuous)}, true}; rows::Nothing, verbosity::Int64, force::Bool)
    @ MLJBase ~/.julia/packages/MLJBase/CMT6L/src/machines.jl:592
 [22] fit_only!
    @ ~/.julia/packages/MLJBase/CMT6L/src/machines.jl:545 [inlined]
 [23] #fit!#56
    @ ~/.julia/packages/MLJBase/CMT6L/src/machines.jl:659 [inlined]
 [24] fit!
    @ ~/.julia/packages/MLJBase/CMT6L/src/machines.jl:657 [inlined]
 [25] |>(x::Machine{OutlierDetection.ProbabilisticUnsupervisedCompositeDetector{(:detector,), Table{var"#s28"} where var"#s28"<:(AbstractVector{var"#s29"} where var"#s29"<:Continuous)}, true}, f::typeof(fit!))
    @ Base ./operators.jl:858
 [26] top-level scope
    @ REPL[151]:1

davnn commented 2 years ago

The problem was an assumption that the models returned from fit are subtypes of DetectorModel, which in retrospect is unnecessary, so I changed it accordingly. I also added tests covering all features with MMI-based detectors to catch such problems in the future.
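
The failure mode can be reproduced in plain Julia. The sketch below uses hypothetical stand-ins (a locally defined DetectorModel abstract type, a NativeModel, and a made-up foreign fit result) rather than the actual OutlierDetection.jl internals. It only illustrates why converting a fit result to a tuple type that demands a DetectorModel throws the MethodError seen in the trace, and why relaxing that element type avoids it; the relaxed type shown is an assumption for illustration, not necessarily the change made in 0.2.6.

```julia
# Hypothetical stand-ins, defined locally for illustration only.
abstract type DetectorModel end
struct NativeModel <: DetectorModel end

# Tuple type that insists the model is a DetectorModel (mirrors the type in the trace).
const StrictResult = Tuple{DetectorModel, AbstractVector{<:Real}}
# Relaxed variant that accepts any model object.
const RelaxedResult = Tuple{Any, AbstractVector{<:Real}}

native  = (NativeModel(), [0.1, 0.7])
foreign = (("opaque-svm-handle", 1), [0.1, 0.7])  # e.g. a tuple produced by another package

# Returns true if converting x to T throws a MethodError, false if it succeeds.
function conversion_fails(T, x)
    try
        convert(T, x)
        return false
    catch err
        return err isa MethodError
    end
end

strict_native   = conversion_fails(StrictResult, native)    # false: model is a DetectorModel
strict_foreign  = conversion_fails(StrictResult, foreign)   # true: a Tuple is not a DetectorModel
relaxed_foreign = conversion_fails(RelaxedResult, foreign)  # false: any model object is accepted
```

The strict conversion fails element-wise: Julia tries to convert the first tuple element to DetectorModel, finds no applicable method, and raises the same kind of error reported above.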

davnn commented 2 years ago

The problem is fixed in 0.2.6, but TagBot is currently failing because of an issue with mkdocs. I will retrigger tagging once https://github.com/mkdocs/mkdocs/issues/2799 is fixed.

davnn commented 2 years ago

https://github.com/OutlierDetectionJL/OutlierDetection.jl/releases/tag/v0.2.6 is now available.

ablaom commented 2 years ago

Thanks for the rapid response @davnn. I can confirm that the issue is resolved at my end.