rikhuijzer / SIRUS.jl

Interpretable Machine Learning via Rule Extraction
https://sirus.jl.huijzer.xyz/
MIT License

"Zero rules" error #85

Open ericphanson opened 5 months ago

ericphanson commented 5 months ago
┌ Error: Failed to apply the operation `predict` to the machine machine(:stable_rules_classifier, …), which receives it's data arguments from one or more nodes in a learning network. Possibly, one of these nodes is delivering data that is incompatible with the machine's model.
│ Model (stable_rules_classifier):
│ input_scitype = Unknown
│ target_scitype =Unknown
│ output_scitype =Unknown
│ 
│ Incoming data:
│ arg of predict        scitype
│ -------------------------------------------
│ Node @972 → :standardizer     Table{AbstractVector{Continuous}}
│ 
│ Learning network sources:
│ source        scitype
│ -------------------------------------------
│ Source @363   Table{AbstractVector{Continuous}}
│ Source @476   AbstractVector{OrderedFactor{2}}
│ Source @785   AbstractVector{Continuous}
└ @ MLJBase ~/.julia/packages/MLJBase/iIhiI/src/composition/learning_networks/nodes.jl:155
ERROR: LoadError: Zero rules
Stacktrace:
  [1] error(s::String)
    @ Base ./error.jl:35
  [2] predict(model::SIRUS.MLJImplementation.StableRulesClassifier, fitresult::StableRules{Float64}, Xnew::DataFrame)
    @ SIRUS.MLJImplementation ~/.julia/packages/SIRUS/6Paa4/src/mlj.jl:259
  [3] predict(mach::Machine{Symbol, true}, Xraw::DataFrame)
    @ MLJBase ~/.julia/packages/MLJBase/iIhiI/src/operations.jl:133
  [4] _apply(y_plus::Tuple{Node{Machine{Symbol, true}}, Machine{Symbol, true}}, input::DataFrame; kwargs::@Kwargs{})
    @ MLJBase ~/.julia/packages/MLJBase/iIhiI/src/composition/learning_networks/nodes.jl:151

I frequently hit this error when using SIRUS.jl. I can often work around it by changing the weights to affect class balance, by subsampling, or by changing lambda, but it is confusing: I don't know exactly what causes it or how to avoid it reliably.

rikhuijzer commented 5 months ago

Thanks for trying out SIRUS.jl and opening an issue, @ericphanson.

It does sound like a weird issue. The error comes from:

https://github.com/rikhuijzer/SIRUS.jl/blob/ce05c1d81df8adffd70aae4974962c766947b08d/src/mlj.jl#L254-L261

So apparently the model fitted zero rules during at least one of the training runs. That sounds like a bug somewhere, because I would expect the model to fit at least some rules on any dataset. Is this happening as part of a cross-validation run? Can you give a bit more information about the data size and what type of predictors you have?

ericphanson commented 5 months ago

Yeah, I'm doing 5-fold CV. I have 8 numeric features and ~150k rows, in a binary classification problem.