JuliaAI / MLJModelInterface.jl

Lightweight package to interface with MLJ
MIT License
37 stars 8 forks source link

Interface broken #171

Closed juliohm closed 1 year ago

juliohm commented 1 year ago

Can you please clarify what changed that broke the fit interface?

julia> MMI.fit(tree, 0, X, y)
ERROR: MethodError: no method matching fit(::MLJDecisionTreeInterface.DecisionTreeClassifier, ::Int64, ::TableTransforms.TableSelection{TypedTables.Table{NamedTuple{(:X, :Y, :Z), Tuple{Float64, Float64, CategoricalValue{Int64, UInt32}}}, 1, NamedTuple{(:X, :Y, :Z), Tuple{Vector{Float64}, Vector{Float64}, CategoricalVector{Int64, UInt32, Int64, CategoricalValue{Int64, UInt32}, Union{}}}}}, NamedTuple{(:X, :Y, :Z), Tuple{Vector{Float64}, Vector{Float64}, CategoricalVector{Int64, UInt32, Int64, CategoricalValue{Int64, UInt32}, Union{}}}}}, ::CategoricalVector{Int64, UInt32, Int64, CategoricalValue{Int64, UInt32}, Union{}})
Closest candidates are:
  fit(::MLJDecisionTreeInterface.DecisionTreeClassifier, ::Int64, ::Any, ::Any, ::Any, ::Any) at ~/.julia/packages/MLJDecisionTreeInterface/OtrDc/src/MLJDecisionTreeInterface.jl:45
  fit(::Supervised, ::Any, ::Any, ::Any, ::Any) at ~/.julia/packages/MLJModelInterface/gw4ML/src/model_api.jl:16
  fit(::NetworkComposite, ::Any, ::Any...) at ~/.julia/packages/MLJBase/97P9U/src/composition/models/network_composite.jl:23
  ...

This code was working fine in previous releases where X is a table and y is a vector.

ablaom commented 1 year ago

Thanks for reporting @juliohm.

Yes, MLJDecisionTreeInterface.jl 0.3.1 should have been tagged breaking. The issue was previously reported here. We are working on 0.4 and will be yanking 0.3.1. But you should be able to make a backwards compatible fix in any case.

What changed is that trees implemented the MLJModelInterface data-front end.

Adding the model-specific pre-processor reformat to your fit/predict calls

If you are not already using reformat in your fit and predict calls then, where you previously made a call

MMI.fit(model, verbosity, data...)    # MMI = MLJModelInterface

you instead want

MMI.fit(model, verbosity, MMI.reformat(model, data...)...)

And instead of

MMI.predict(model, fitresult, Xnew)

you want

MMI.predict(model, fitresult, MMI.reformat(model, Xnew)...)

You have backwards compatibility because the fallback for reformat just slurps the data.

Subsampling

If you subsample reformatted data before passing to fit or predict, you should always use subsampled_data = selectrows(model, I, reformatted_data...) where I is the indices for subsampling.

Length of data issue

If you already includereformat in your fit/predict calls you should provide more detail to help me diagnose the problem. In the case cited above the issue was an assumption about the number of arguments output by reformat and the fix is here which you may want to check out in any case.

ablaom commented 1 year ago

Closing in favor of https://github.com/JuliaAI/MLJDecisionTreeInterface.jl/issues/51 but @juliohm feel free to continue discussion in either place.