Open ablaom opened 3 years ago
cc @boliu-christine
Here's an update on my suggestion for the format of feature importances, as returned by the proposed method `feature_importances(model, report)`.
I think allowing models to expose multiple types of feature importance is overkill / excessively complicated. Of course multiple scores can still be declared in the report itself.
So I suggest a vector of `name => float` pairs, where `name` is a symbol:

`v = [:gender => 0.23, :height => 0.15, :weight => 0.1]`
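For illustration, here is a quick sketch of how a consumer could work with this format (the feature names and scores are made up):

```julia
# The proposed format: a plain vector of feature_name => importance
# pairs, with Symbol keys and Float64 values.
importances = [:gender => 0.23, :height => 0.15, :weight => 0.10]

# Being an ordinary vector of pairs, it sorts naturally by score:
ranked = sort(importances, by=last, rev=true)
first(ranked)   # :gender => 0.23

# ...and converts to a Dict for lookup by feature name:
lookup = Dict(importances)
lookup[:height] # 0.15
```

One attraction of this format over a `Dict` is that it preserves an ordering, while still converting cheaply to a `Dict` when lookup by name is wanted.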
What is the current state of this? I need feature importance support!
I'm still working on this. It will be done soon.
What is the current state of this? @OkonSamuel
The MLJ model API only says that models reporting feature importances should expose them in the `report` output of `fit`. But it says nothing about the actual format of this output, and I can see inconsistencies in the implementations. Feature importances are used by some meta-algorithms, such as RecursiveFeatureElimination (#426), so this might be worth sorting out.

I propose adding a new method `feature_importances(model::Model, report)` to the model API to report the scores, according to some fixed convention. ~~Some models (e.g., LightGBM models) report multiple types of importance scores. So I propose this method return a named tuple keyed on the type, whose values are `Float64` vectors.~~ **edit** See suggestion for format below.
**edit** The proposal follows the same interface pattern that we already have for `training_losses`.

Thoughts anyone?
TODO:

- Add a `reports_feature_importances` trait to StatisticalTraits, defaulting to `false`.
- Add a `feature_importances(model, report)` stub to MLJModelInterface (in model_api.jl); fallback to return `nothing`.
- Add `MMI.feature_importances(mach::Machine)`, following this pattern for `report`, and implement the above method and trait. See https://github.com/JuliaAI/MLJScikitLearnInterface.jl/issues/30 and https://github.com/JuliaAI/MLJScikitLearnInterface.jl/issues/26