JuliaAI / MLJ.jl

A Julia machine learning framework
https://juliaai.github.io/MLJ.jl/

Improved feature importance support #747

Open ablaom opened 3 years ago

ablaom commented 3 years ago

The MLJ model API only says that models reporting feature importances should report them in the report output by fit. It says nothing about the actual format of this output, and I can see inconsistencies among the implementations. Feature importances are used by some meta-algorithms, such as RecursiveFeatureElimination (#426), so this might be worth sorting out.

I propose adding a new method feature_importance(model::Model, report) to the model API to report the scores, according to some fixed convention. ~~Some models (e.g., LightGBM models) report multiple types of importance scores. So I propose this method return a named tuple keyed on the type, whose values are Float64 vectors.~~

edit: See the suggested format below.

edit: The proposal follows the same interface pattern that we already have for training_losses.
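
For concreteness, here is a minimal sketch of what such an accessor could look like for a hypothetical model type. MyForest, its field, and the report key are invented for illustration and are not part of the current API:

```julia
import MLJModelInterface as MMI

# Hypothetical model type, for illustration only:
mutable struct MyForest <: MMI.Deterministic
    n_trees::Int
end

# Proposed accessor, mirroring the existing training_losses pattern:
# given the model and the report returned by fit, look up the scores
# under a fixed key (the key name is an assumption of this sketch):
feature_importances(::MyForest, report) = report.feature_importances
```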

Thoughts anyone?

TODO:

ablaom commented 3 years ago

cc @boliu-christine

ablaom commented 2 years ago

Here's an update on my suggestion for the format of feature importances, as returned by the proposed method feature_importances(model, report).

I think allowing models to expose multiple types of feature importance is overkill / excessively complicated. Of course multiple scores can still be declared in the report itself.

So I suggest a vector of name => float pairs, where name is a symbol:

v = [:gender => 0.23, :height => 0.45, :weight => 0.1]
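
Given that convention, a meta-algorithm such as recursive feature elimination could consume the scores generically. A rough sketch, where the helper name and the numeric values are invented for illustration:

```julia
# Importances as a vector of name => score pairs, as above:
v = [:gender => 0.23, :height => 0.45, :weight => 0.1]

# A consumer (e.g. a feature-selection meta-algorithm) can treat the
# pairs generically: sort by score and keep the k most important names.
function top_features(importances::AbstractVector{<:Pair{Symbol,<:Real}}, k::Integer)
    sorted = sort(importances; by=last, rev=true)
    return first.(sorted[1:min(k, length(sorted))])
end

top_features(v, 2)  # returns [:height, :gender]
```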
zsz00 commented 2 years ago

What is the current state of this? I need feature importance support!

OkonSamuel commented 2 years ago

> What is the current state of this? I need feature importance support!

I'm still working on this. It will be done soon.

zsz00 commented 2 years ago

What is the current state of this? @OkonSamuel