TeamHG-Memex / eli5

A library for debugging/inspecting machine learning classifiers and explaining their predictions
http://eli5.readthedocs.io
MIT License

explain_weights under cross validation #198

Open jnothman opened 7 years ago

jnothman commented 7 years ago

I think it would be useful to have a tool which identified the most important features for a series of models trained on different data subsets. This is hard when feature extraction or transformation occurs, as it is no longer easy to identify which features are heavily involved. But in the simple case of a series of feature_importances_ or coef_s, we should have a tool to combine them in one or a few ways and report overall importance.
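For the simple case, something along these lines would do (a minimal sketch in plain scikit-learn/NumPy, not an eli5 API; the model and CV scheme here are arbitrary choices):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold

X, y = load_iris(return_X_y=True)

# One model per CV training subset; collect each fold's feature_importances_.
importances = []
for train_idx, _ in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = RandomForestClassifier(random_state=0).fit(X[train_idx], y[train_idx])
    importances.append(model.feature_importances_)
importances = np.array(importances)  # shape (n_folds, n_features)

# Report an overall importance plus an uncertainty per feature.
mean, std = importances.mean(axis=0), importances.std(axis=0)
for i in np.argsort(mean)[::-1]:
    print(f"feature {i}: {mean[i]:.3f} +/- {std[i]:.3f}")
```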

kmike commented 7 years ago

A built-in way to aggregate Explanation objects sounds like a good idea. I'm not sure I like the idea of providing cross-validation utilities in eli5 though. How do you see this feature? Would a function to get a single Explanation from multiple Explanations work for you?

I guess DataFrame support (https://github.com/TeamHG-Memex/eli5/issues/196) could also make the problem a bit easier.

jnothman commented 7 years ago

I mean: given a list of models trained on the same set of features, assign each feature a weight (and perhaps an uncertainty) to be reported, computed perhaps with an L2 norm, perhaps L1, perhaps L0 or Linf, or something else entirely.
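To make that concrete, a toy sketch of those aggregation norms applied across per-fold coefficients (the numbers are invented, and nothing here is eli5-specific):

```python
import numpy as np

# Invented per-fold coefficients, shape (n_folds, n_features).
coefs = np.array([[0.8, -0.1, 0.0],
                  [0.6, -0.3, 0.1],
                  [0.9,  0.0, 0.0]])

norms = {
    "L2":   lambda W: np.linalg.norm(W, ord=2, axis=0),
    "L1":   lambda W: np.abs(W).sum(axis=0),
    "L0":   lambda W: np.count_nonzero(W, axis=0),  # folds where the feature is used at all
    "Linf": lambda W: np.abs(W).max(axis=0),
}
for name, norm in norms.items():
    print(name, norm(coefs))  # one aggregate weight per feature
```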

jnothman commented 7 years ago

And by something else entirely, I suppose I mean incorporating rank rather than value.
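A rank-based variant might look like this (one reading of the idea, sketched with scipy.stats.rankdata and invented weights):

```python
import numpy as np
from scipy.stats import rankdata

# Invented per-fold absolute weights, shape (n_folds, n_features).
coefs = np.array([[0.8, 0.1, 0.0],
                  [0.6, 0.3, 0.1],
                  [0.9, 0.0, 0.0]])

# Rank features within each fold (1 = smallest |weight|), then average the
# ranks across folds, so the report reflects rank rather than raw value.
ranks = rankdata(np.abs(coefs), axis=1)
print(ranks.mean(axis=0))
```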

rth commented 5 years ago

Visualizing the coefficients obtained in a cross-validation to evaluate their stability, as done in http://gael-varoquaux.info/interpreting_ml_tuto/content/02_why/01_interpreting_linear_models.html#stability-to-gauge-significance, could also be quite useful.
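For reference, a sketch of that kind of stability plot using scikit-learn's cross_validate with return_estimator=True (the dataset and model here are placeholder choices):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Refit on each fold and keep the fitted estimators around.
cv = cross_validate(
    make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    X, y, cv=5, return_estimator=True,
)
coefs = np.array([est[-1].coef_.ravel() for est in cv["estimator"]])

# One box per feature: the spread across folds gauges coefficient stability.
plt.boxplot(coefs, vert=False)
plt.xlabel("coefficient value across CV folds")
plt.show()
```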