dsgibbons / shap

A game theoretic approach to explain the output of any machine learning model.
https://shap-community.readthedocs.io/en/latest/
MIT License
25 stars 5 forks source link

add documentation notebook violin plot #97

Closed adnene-guessoum closed 1 year ago

adnene-guessoum commented 1 year ago

Related to Issue https://github.com/dsgibbons/shap/issues/55, I was looking for some low hanging fruits to start contributing and found this good first issue for an API example notebook related to violin plots.

I followed the example from the original package ( https://shap.readthedocs.io/en/latest/example_notebooks/tabular_examples/tree_based_models/Scatter%20Density%20vs.%20Violin%20Plot%20Comparison.html?highlight=summary%20plot#Layered-violin-plot ) and the beeswarm general template.

Hope it's fine.

adnene-guessoum commented 1 year ago

Thank you very much for the review. I will make the requested changes and resubmit.

adnene-guessoum commented 1 year ago

So it turns out my low hanging fruit had a worm in it, and now i am afraid to bite...

If i may ask, so that I understand fully. As of now, the violin plot does not accept Explanation objects, right ?

1) It appears to me that an Explanation object doesn't have expected_values :

https://github.com/dsgibbons/shap/blob/0821d2e1ea073562da7763236fe34c514e0bc0e8/shap/plots/_violin.py#LL61C7-L61C7

see here :

https://github.com/dsgibbons/shap/blob/0821d2e1ea073562da7763236fe34c514e0bc0e8/shap/plots/_violin.py#LL61C7-L61C7

but the base_values can do the job.

2) However, after this change, i get a matplotlib warning regarding unused vmin / vmax ( i don't know what their purpose is, as there is no "c" parameter needed here, i feel ):

https://github.com/dsgibbons/shap/blob/0821d2e1ea073562da7763236fe34c514e0bc0e8/shap/plots/_violin.py#L335

removing them seem to be okay, but i don't want to presume.

3) After these changes however, it is also no longer possible to set colors from the top parameter. It may not have been possible from the start with Explanation objects. At this point, i don't want to touch too much as you seem to have a much clearer view of what you want to do with these violin plots.

My proposal would be to correct the other points you reviewed in this PR and to leave the passing of numpy arrays for now in the doc. Then, when you definitely deprecate them, or if i can get a grasp on how to make Explanation objects work, we can change the doc. Would that be ok ?

thatlittleboy commented 1 year ago

Hi @adnene-guessoum, apologies for making you go down this rabbit hole. Sure, let's leave it as numpy arrays for now, perhaps the support for Explanation in shap.plots.violin is not complete yet. I'll follow up on this later on.

adnene-guessoum commented 1 year ago

so i made the changes, but i see there are some movements to an organisation repo if i got it right. should i close this and retry when the merge is done ?

thatlittleboy commented 1 year ago

Nope, we're good. We can merge this here, and we'll take care of porting this change over to the main repo.

Thanks for the contribution @adnene-guessoum !

adnene-guessoum commented 1 year ago

Thank you for the helpful review and the welcoming of new contributors.