h2oai / wave-apps

Sample AI Apps built with H2O Wave.
MIT License

feat: churn-risk redesign #48

Closed mturoci closed 3 years ago

mturoci commented 3 years ago

Churn risk app redesign

Previous:

https://user-images.githubusercontent.com/64769322/112501274-633e5f00-8d89-11eb-852e-e37ad2e240a0.mov

New:

https://user-images.githubusercontent.com/64769322/112501331-6fc2b780-8d89-11eb-90da-96196ec4d171.mov

This PR:

Thanks @mtanco for DS support!

Future work:

Open questions:

mtanco commented 3 years ago

Visually this looks beautiful! I haven't had a chance to go through the code yet, but as an idea, you could show the following on the "home page" before someone chooses a Phone Number (I don't know that it is needed, but it is common to do Global vs. Local explanations):

  1. Shap Explanation -> Global Shap
    # Mean contribution per feature, sorted ascending
    mean_contribs = contributions_df.mean(axis=0)
    shap = sorted(mean_contribs.items(), key=lambda e: e[1])
    display(shap)
  2. Churn Rate -> Average Churn Prediction
  3. Total Charges -> Average Total Charges
  4. Total Charges Breakdown -> Average Charges Breakdown
  5. and 6. Feature Contribution to Retention & Churn -> Global PDP
    # Choose the min (shap[0]) or max (shap[-1]) column from Global Shapley
    col = shap[-1][0]
    pdp = model.partial_plot(
        df,
        cols=[col],
        plot=True,  # change to False, just for debugging
        nbins=20 if not df[col].isfactor()[0] else 1 + df[col].nlevels()[0],
    )
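
To make item 1 concrete, here is a minimal, self-contained sketch of the global Shapley ranking. It assumes `contributions_df` is a pandas DataFrame with one row per customer and one column per feature; the `global_shap` helper, the toy values, and the handling of a `BiasTerm` column are illustrative assumptions, not code from this PR.

```python
import pandas as pd


def global_shap(contributions_df: pd.DataFrame, bias_col: str = "BiasTerm"):
    """Return (feature, mean contribution) pairs sorted ascending.

    Assumption: a bias column, if present, is not a feature and is dropped.
    """
    features = contributions_df.drop(columns=[bias_col], errors="ignore")
    means = features.mean(axis=0).sort_values()
    return [(name, float(value)) for name, value in means.items()]


# Hypothetical toy contributions, only for illustration.
toy = pd.DataFrame({
    "Total Charges": [0.25, 0.75, 0.5],
    "Day Mins": [-0.5, -0.25, -0.75],
    "BiasTerm": [0.125, 0.125, 0.125],
})

ranked = global_shap(toy)
top_retention = ranked[0][0]   # most negative mean contribution
top_churn = ranked[-1][0]      # most positive mean contribution
```

Either end of `ranked` could then be passed as `col` to the partial dependence plot in items 5 and 6.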

mturoci commented 3 years ago

Added a dark theme as well:

https://user-images.githubusercontent.com/64769322/112615508-da740180-8e22-11eb-840b-d9ee90c8870f.mov

geomodular commented 3 years ago

After adding the theme switcher, the input box stands out more. I think it's better.

mturoci commented 3 years ago

Thanks for the valuable review @geomodular! Comments addressed.

mturoci commented 3 years ago

Sounds good @mtanco! Will also add a global explanation.

mturoci commented 3 years ago

Comments addressed except for the last one (waiting for a workaround).

mtanco commented 3 years ago

@mturoci I added a check on the column's data type and, if the data is numeric, made a histogram using the bins from the first plot. This should fix the grouping problem caused by the original pseudocode I sent you :)

Before:

Screen Shot 2021-03-31 at 2 36 59 PM

After:

Screen Shot 2021-03-31 at 3 03 56 PM
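
The actual change is not quoted in this thread, so the following is only a rough sketch of the idea described above: check the column's dtype and, for numeric data, build the histogram on bin edges taken from the first plot so both charts group rows the same way. The function name `distribution_for_plot`, the `bin_edges` parameter, and the fallback of 20 bins are assumptions made for illustration.

```python
import numpy as np
import pandas as pd


def distribution_for_plot(values: pd.Series, bin_edges=None):
    """Return (labels, counts) for a feature-distribution chart."""
    if pd.api.types.is_numeric_dtype(values):
        # Numeric column: reuse the first plot's edges when they are given,
        # otherwise fall back to 20 equal-width bins (assumed default).
        bins = bin_edges if bin_edges is not None else 20
        counts, edges = np.histogram(values.dropna(), bins=bins)
        labels = [
            f"{float(lo):g}-{float(hi):g}"
            for lo, hi in zip(edges[:-1], edges[1:])
        ]
        return labels, counts.tolist()
    # Non-numeric column: one bar per level, no binning needed.
    level_counts = values.value_counts().sort_index()
    return level_counts.index.tolist(), level_counts.tolist()


# Hypothetical usage with made-up values and edges.
num_labels, num_counts = distribution_for_plot(
    pd.Series([1, 2, 3, 10]), bin_edges=[0, 5, 10]
)
cat_labels, cat_counts = distribution_for_plot(pd.Series(["yes", "no", "yes"]))
```

Passing the same edges to both charts is what keeps the bars aligned; letting each chart pick its own bins is one way the grouping mismatch could arise.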

mturoci commented 3 years ago

Thanks @mtanco!