Issues with predictions

bufferapp / convertice

Trial Conversion Forecasting Model

MIT License


Issues with predictions #7

Closed davidgasquez closed 4 years ago

davidgasquez commented 4 years ago

Hey there! I'm starting this issue to gather a list of issues around the data and the predictions we're doing with Convertice.

We're grabbing customers that score more than 0.8 with the current model. We care about the predictions in these places:

Customer.io Segment. Since we're sending only the event, we don't have emails for some of these customers. There are 76 customers at this time.
Mixpanel Report. We also have 76 profiles at this time. Feel free to open and check any profile! It seems quite a lot of the predictions in this range (0.8, 1) are affected by some issue.

Exploring Mixpanel, these seem to be the biggest issues:

Customer starts a trial but then downgrades before finishing or doesn't pay the subscription.
- 5d5fa6fcab53874d281a6737. Looking at Stripe, it seems this customer has a few failed payments. Should it count as a conversion since the subscription is already there, or should we wait until payment? If we go with the second option, the model is working properly.
- 5d763d885528736c7f0bd946. Similar but in this case, the customer paid the invoice on Oct 23.
- 5ca1d92569ec79395e8549a0. Stripe.
Customers do a trial while using the product with other subscriptions.
- 558947cd9f509ba7296f795f. The trial is still going on (Stripe), but since the customer was previously on another plan, the model might not be as helpful.
Customers have already paid and subscribed.
- 5ca1da0983663e38f39fde56. Looking at Stripe, it should have been marked as converted.
- 5ca1d88669ec79395e83b2db. Stripe.

These are only a few examples of the entire Mixpanel report. I think some of these issues might be easy to solve with a few filters. That said, I can see how some of these might be really tricky to fix without using Segment data and that will slow down the delivery.

Would love to hear your thoughts @michael-erasmus!

jwinternheimer commented 4 years ago

Hey @davidgasquez, thanks so much for documenting these issues! 💥

Issue 1: Customer downgrades or has failed payment

5d5fa6fcab53874d281a6737: This is a weird case. In Stripe it looks like this customer converted. He's currently on a small business plan. Indeed, in the dbt_buffer.stripe_trials model, this customer is considered converted. I'm not sure why converted is FALSE in the predict_publish_trial_conversion_prediction table. 😕
- 5d763d885528736c7f0bd946: Similar to the last user. This user is converted in the stripe_trials model, however converted = FALSE in the predict_publish_trial_conversion_prediction. Do you know why that might be?

Issue 2: Customers starting multiple trials for multiple plans and products

558947cd9f509ba7296f795f: This user started a Pro trial, a Business trial, and an Analyze trial on the same day. She's been very active in the Publish trial. What is the problem here again?

Issue 3: Customers have already paid and subscribed

5ca1da0983663e38f39fde56: This is the same user that is addressed in issue 1. The user is converted in the stripe_trials model but not in the prediction dataset for some reason.
5ca1d88669ec79395e83b2db: This user is also converted in the stripe_trials model. Maybe they converted after the prediction was made? In that case I think the model worked properly. What do you think?

Thanks again @davidgasquez! It was super helpful to go through these cases. Let me know what you think we should do to improve the model and predictions! I still think we should create segments like very likely, likely, not likely, etc 😄

davidgasquez commented 4 years ago

Heya @jwinternheimer! Thanks so much for taking a look at the issues. :raised_hands:

This user is converted in the stripe_trials model, however converted = FALSE in the predict_publish_trial_conversion_prediction. Do you know why that might be?

I think I know what's going on! Digging into the dbt query, the converted column comes from the trials model and we're not modifying it. That said, we're using a materialized table, so we need to run the predictions right after the dbt model has updated that base table. :zap:

They all might be related to the same underlying issue then. Tomorrow, I'll run the dbt model and then the predictions. This way we'll have up-to-date converted values.
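
A minimal sketch of that ordering check, assuming we can read two timestamps: when dbt last rebuilt the base table and when the predictions last ran. The function name and the example dates are hypothetical and not part of Convertice.

```python
from datetime import datetime


def predictions_are_stale(base_table_updated_at: datetime,
                          predictions_run_at: datetime) -> bool:
    """Return True when the base table was rebuilt after the last prediction run.

    In that case the `converted` values stored with the predictions may be
    out of date, and the predictions should be run again.
    """
    return predictions_run_at < base_table_updated_at


# Hypothetical timestamps, for illustration only.
dbt_refresh = datetime(2019, 10, 24, 9, 0)
last_prediction_run = datetime(2019, 10, 23, 18, 0)

if predictions_are_stale(dbt_refresh, last_prediction_run):
    print("Base table changed since the last run: re-run the predictions.")
```

The check only tells us that a re-run is needed; it assumes the dbt refresh itself has already happened.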

Regarding Issue 2, the main problem, which I now think is not a big deal, is that users who start two trials close to each other will have similar predictions for each one. If the user starts trialing a plan and one week later another plan, we'll be using the combined features of trialing two plans (more posts, profiles, ...). As I said, I don't think there should be many of these, and we're still sending two prediction events, as we can see with 5ca1d5de7ace10391ae5dc5c.
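
To get a feel for how many users fall into this case, here is a rough sketch that flags users with two trial starts close together. The input shape (pairs of user id and trial start date) and the 7-day window are assumptions for illustration, not the actual schema.

```python
from collections import defaultdict
from datetime import date, timedelta


def users_with_close_trials(trials, window_days=7):
    """Return the set of user ids that started two trials within `window_days`.

    `trials` is an iterable of (user_id, trial_start) pairs, where
    trial_start is a datetime.date. Same-day trials count as close.
    """
    starts_by_user = defaultdict(list)
    for user_id, trial_start in trials:
        starts_by_user[user_id].append(trial_start)

    window = timedelta(days=window_days)
    flagged = set()
    for user_id, starts in starts_by_user.items():
        starts.sort()
        # After sorting, comparing consecutive starts is enough.
        for earlier, later in zip(starts, starts[1:]):
            if later - earlier <= window:
                flagged.add(user_id)
                break
    return flagged


# Hypothetical data, for illustration only.
example_trials = [
    ("user_a", date(2019, 10, 1)),
    ("user_a", date(2019, 10, 5)),
    ("user_b", date(2019, 10, 1)),
    ("user_b", date(2019, 11, 20)),
    ("user_c", date(2019, 10, 1)),
]
print(users_with_close_trials(example_trials))
```

If the flagged set turns out to be small, that would support treating this as a minor issue.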

I still think we should create segments like very likely, likely, not likely, etc 😄

I think we can start creating these segments in Mixpanel now! Or were you talking about sending the segment with the prediction?
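
Whichever place the segments end up living, the bucketing itself could look roughly like this. The 0.8 cut-off matches the threshold used above for the current report; the 0.5 cut-off and the function name are assumptions for illustration.

```python
def likelihood_segment(score: float) -> str:
    """Map a conversion score in [0, 1] to a coarse likelihood segment."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0 and 1, got {score}")
    if score > 0.8:  # same threshold as the current Mixpanel report
        return "very likely"
    if score > 0.5:  # hypothetical cut-off
        return "likely"
    return "not likely"


for score in (0.93, 0.65, 0.12):
    print(score, likelihood_segment(score))
```

Sending the segment name along with the prediction event would keep the thresholds in one place, while building the segments in Mixpanel keeps them easy to adjust.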

Thanks again for taking the time to explore and dig with me. It was really helpful. I'll check back tomorrow with the new predictions and if I spot some oddities, I'll post a new list!

davidgasquez commented 4 years ago

Just ran the dbt model and the AutoML model! We've got a new set of customer predictions (Mixpanel report). These are the couple of odd things I've noticed:

Customers manually cancel the subscription after trial end. For example, customer 5c79848506b2c0135023ac42 seems to have cancelled the subscription manually. What should we do with these predictions? Something similar is happening with 5ca1d8691d99bc38dbfad63e. It seems that the subscription will be cancelled soon.
The trial ends but the following subscription has past due payments. Customer 5ca1d3621d99bc38dbf0aea2 has a few past due payments. The same is happening with a few others like 5ca1d3621d99bc38dbf0aea2 (Stripe).

That said, with this batch I feel much better about the model outcome. The predictions are mostly great! Most of the issues seem to be related to the way we handle Past Due and Cancelled Subscriptions. If you want to take a peek at the Mixpanel report to check a few random predictions and see if I'm missing something, that would be awesome!

jwinternheimer commented 4 years ago

💥 Nice! Thanks @davidgasquez, I'm taking a look now. 👀

Customers manually cancel the subscription after trial end: That customer's trials are not considered converted in the stripe_trials model. If the subscription was cancelled after the prediction was made, I don't think we should have to go back and change the prediction. I'm not sure if I fully understand the issue 😅
The trial ends but the following subscription has past due payments: similarly for this user, the trials aren't considered converted in the stripe_trials model or the prediction dataset. I'm not exactly sure what the issue is here either. This user made use of the trial and had an existing payment option, but unfortunately the payments failed 😢

Thanks again for being so thorough and sharing the examples @davidgasquez! They're super informative and fun to go through! Happy to look at any more you come across

davidgasquez commented 4 years ago

Thanks again @jwinternheimer!

That customer's trials are not considered converted in the stripe_trials model. If the subscription was cancelled after the prediction was made, I don't think we should have to go back and change the prediction. I'm not sure if I fully understand the issue

Right! Although the customer cancelled the subscription before us doing the predictions. That might be solved by filtering for the users that cancelled subscriptions after trial end.

I'm not exactly sure what the issue is here either. This user made use of the trial and had an existing payment option, but unfortunately the payments failed 😢

The issue there is not with any model but more along the lines of: Are we reaching out to people that have past due payments?

I think we're starting to move from technical issues to logical issues when applying the model and that's good. :smile:

jwinternheimer commented 4 years ago

Although the customer cancelled the subscription before us doing the predictions.

Ah ok, I see! Yeah, we could make a tweak to the training data to check the subscription's current status and make sure that it's "trialing". That would exclude users that convert early as well. :)
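
A small sketch of that tweak, assuming each record carries the subscription's current Stripe status. The record shape (`user_id`, `status`) is hypothetical; the status strings are the values Stripe uses for subscriptions.

```python
def still_trialing(subscriptions):
    """Keep only the records whose subscription is currently in trial.

    This drops cancelled and past due subscriptions, and it also drops
    users who converted early (their status is already "active").
    """
    return [sub for sub in subscriptions if sub.get("status") == "trialing"]


# Hypothetical records, for illustration only.
example_subscriptions = [
    {"user_id": "user_a", "status": "trialing"},
    {"user_id": "user_b", "status": "active"},    # converted early
    {"user_id": "user_c", "status": "canceled"},  # cancelled manually
    {"user_id": "user_d", "status": "past_due"},  # failed payments
]
print([sub["user_id"] for sub in still_trialing(example_subscriptions)])
```

The same status field could drive a separate list for past due customers if we decide to reach out to them.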

Are we reaching out to people that have past due payments?

Oh I see. That's a great question. I'm not sure if we are, but we definitely should!

jwinternheimer commented 4 years ago

I think this should help a little 😄 https://github.com/bufferapp/buffer-dbt/pull/54