andrewwuan / PredictionIO-Churn-Prediction-H2O-Sparkling-Water

PredictionIO Engine integrated with Sparkling Water. Open Source project Spring 2015 @CMU.

Data - interpretation #4

Closed phiripatrick663 closed 9 years ago

phiripatrick663 commented 9 years ago

Hi Andrew, I have had some time to look through the code and data, and I need a little explanation to help me understand how to achieve the expected result. I understand wanting to identify customers who may jump ship, but looking at this data I don't see how the code is supposed to achieve that. I don't see customers in the data file.

When I run query.py I get the following output:

{u'p': u'0.11381198094113941'}

and I can't explain what it means.

Can you share a bit more insight into your intention and how the code achieves it? My interest is not in the churn application per se, but rather in the successful integration of H2O and PredictionIO. I would like to use the deep learning side of H2O.

Thanks. Patrick.

andrewwuan commented 9 years ago

@phiripatrick663 The intention of this template example is to use the H2O deep learning algorithm to predict the probability of churn, given information about a customer such as device type, phone call minutes, etc. If you look at the query and compare it to the training data, you'll see that the query resembles one line of the training data. In other words, it is one customer's information. The value you saw (0.1138...) means that there is roughly an 11.4% chance that this customer will churn.
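To make the reading of that output concrete, here is a minimal Python sketch of turning the response into a percentage. It assumes the response is a dictionary with a single key `p` whose value is a number or a numeric string, as in the output quoted above; the helper name `churn_probability` is hypothetical and not part of the repository.

```python
def churn_probability(response):
    """Return the churn probability from an engine response as a float.

    Assumes the response looks like {"p": "0.1138..."}; the value may be
    a number or a numeric string.
    """
    p = float(response["p"])
    if not 0.0 <= p <= 1.0:
        raise ValueError("expected a probability in [0, 1], got %r" % p)
    return p


# The value reported earlier in this thread.
response = {"p": "0.11381198094113941"}
p = churn_probability(response)
print("Churn probability: {:.1%}".format(p))  # prints "Churn probability: 11.4%"
```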

phiripatrick663 commented 9 years ago

Thanks Andrew. So the higher the rate, the more confidence we have in the prediction? If you wanted to tune it to see which factors weigh more, what would you change?

On Sun, Oct 25, 2015, 2:05 PM An Wu notifications@github.com wrote:

Closed #4 https://github.com/andrewwuan/PredictionIO-Churn-Prediction-H2O-Sparkling-Water/issues/4 .


andrewwuan commented 9 years ago

@phiripatrick663 The higher the rate, the more likely the customer is to churn. I'm not sure how to get the confidence of the prediction, but in general, the larger the dataset, the higher the confidence. To tune the algorithm, you can add parameters to DeepLearningParameters. You can refer to this javadoc for how to configure DeepLearningParameters.
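Since a higher value means a higher likelihood of churn, one way to use the output is to rank customers by it. The sketch below is only an illustration: the customer IDs, the extra probabilities, and the 0.5 cut-off are hypothetical (only 0.1138... comes from this thread), and `rank_by_churn` is not a function in the repository.

```python
def rank_by_churn(predictions, threshold=0.5):
    """Sort (customer_id, p) pairs by p, highest first.

    Each result is (customer_id, p, flagged), where flagged is True when
    p is at or above the (hypothetical) threshold.
    """
    ranked = sorted(predictions, key=lambda pair: pair[1], reverse=True)
    return [(cid, p, p >= threshold) for cid, p in ranked]


# Hypothetical customers; only the 0.1138... value appears in this thread.
predictions = [
    ("customer-a", 0.11381198094113941),
    ("customer-b", 0.72),
    ("customer-c", 0.40),
]

for cid, p, flagged in rank_by_churn(predictions):
    label = "likely to churn" if flagged else "likely to stay"
    print("{}: {:.1%} ({})".format(cid, p, label))
```

The threshold is a business decision rather than something the model provides, so it is left as a parameter.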

phiripatrick663 commented 9 years ago

Thanks for the response. I will do some reading and get in touch. I appreciate your help.

On Sun, Oct 25, 2015, 2:55 PM An Wu notifications@github.com wrote:

@phiripatrick663 https://github.com/phiripatrick663 The higher the rate, the more likely the customer is to churn. I'm not sure how to get the confidence of the prediction, but in general, the larger the dataset, the higher the confidence. To tune the algorithm, you can add parameters to DeepLearningParameters https://github.com/andrewwuan/PredictionIO-Churn-Prediction-H2O-Sparkling-Water/blob/master/src/main/scala/Algorithm.scala#L28. You can refer to this javadoc http://s3.amazonaws.com/h2o-release/h2o/master/3183/docs-website/h2o-algos/javadoc/hex/deeplearning/DeepLearningParameters.html for how to configure DeepLearningParameters.
