MSRConnections / Azure-training-course

Spark lab: Exercise 5: add explanation for ML task #169

Open vanitech opened 8 years ago

vanitech commented 8 years ago

What problem is the lab trying to solve? It's not clear why ML needs to be used.

jeffprosise commented 8 years ago

Not sure if you saw the text we added to the intro of Exercise 5, but I modified it further to make the case for WHY ML is an asset here:

"In the previous exercise, you interactively explored a set of food-inspection data and obtained key insights by looking at it in different ways. However, sometimes the sheer volume and complexity of the data make relationships difficult to identify. One solution is machine learning, a technique that algorithmically finds patterns in data and exploits those patterns to perform predictive analytics."

"Your Azure HDInsight Spark cluster includes several libraries with which you can build sophisticated machine-learning models. In this exercise, you will use some of these tools to build, train, and score a machine-learning model using the food-inspection data featured in the previous exercise. In that model, you will use a popular classification algorithm to predict which restaurants will be successful and which ones won't based on certain features of the input data, information that is difficult to discern simply by examining the data."

jeffprosise commented 8 years ago

Not sure what to add here. A paragraph explaining what ML is? Seems to me that we state the case for why ML is being used (and even describe in one sentence what it is). What am I missing?