Comment(s): As you know, your code runs with errors (specifically in the last cell). A quick Google search of your error led me to this link: https://stackoverflow.com/questions/45890328/sklearn-metrics-for-multiclass-classification. Changing your code to print(precision_score(y4_test, sv_predict, average=None)) gets rid of the error, but still gives you a warning. In general, it can be very useful to copy and paste the exact error message into Google when you're trying to figure out how to fix it.
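To illustrate the fix, here is a minimal sketch using made-up labels in place of y4_test and sv_predict (the three-class toy data is an assumption, not your data). With more than two classes, precision_score needs an explicit average argument. I am also assuming the remaining warning is the one scikit-learn raises when a class is never predicted; if so, zero_division=0 should make that case explicit and silence it.

```python
from sklearn.metrics import precision_score

# Toy stand-ins for y4_test and sv_predict (hypothetical values, three classes).
y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 1, 1, 0, 1, 1]  # class 2 is never predicted

# average=None returns one precision value per class.
per_class = precision_score(y_true, y_pred, average=None, zero_division=0)

# average="macro" collapses those into a single unweighted mean.
macro = precision_score(y_true, y_pred, average="macro", zero_division=0)

print(per_class)
print(macro)
```

The per-class view is usually the more useful one here, since it shows which classes the model is failing on.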
Criteria 2: Exploration of Data
Score Level: 2 (Approaches expectations)
Comment(s): You explored the data briefly, but your data exploration should ideally inform your research questions to a greater extent. You did do some investigation into one of your research questions before doing further analysis (age predicted from income). It's good that you made a plot to explore that, but from your plot, it looks like there is not really a relationship between income and age. It would be better if you made more plots looking at the relationships between features you're interested in, and followed up on those questions if it seems like there's an interesting relationship there.
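One quick way to screen feature pairs before committing to a research question is to compute their correlation. The sketch below uses synthetic income and age values (hypothetical, generated independently, not taken from your dataset), so it only shows the mechanics.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for two features (hypothetical, not your data).
income = rng.normal(50_000, 15_000, size=200)
age = rng.normal(35, 10, size=200)  # generated independently of income

# Pearson correlation: values near 0 suggest no linear relationship,
# values near +1 or -1 suggest a strong one worth following up on.
r = np.corrcoef(income, age)[0, 1]
print(f"correlation between income and age: {r:.2f}")
```

A correlation near zero does not rule out a non-linear relationship, so it is still worth looking at a plot of any pair you care about.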
Criteria 3: Machine Learning Techniques used correctly
Score Level: 3 (Meets expectations)
Comment(s): In general your algorithms are used correctly (regression for predicting continuous variables and classification for predicting categorical variables), but the analysis of your results needs some work. It's important to analyze classification results beyond just looking at accuracy, because accuracy alone can hide poor performance on less common classes; measures like precision, recall, and F1 score are much more informative when it comes to model performance. Also, you should look at R^2 for your regression models.
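Here is a small sketch of why accuracy can mislead, along with an R^2 calculation. All labels and values are made up for illustration; the classifier shown is a hypothetical one that always predicts the majority class.

```python
from sklearn.metrics import accuracy_score, f1_score, r2_score

# Hypothetical imbalanced classification result: the model always predicts class 0.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0] * 10

acc = accuracy_score(y_true, y_pred)
# Macro F1 averages the per-class F1 scores, so the missed class drags it down.
f1 = f1_score(y_true, y_pred, average="macro", zero_division=0)

# Hypothetical regression result: R^2 is the share of variance explained
# (1.0 is a perfect fit, 0.0 is no better than predicting the mean).
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]
r2 = r2_score(actual, predicted)

print(acc, f1, r2)
```

In this toy case accuracy looks respectable while macro F1 is much lower, because the model never identifies class 1 at all.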
Criteria 4: Report: Are conclusions clear and supported by data?
Score Level: 2 (Approaches expectations)
Comment(s): You did state your research questions, but it was not clear how your results supported or did not support those questions. For example, you say "We predicted income from multiple features using Multiple Linear Regression" but you do not explain how well your model was able to predict income. In reality, it did not do a very good job, which is okay, but you do have to discuss it. Also, the plots you made of your results were not informative. For example, for the Multiple Linear Regression results, it would have been better to make scatter plots instead of line plots. Line plots are generally used to look at how a variable changes over time, while scatter plots are used to look at the relationship between variables.
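As one possible way to present regression results, the sketch below draws a predicted-versus-actual scatter plot. The income values and predictions are randomly generated placeholders (assumptions, not your results), and the output filename is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the test-set incomes and model predictions.
actual = rng.uniform(20_000, 100_000, size=100)
predicted = actual + rng.normal(0, 15_000, size=100)

fig, ax = plt.subplots()
ax.scatter(actual, predicted, alpha=0.5)

# Points on the dashed diagonal would be perfect predictions.
lims = [actual.min(), actual.max()]
ax.plot(lims, lims, linestyle="--", color="gray")

ax.set_xlabel("Actual income")
ax.set_ylabel("Predicted income")
ax.set_title("Predicted vs. actual income")
fig.savefig("predicted_vs_actual.png")
```

The further the points spread from the diagonal, the worse the predictions, which gives you something concrete to discuss in the report.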
Criteria 5: Code formatting
Score Level: 4 (Exceeds expectations)
Comment(s): Good job, your code was formatted nicely in a Jupyter notebook.
Rubric Score
Criteria 1: Valid Python Code
Criteria 2: Exploration of Data
Criteria 3: Machine Learning Techniques used correctly
Criteria 4: Report: Are conclusions clear and supported by data?
Criteria 5: Code formatting
Overall Score: 12/20