Comment(s): Great job, your code runs without any errors.
Criteria 2: Exploration of Data
Score Level: 4 (Exceeds expectations)
Comment(s): Nice job exploring your data and looking at the distributions of different variables. You did an especially good job thinking about why certain distributions look the way they do (e.g., income by age).
Criteria 3: Machine Learning Techniques used correctly
Score Level: 2 (Approaches expectations)
Comment(s): You had a lot of research questions – definitely more than you needed! Here are a few comments:
– When predicting sex, it's better to use logistic regression rather than linear regression, since the outcome you are predicting is binary.
– Wherever you wrote "k-means", you meant k-nearest neighbors. K-means is an unsupervised clustering algorithm, whereas k-nearest neighbors is a supervised method that needs labeled data.
– Income should be predicted using a regression algorithm rather than a classification algorithm (e.g., a k-nearest-neighbors classifier), since it is a continuous outcome. Your income classification models likely scored well only because most values are -1, so a model can reach high accuracy just by predicting -1 every time.
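To illustrate the last point, here is a minimal sketch of a majority-class baseline check. The income values are made up for illustration (they are not your data), and the function name majority_baseline_accuracy is just a hypothetical helper. It shows the accuracy you would get by always guessing the most common label, which is a useful number to compare any classifier against.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Return the accuracy of always predicting the most common label."""
    if not labels:
        raise ValueError("labels must be non-empty")
    # most_common(1) gives a list with one (label, count) pair
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)


# Hypothetical income labels (assumption): 80% of the entries are -1.
incomes = [-1] * 80 + [20000] * 8 + [50000] * 7 + [100000] * 5

baseline = majority_baseline_accuracy(incomes)
print(f"Majority-class baseline accuracy: {baseline:.2f}")
```

If a classifier's accuracy is close to this baseline, it has probably learned little beyond "predict -1". Filtering out the -1 rows and fitting a regression model on the remaining incomes would likely give a more meaningful result.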
Criteria 4: Report: Are conclusions clear and supported by data?
Score Level: 4 (Exceeds expectations)
Comment(s): Great job drawing conclusions based on your results, as well as discussing possible factors that contributed to your results.
Criteria 5: Code formatting
Score Level: 4 (Exceeds expectations)
Comment(s): Good, your code is nicely formatted. Great job using comments to break your code up into sections.
Rubric Score
Criteria 1: Valid Python Code
Criteria 2: Exploration of Data
Criteria 3: Machine Learning Techniques used correctly
Criteria 4: Report: Are conclusions clear and supported by data?
Criteria 5: Code formatting
Overall Score: 18/20
Good work!