Comment(s): Great job, your code runs without any errors.
Criteria 2: Exploration of Data
Score Level: 4 (Exceeds expectations)
Comment(s): Very nice job exploring your data, especially looking at correlations between variables. Good job removing null values, noticing that most income values were -1 (unreported), and creating a new column to code that information. Your map of the location data was also nicely done; it was interesting to see where the data came from!
Criteria 3: Machine Learning Techniques used correctly
Score Level: 2 (Approaches expectations)
Comment(s): In general, you used machine learning techniques correctly, but linear regression was not well suited to your research question (as you noted). Instead of running a linear regression model anyway, it would have been better to come up with a research question that linear regression can answer, or at least to include a more in-depth discussion of why linear regression is not the right algorithm for this question. Also, when you interpreted your logistic regression results, did you use a specific cutoff value (e.g., 0.5) to decide whether a predicted data point should be classified as 0 or 1?
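To make the cutoff question concrete, here is a minimal sketch of how a cutoff turns predicted probabilities into 0/1 labels. The `classify` helper and the probability values are made up for illustration and are not taken from your notebook; with scikit-learn you would typically get such probabilities from `predict_proba`.

```python
def classify(probabilities, cutoff=0.5):
    """Turn predicted probabilities of class 1 into 0/1 labels.

    A point is labeled 1 when its probability is at or above the cutoff.
    """
    return [1 if p >= cutoff else 0 for p in probabilities]


# Hypothetical predicted probabilities (illustrative values only).
probs = [0.10, 0.45, 0.50, 0.80]

labels_default = classify(probs)             # uses the 0.5 cutoff
labels_strict = classify(probs, cutoff=0.7)  # a stricter cutoff

print(labels_default)
print(labels_strict)
```

Raising the cutoff moves borderline points from class 1 to class 0, so the cutoff you chose is worth stating explicitly in the report.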
Criteria 4: Report: Are conclusions clear and supported by data?
Score Level: 2 (Approaches expectations)
Comment(s): Good job stating your research question, explaining why you were interested in it, and explaining its shortcomings. In general, you could have gone into greater detail in the discussion of your results. For example, how should the person reading your presentation interpret the ROC curve, and why is the logistic regression model the best? Also, you presented only an ROC curve for your KNN model; you should include additional performance measures such as F1 score. It would also be better if you had done more model comparison, or at least explained why the two models cannot be compared.
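As a rough sketch of the kind of side-by-side comparison meant here, the snippet below computes F1 score and ROC AUC for two hypothetical models on the same labels. The labels, the scores, and the names "model A" and "model B" are invented for illustration and are not your results; the helper functions are plain-Python stand-ins for scikit-learn's `f1_score` and `roc_auc_score`.

```python
def f1_binary(y_true, y_pred):
    """F1 score for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which
    the positive example gets the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical true labels and predicted scores (illustrative only).
y_true = [0, 0, 1, 1, 0, 1]
scores_a = [0.2, 0.4, 0.7, 0.9, 0.6, 0.3]  # "model A"
scores_b = [0.1, 0.3, 0.8, 0.6, 0.2, 0.7]  # "model B"

for name, scores in [("model A", scores_a), ("model B", scores_b)]:
    preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cutoff
    print(name, "F1:", round(f1_binary(y_true, preds), 3),
          "AUC:", round(roc_auc(y_true, scores), 3))
```

Reporting the same measures for every model on the same test data is what makes the claim that one model is "the best" checkable by the reader.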
Criteria 5: Code formatting
Score Level: 4 (Exceeds expectations)
Comment(s): Your code is nicely formatted. Good job using Jupyter notebooks to organize it and including blocks of text to explain what your code is doing.
Overall Score: 16/20
Good job overall, especially with your exploration of data and thinking about different ways to look at the data. In the future, your explanation and discussion of your models should be expanded, including a short discussion of the algorithms and how to interpret the results.