loribard / date-a-scientist

MLF Capstone Feedback #1

Open mackenzieyoung opened 5 years ago

mackenzieyoung commented 5 years ago

Rubric Score

Criteria 1: Valid Python Code

Criteria 2: Exploration of Data

Criteria 3: Machine Learning Techniques used correctly

Criteria 4: Report: Are conclusions clear and supported by data?

Criteria 5: Code formatting

Overall Score: 18/20

loribard commented 5 years ago

THANKS! Great feedback. I want to spend some more time "playing" with the data and graphing. I learned the material much better the second and third time through it while working on this project. I feel like I've just touched on Machine Learning, but at least I've got a good general overview. I really appreciate your comments on using Machine Learning Techniques clearly; it makes a lot more sense now. Lori

On Tue, Mar 5, 2019 at 1:41 PM Mackenzie Young notifications@github.com wrote:

Rubric Score

Criteria 1: Valid Python Code

  • Score Level: 4 (Exceeds expectations)
  • Comment(s): Great job, your code runs without any errors.

Criteria 2: Exploration of Data

  • Score Level: 3 (Meets expectations)
  • Comment(s): Your data exploration is off to a good start. Good job looking at relationships between variables, like selfish words vs. income. In general, a couple of your graphs were difficult to interpret. Your second graph (distribution of drinks_code feature) should be a bar graph, since you're looking at the distribution of the frequency of a discrete variable. Line graphs are better for showing trends over time. Also, although scatter plots are useful for showing relationships between two variables, it's impossible to infer the density of the data in your selfish words vs. income plot. Adding some jitter to the points can help with this, and will make it look more like a cloud of points than a grid!
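
A minimal sketch of the two plotting suggestions above, assuming the features are available as plain Python lists. The function names, the argument `income_code`, the axis labels, and the jitter width of 0.3 are illustrative choices, not taken from the project notebook. The counting and jitter helpers use only the standard library; `plot_exploration` assumes matplotlib is installed.

```python
import random
from collections import Counter


def category_counts(values):
    """Count how often each discrete code occurs, sorted by code."""
    return sorted(Counter(values).items())


def add_jitter(values, width=0.3, seed=0):
    """Return a copy of values with uniform noise in [-width, width] added."""
    rng = random.Random(seed)
    return [v + rng.uniform(-width, width) for v in values]


def plot_exploration(drinks_code, selfish_words, income_code):
    """Draw a bar chart of a discrete feature and a jittered scatter plot."""
    # Imported here so the helpers above work even without matplotlib.
    import matplotlib.pyplot as plt

    fig, (ax_bar, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))

    # Bar chart: one bar per discrete code, height = number of rows.
    codes, freqs = zip(*category_counts(drinks_code))
    ax_bar.bar(codes, freqs)
    ax_bar.set_xlabel("drinks_code")
    ax_bar.set_ylabel("count")

    # Jittered scatter: noise separates points that share the same
    # (x, y) values; low alpha makes dense regions appear darker.
    ax_scatter.scatter(
        add_jitter(selfish_words, seed=0),
        add_jitter(income_code, seed=1),
        alpha=0.3,
        s=10,
    )
    ax_scatter.set_xlabel("selfish words (jittered)")
    ax_scatter.set_ylabel("income code (jittered)")
    return fig
```

The jitter is applied only for display; the underlying values used for modeling stay unchanged.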

Criteria 3: Machine Learning Techniques used correctly

  • Score Level: 3 (Meets expectations)
  • Comment(s): Good job using machine learning techniques correctly. When you analyze your classification results, it's best to look at performance measures beyond accuracy, like precision or F1 score. Also, since regression models do best predicting continuous outcomes, it would be better to leave the income values as their raw values without converting them to values 1-5. When analyzing your regression model results, make sure you're specifically looking at the R^2 score (the proportion of the variance explained by the model), and state explicitly in the report that this is the metric you're quoting.
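
As a rough illustration of the metrics named above, here is a plain-Python sketch of precision, recall, F1, and R^2. The function names and the toy labels in the usage note are hypothetical and are not results from the project. In practice, scikit-learn's `sklearn.metrics` module offers `precision_score`, `f1_score`, and `r2_score` for the same quantities.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or seen.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot, the proportion of variance explained.

    Assumes y_true is not constant (otherwise SS_tot is zero).
    """
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

With imbalanced toy labels such as `y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]` and `y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]`, accuracy is 0.7 while precision is only 0.5 and recall is about 0.33, which is why accuracy alone can be misleading.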

Criteria 4: Report: Are conclusions clear and supported by data?

  • Score Level: 4 (Exceeds expectations)
  • Comment(s): Great job, your research question is stated clearly, and your report is well-made. The results of your regression algorithms and classification algorithms are shown, and your conclusions are clearly stated and based on evidence. Good job thinking about ways to follow up with further analyses.

Criteria 5: Code formatting

  • Score Level: 4 (Exceeds expectations)
  • Comment(s): Nice job, your code is well formatted. Good job using Jupyter notebooks to format your code into smaller sections.

Overall Score: 18/20
