introdsci / DataScience-th3n1ghtmanc0m3th

DataScience-th3n1ghtmanc0m3th created by GitHub Classroom

Final Review #1

Open Dragongoat opened 4 years ago

Dragongoat commented 4 years ago
# Summary 

This project investigated a possible correlation between education rates and violence in the US. The final insights were mostly inconclusive due to a lack of strong data, so the next step would be to gather more relevant data to supplement what has already been obtained.

# Data Preparation 

This project split the data into tables corresponding to each state in the US. These tables hold the number and percentage of people at different education levels, as well as rates of different types of violent crime. The portfolio presents this data cleanly and follows tidy data standards appropriately.

# Modeling 

The predictive model built in this project uses the percentage of diplomas in a population to predict the percentage of violence in that same population. The intent of the portfolio was to look for a correlation between these two variables, so this is an appropriate model for the situation. An analysis of the model's summary correctly concludes that diploma percentage is not a statistically significant predictor.
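For readers who want to see what this check looks like, here is a minimal sketch of a one-predictor linear model and its significance test. It is not the project's actual code: the data frame `states`, the column names `diploma_pct` and `violence_pct`, the randomly generated values, and the 0.05 threshold are all assumptions made for illustration.

```r
# Hypothetical data: one row per state. The values are randomly generated
# for this sketch and are NOT the project's data.
set.seed(42)
states <- data.frame(
  diploma_pct  = runif(50, min = 80, max = 95),
  violence_pct = runif(50, min = 0.1, max = 0.8)
)

# Simple linear regression: violence percentage as a function of diploma percentage.
fit <- lm(violence_pct ~ diploma_pct, data = states)

# The coefficient table from summary() holds the slope estimate and its p-value.
coefs     <- summary(fit)$coefficients
slope_p   <- coefs["diploma_pct", "Pr(>|t|)"]
r_squared <- summary(fit)$r.squared

# At the assumed 0.05 level, a larger p-value means the predictor is
# not statistically significant.
is_significant <- slope_p < 0.05
```

Reading the p-value and R-squared straight from `summary(fit)` is the same information a reader would take from the printed model summary.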

# Validation 

The model has been cross-validated, and the accuracy of the cross-validation was interpreted correctly.
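As a rough illustration of the validation step, k-fold cross-validation of a one-predictor model could be sketched in base R as follows. The data frame `states`, its columns, the generated values, the choice of `k = 5`, and the use of RMSE as the accuracy measure are all assumptions for this sketch; the project may have used a different package or metric.

```r
# Hypothetical data: randomly generated for this sketch, NOT the project's data.
set.seed(42)
states <- data.frame(
  diploma_pct  = runif(50, min = 80, max = 95),
  violence_pct = runif(50, min = 0.1, max = 0.8)
)

k <- 5  # assumed number of folds

# Randomly assign each row to one of k folds of near-equal size.
fold_id <- sample(rep(seq_len(k), length.out = nrow(states)))

# For each fold: fit on the other folds, predict the held-out fold,
# and record the root mean squared error of those predictions.
fold_rmse <- sapply(seq_len(k), function(i) {
  train <- states[fold_id != i, ]
  test  <- states[fold_id == i, ]
  fit   <- lm(violence_pct ~ diploma_pct, data = train)
  pred  <- predict(fit, newdata = test)
  sqrt(mean((test$violence_pct - pred)^2))
})

# Average held-out error across the folds.
cv_rmse <- mean(fold_rmse)
```

Comparing `cv_rmse` with the spread of `violence_pct` gives a sense of whether the model predicts any better than simply guessing the mean.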

# R Proficiency 

While verbose at times, the R code effectively accomplishes the task at hand using functional programming techniques. The code is replicable and can be reused to come to similar conclusions.

# Communication 

The wording conveys the point well to someone not specialized in the subject being examined. The visualizations are excellent methods of conveying the trends of the first dataset. One possible improvement would be to add visualizations for the second dataset to observe trends in the violence data.

# Critical Thinking 

The operationalization step carefully summarizes the results and analyzes what could be done to improve the process in subsequent iterations. Possible consequences are addressed properly and shortcomings of the data are acknowledged.

th3n1ghtmanc0m3th commented 4 years ago

Data Preparation and Modeling (20% out of 20%)

I spent a good deal of time preparing and refining the data; it definitely is about 80% of the work in a data science project. I believe I followed the principles of tidy data well.

Validation and Operationalization (17% out of 20%)

Although my model did not end up being very accurate at prediction, I do believe I built a good model and accurately interpreted the results. However, I would probably take a few points off for not having a very good visualization of that model.

R Proficiency (17% out of 20%)

With the exception of my code for the last visualization graph, I believe I showed a fairly strong proficiency in R and its concepts.

Communication (20% out of 20%)

While slightly verbose in some areas, my goal was to be as detailed as possible to explain what was happening to a person who had no experience with R. Not only do I believe I explained that well, but I believe I thoroughly explained what my project was doing and what was accomplished.

Critical Thinking (20% out of 20%)

I chose this topic specifically because I have thought so much about it. I feel that even though my model was inconclusive, I still put great thought into how it could be improved with additional datasets and what effects those might have.