
DataScience-sroes created by GitHub Classroom

Final Review #6

Open maklh899 opened 4 years ago

maklh899 commented 4 years ago

Summary

The project analyzed whether being a good player correlates with being on a good team. The author found that win shares per 40 minutes was a good predictor, but with such a small r^2 the model could not make accurate predictions. The proposed next step is to collect more data, such as total points for each player, to help NBA scouts determine whether college players have the potential to play in the NBA.

Data Preparation

The tables hold information on college basketball players from Division I schools, specifically the top 25 ranked schools. From the second deliverable, we can see that the tables are tidy, and the visualizations show the average points per player for each team. However, the Top 25 Team Ranked Scoring visual in the second deliverable could be better organized: it was not clear what the x-axis represented, or whether a higher or lower value was better.

Modeling

First model: attempted to predict team rank from minutes played, points, total rebounds, assists, steals, and field goal average. The assessment here was sound: there is barely any correlation for these predictors, and even for total rebounds the r^2 was too small to be useful.

Second model: attempted to predict team rank from total rebounds and win shares. Again the r^2 was low, so predicting team rank from these variables would not be reliable.

Third model: attempted to predict a college's rank from minutes played, points, total rebounds, assists, steals, and field goal average. The r^2 was low, so the correlations were weak.
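The kind of check run across all three models, fitting a linear model and inspecting r^2, can be sketched as follows. This is an illustrative Python version of the idea (the project itself was written in R), and all data here is synthetic, not the author's dataset:

```python
import numpy as np

# Synthetic stand-in data: one row per player, six columns standing in for
# minutes played, points, total rebounds, assists, steals, field-goal average.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
# The outcome (a stand-in for team rank) is mostly noise, weakly tied to
# one predictor -- mirroring a case where only one variable shows a
# faint relationship.
y = 0.3 * X[:, 2] + rng.normal(size=100)

# Ordinary least squares with an intercept column.
A = np.column_stack([np.ones(len(X)), X])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef

# r^2: the share of variance in y that the model explains.
ss_res = np.sum((y - pred) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r2 = 1 - ss_res / ss_tot
print(round(r2, 3))
```

With noise dominating the signal, the fitted r^2 comes out small, which is the situation the review describes: the model "works" mechanically, but its predictions carry little information about rank.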

I think the project described the purpose of each model and interpreted the summaries accurately, but it could be more detailed and explain why these results might occur.

Validation

Yes, the cross-validation was explained appropriately and clearly. The author discussed both RMSE and MAE but could add some explanation of why these specific results occurred.
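For readers less familiar with the two error metrics mentioned here, the difference is simple to show. This is a minimal Python sketch with made-up ranks (not the author's data or R code); RMSE squares the errors before averaging, so it penalizes large misses more heavily than MAE:

```python
import numpy as np

# Hypothetical actual vs. predicted team ranks from one validation fold.
actual = np.array([3.0, 10.0, 7.0, 1.0, 15.0])
predicted = np.array([5.0, 8.0, 7.0, 4.0, 12.0])

errors = predicted - actual
rmse = np.sqrt(np.mean(errors ** 2))  # root mean squared error
mae = np.mean(np.abs(errors))         # mean absolute error
print(rmse, mae)  # rmse ~ 2.28, mae = 2.0
```

Because of the squaring, RMSE is always at least as large as MAE on the same errors, and a widening gap between the two signals a few large misses rather than uniformly mediocre predictions. That is the sort of interpretation the review suggests adding.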

R Proficiency

The R code was fairly clear to understand; however, a short description in the comments would have helped for the less commonly used functions. Appropriate techniques were used as well.

Communication

I think there could be a little more description of what your graphs are trying to show, and of which variables you are using. Your models could also have been explained more clearly for readers who are less technical. The last two graphs in deliverable 1 were nice visuals, but it would have helped to explain why you chose to make them and how they help explain the data.

Critical Thinking

Does the operationalization and social impact demonstrate careful, critical thought about the future of the project? What are possible unintended consequences or variables that the author has not discussed?

I think you could explain more about why the ranks are not a good predictor, and also about the implications of scouting based on points alone. If we make decisions based purely on statistics, we start to ignore a player's ability to be a good teammate and to bring a good attitude.

sroes commented 4 years ago

Data Preparation and Modeling (18 out of 20%)

I believe I prepared the data well: after obtaining it, I tidied it and had it ready for modeling.

Validation and Operationalization (16 out of 20%)

I had good validation methods; however, because of my data, they were not able to yield predictable results.

R Proficiency (19 out of 20%)

My R proficiency was strong: the way I used R with my datasets, visualizations, and models proved very effective. I believe my R code styling was also good, with appropriate spaces, tabs, and indents in every section.

Communication (15 out of 20%)

My communication was good, but it was hard to explain why my models didn't produce useful results. I could have gone more in-depth with my models and explained them in more detail.

Critical Thinking (14 out of 20%)

My critical thinking was good for most of the portfolio; however, it could have been made clearer in the modeling and validation sections. Critical thinking got harder there because the models were producing results that were hard to explain.