sta199-s23-2 / project-r-s2dio

https://sta199-s23-2.github.io/project-r-s2dio/

Peer review #5

Open ethanyc04 opened 1 year ago

ethanyc04 commented 1 year ago

The project aims to determine the factors that best predict overall regular-season success in the 2013-2019 seasons of Division I college basketball.

The dataset was sourced from Kaggle and covers Division I college basketball seasons from 2013 to 2019. It is a compilation of various player data, such as the university each player plays for, their conference, and the number of games played, won, and lost.

One of the first approaches this team used was a scatterplot built with the geom_point() function, which plots the seeds of different college basketball teams against adjusted offensive efficiency. For modeling, they use linear regression. They created three models to predict seed based on 1) “adjusted offensive efficiency,” 2) “adjusted defensive efficiency,” and 3) “adjusted offensive and defensive efficiency.” The third is a multiple linear regression model with an interaction term. The group then compared the models' adjusted r-squared values and chose the model with the highest value as the best fit.
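
For reference, the adjusted r-squared comparison described above can be sketched as follows. This is a minimal illustration in Python rather than the team's R code, and every number in it (the r-squared values, the predictor counts, and the number of teams) is a hypothetical placeholder, not a result from the report. It uses the standard formula: adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the number of observations and p is the number of predictors.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Return adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical (R^2, number of predictors) pairs; NOT values from the report.
models = {
    "offense only": (0.40, 1),
    "defense only": (0.35, 1),
    # Two main effects plus one interaction term gives three predictors.
    "offense * defense": (0.55, 3),
}
n_teams = 100  # hypothetical number of observations

scores = {
    name: adjusted_r_squared(r2, n_teams, p) for name, (r2, p) in models.items()
}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: adjusted R^2 = {score:.3f}")
print("best model by adjusted R^2:", best)
```

Because the penalty grows with the number of predictors, the interaction model only comes out ahead if its extra terms raise r-squared by enough to offset it, which is why adjusted r-squared is a fairer basis for comparing these three models than plain r-squared.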

More context could be provided on the variables used in the analysis. The data and visualizations suggest that a smaller value of adjusted defensive efficiency is better, but this could be stated explicitly.

It would be helpful to have more explanation of why this team chose linear regression. Some exploratory data analysis showing whether offense and defense are correlated would also help. I also think it would be a good idea to describe how the results are meaningful, for example by saying which component of team performance is most important for achieving a better seed. A visualization of the linear regression would also be nice, so that the correlation in the data is clear.
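
As a rough idea of the exploratory check suggested above, the Pearson correlation between the two efficiency measures could be computed along these lines. This is only a sketch in Python, not the team's R code; the variable names and the five data points are made up for illustration and do not come from the dataset.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical efficiency values for five teams; NOT taken from the dataset.
adj_offense = [120.0, 115.0, 110.0, 105.0, 100.0]
adj_defense = [92.0, 95.0, 94.0, 99.0, 101.0]

r = pearson_r(adj_offense, adj_defense)
print(f"correlation between offense and defense: {r:.2f}")
```

A strong correlation between the two predictors would be worth mentioning in the report, since it affects how the coefficients in the combined model should be interpreted.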

It would be helpful for the group to cite the calculated results in the conclusion to support the claim they are making. To help readers make sense of the analysis, it would also be nice for them to mention how their results compare with their hypothesis.

For this project, I would be interested to see the different models highlighted in their presentation because it provides an interesting way to think about seeding within college basketball.

The report rendered without any issues. The project proposal did not render because the object 'Squirrels' was not found, likely due to a deleted dataset or something similar.

Overall, the code was well organized. However, calling tidy() on the seed models might make the output more readable. Additionally, a title on the first graph would make it clearer what the graph is showing.

The report is also somewhat hard to follow, since there seem to be multiple research questions and hypotheses, but only one of them is addressed in the results. If the other research question isn’t being addressed, it should probably be removed to make the report cleaner.

This team did a great job highlighting the importance and significance of this research project and how it relates to them on a personal level, which makes the report engaging to read. Adding a personal aspect could be helpful in our project as well.

The team should add a literature review to their final project report in order to provide more information about past research which has been conducted about college basketball seeds and how those are affected by offensive and defensive efficiency.

It might be cool to explore how other variables, like shooting percentages, shooting tendencies, or wins, compare in predicting seed lines, in addition to offensive and defensive efficiency. This could help show which factors drive the best college basketball teams.

lsanchez1402 commented 1 year ago

fixed load csv error