Best way to approach predicting Nick Chubb's rushing yards

reidsheppard / Python


Best way to approach predicting Nick Chubb's rushing yards #1

Open A143865 opened 2 years ago

A143865 commented 2 years ago

Per email correspondence:

Using scikit-learn:
- linear regression
- separate training data and test data
- 6 variables as input: def, def id, def madden rating, chubb madden rating, attempts, broken tackles
- 64 games
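
For reference, a minimal sketch of the setup listed above, assuming the games live in a pandas DataFrame. The column names, the `rushing_yards` target name, and the `fit_rushing_model` helper are hypothetical, and the data is random placeholder data, not real game logs.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical column names for the six inputs listed above.
FEATURES = [
    "def",
    "def_id",
    "def_madden_rating",
    "chubb_madden_rating",
    "attempts",
    "broken_tackles",
]


def fit_rushing_model(games: pd.DataFrame, target: str = "rushing_yards") -> LinearRegression:
    """Fit an ordinary least-squares model on the six input columns."""
    model = LinearRegression()
    model.fit(games[FEATURES], games[target])
    return model


# Random placeholder data standing in for the 64 games (not real stats).
rng = np.random.default_rng(0)
n_games = 64
games = pd.DataFrame(
    {
        "def": rng.uniform(80, 150, n_games),
        "def_id": rng.integers(1, 33, n_games),
        "def_madden_rating": rng.integers(70, 95, n_games),
        "chubb_madden_rating": rng.integers(85, 99, n_games),
        "attempts": rng.integers(8, 30, n_games),
        "broken_tackles": rng.integers(0, 8, n_games),
        "rushing_yards": rng.uniform(20, 180, n_games),
    }
)

model = fit_rushing_model(games)
predictions = model.predict(games[FEATURES])
```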

Some data questions

sportygavin commented 2 years ago

The Madden ratings are updated weekly for Nick Chubb, and the defenses' ratings are updated every season. So I set Nick Chubb's rating to what it was before each game and each defense's rating to its value for that season.
For the training data, I used Nick Chubb's career games excluding the current season. For the test data, I used all of Nick Chubb's career games, and I am trying to update it every week with this season's games.
Def measures defensive rushing yards allowed per game, i.e. how many rushing yards the opponent gives up per game. Def id just identifies which team Nick Chubb is playing against; teams are numbered 1-32 alphabetically.
Attempts and broken tackles are currently the actual values from the game being predicted. I'll change those to either the previous game's values or the current season's average.
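
A rough sketch of that last change, assuming one row per game with `season` and `week` columns (hypothetical names, as is the `add_pregame_features` helper). Both variants use only games played before the one being predicted.

```python
import pandas as pd


def add_pregame_features(games: pd.DataFrame) -> pd.DataFrame:
    """Add previous-game and season-to-date average columns.

    Assumes hypothetical columns: season, week, attempts, broken_tackles.
    """
    games = games.sort_values(["season", "week"]).reset_index(drop=True)
    for col in ["attempts", "broken_tackles"]:
        # Value from the game immediately before this one.
        games[f"{col}_prev_game"] = games[col].shift(1)
        # Average over earlier games in the same season (NaN for the season's first game).
        games[f"{col}_season_avg"] = games.groupby("season")[col].transform(
            lambda s: s.shift(1).expanding().mean()
        )
    return games


# Tiny made-up example, not real game data.
example = pd.DataFrame(
    {
        "season": [2022, 2022, 2022, 2023],
        "week": [1, 2, 3, 1],
        "attempts": [10, 20, 30, 40],
        "broken_tackles": [1, 2, 3, 4],
    }
)
with_features = add_pregame_features(example)
```

The first game of each season has no season-to-date average, so those rows would need to be dropped or filled some other way before fitting.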

A143865 commented 2 years ago

Sounds reasonable on the whole, and I like the change for attempts and broken tackles. We might still have an issue with the training vs. test data. Since the test data currently includes every game in the training data, we would be testing the model on games it has already seen, and it is likely to appear more accurate than it truly is. Perhaps the better split is to keep the training data as all games played except the current season and use just the current season as the test data.
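
Something like the following would do that split, assuming a `season` column on the games table (the column name and the `split_by_season` helper are hypothetical):

```python
import pandas as pd


def split_by_season(games: pd.DataFrame, current_season: int):
    """Train on every season before current_season, test on current_season only."""
    train = games[games["season"] < current_season]
    test = games[games["season"] == current_season]
    return train, test


# Made-up rows just to show the shape of the split.
games = pd.DataFrame(
    {
        "season": [2020, 2020, 2021, 2021, 2022, 2022],
        "rushing_yards": [50, 60, 70, 80, 90, 100],
    }
)
train, test = split_by_season(games, current_season=2022)
```

With this split no game appears in both sets, so the test error reflects games the model has not seen.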

sportygavin commented 2 years ago

Ok, that sounds good and makes sense. I'll change the test data to just this season. Will the games from this season be a big enough dataset to test or should we add more to the test data?

On another note, I was thinking of plotting the actual data and the predicted data alongside another prediction model, like ESPN's fantasy, as a way of showing the model's accuracy (or inaccuracy). Are there other charts, graphs, or statistics that I should include to show the data?
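
As one possible starting point for the statistics side, here is a sketch that summarizes error for our predictions and for an outside projection on the same games. It assumes both are available as plain lists of yards per game; `summarize_errors` is a hypothetical helper and the numbers are made up for illustration.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error


def summarize_errors(actual, predicted) -> dict:
    """Return mean absolute error and root mean squared error, in yards."""
    mae = float(mean_absolute_error(actual, predicted))
    rmse = float(np.sqrt(mean_squared_error(actual, predicted)))
    return {"mae": mae, "rmse": rmse}


# Made-up numbers, not real results.
actual = [100, 80, 120]
our_predictions = [90, 85, 110]
other_projections = [95, 70, 125]

our_errors = summarize_errors(actual, our_predictions)
other_errors = summarize_errors(actual, other_projections)
```

Reporting both side by side gives a single number per model to go with whatever chart is used.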

A143865 commented 1 year ago

I like the idea of looking at our metrics next, i.e. how we will determine whether our model is good or needs improving. Let's use issue #2 to start that discussion.