Open A143865 opened 2 years ago
Sounds reasonable on the whole, and I like the change for attempts and broken tackles. We might still have an issue with the training vs. test data. If we use the same data for training and testing our model, it will likely appear more confident or accurate than it truly is. Perhaps a better split is to train on all games played except the current season and test on just the current season.
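To make the proposed split concrete, here is a minimal sketch in plain Python. It assumes each game is stored as a dict with a `season` field. The field names, the `split_by_season` helper, and the sample rows are all hypothetical placeholders, not our actual schema or data.

```python
def split_by_season(rows, current_season):
    """Train on every game outside the current season; test on the current season only."""
    train = [r for r in rows if r["season"] != current_season]
    test = [r for r in rows if r["season"] == current_season]
    return train, test


# Hypothetical placeholder rows (one dict per player-game), not real stats.
games = [
    {"season": 2019, "player": "A", "attempts": 20, "broken_tackles": 3},
    {"season": 2020, "player": "A", "attempts": 17, "broken_tackles": 1},
    {"season": 2020, "player": "B", "attempts": 11, "broken_tackles": 2},
    {"season": 2021, "player": "A", "attempts": 22, "broken_tackles": 4},
    {"season": 2021, "player": "B", "attempts": 9, "broken_tackles": 0},
]

train, test = split_by_season(games, current_season=2021)
print(len(train), "training rows;", len(test), "test rows")
```

Because every row lands in exactly one of the two lists, no game is used for both fitting and scoring, which is the point of the split.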
Ok, that sounds good and makes sense. I'll change the test data to just this season. Will the games from this season be a big enough dataset to test or should we add more to the test data?
On another note, I was thinking of plotting the actual results and our predictions alongside another prediction model, such as ESPN's fantasy projections, as a way of showing the model's accuracy (or inaccuracy). Are there other charts, graphs, or statistics that I should include to show the data?
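One possible set of summary statistics to go with such a chart is mean absolute error (MAE) and root mean squared error (RMSE), computed for both our predictions and the comparison model against the actual results. The sketch below uses only the standard library. The three lists are made-up placeholder numbers rather than real results or real ESPN projections, and the variable names are assumptions.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average size of the miss, in the same units as the target."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_squared_error(actual, predicted):
    """Like MAE, but penalizes large misses more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


# Placeholder values for illustration only.
actual = [10.0, 20.0, 15.0]
our_predictions = [12.0, 18.0, 15.0]
other_model = [13.0, 16.0, 17.0]

for name, preds in [("ours", our_predictions), ("other model", other_model)]:
    mae = mean_absolute_error(actual, preds)
    rmse = root_mean_squared_error(actual, preds)
    print(f"{name}: MAE={mae:.2f} RMSE={rmse:.2f}")
```

Reporting both models against the same held-out games keeps the comparison fair; a scatter plot of predicted vs. actual values would be a natural companion to these numbers.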
I like the idea of looking next at our metrics, i.e. how we will determine whether our model is good or needs improving. Let's use issue #2 to start that discussion.
Per email correspondence:
Some data questions