Open mrcaseb opened 10 months ago
Yep this makes more sense
Thought about this for a while and there are several design decisions to be made.
calculate_stats: this function should output everything that the three currently implemented player stats functions return.
rlang::data_syms()
My current thinking is something like this:
calculate_stats <- function(seasons = nflreadr::most_recent_season(),
                            summary_level = c("season", "week"),
                            stat_type = c("player", "team")) {
  # resolve the vector defaults to a single choice
  summary_level <- match.arg(summary_level)
  stat_type <- match.arg(stat_type)

  pbp <- nflreadr::load_pbp(seasons = seasons)
  # playstats: to be loaded by a load_pbp counterpart for playstats from
  # https://github.com/nflverse/nflverse-pbp/releases/download/playstats/play_stats_{season}.rds
  #
  # set grouping variables based on summary_level and stat_type
  #
  # summarise EPA stats and dakota using pbp
  #
  # summarise all other stats using playstats. That's a big call to summarise
  # where we create all sorts of stats with the various stat IDs
  #
  # load player data if stat_type is "player" to join player info
  #
  # join everything
}
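To make the "set grouping variables" step concrete, here is a minimal base R sketch. The helper name stats_grouping_vars and the column names (season, week, player_id, team) are assumptions for illustration only, not existing nflfastR or nflreadr names.

```r
# Hypothetical helper: pick grouping columns from summary_level and stat_type.
# All column names below are assumed, not taken from the real data.
stats_grouping_vars <- function(summary_level = c("season", "week"),
                                stat_type = c("player", "team")) {
  summary_level <- match.arg(summary_level)
  stat_type <- match.arg(stat_type)

  # time dimension: weekly summaries still need the season to be unique
  time_vars <- switch(summary_level,
    season = "season",
    week = c("season", "week")
  )
  # entity dimension: one row per player or per team
  id_var <- switch(stat_type,
    player = "player_id",
    team = "team"
  )
  c(time_vars, id_var)
}

grp <- stats_grouping_vars("week", "player")
```

A character vector like this could presumably be turned into grouping expressions with rlang::data_syms() and spliced into dplyr::group_by(), but that wiring is left out here.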
This all sounds logical to me!
We are seeing new issues in #444, and we have already had lots of problems caused by the fact that we summarize play stats into a tidy form.
I think for player stats, we should make a transition to a new concept.
We load raw game data, extract the play stats and row bind them. This will make it pretty easy and straightforward to correctly summarize player stats and team stats.
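A rough base R sketch of that idea, using toy data frames in place of the play stats extracted from raw game files. The column names, the values, and the use of stat ID 10 as a rushing-yards stat are assumptions for illustration.

```r
# Toy stand-ins for play stats extracted from two raw game files.
# Columns and values are made up for illustration.
game_1 <- data.frame(
  game_id = "g1",
  team = c("KC", "KC", "BUF"),
  player_id = c("p1", "p1", "p2"),
  stat_id = c(10L, 10L, 10L),
  yards = c(5, 12, 3)
)
game_2 <- data.frame(
  game_id = "g2",
  team = c("KC", "BUF"),
  player_id = c("p1", "p2"),
  stat_id = c(10L, 10L),
  yards = c(7, 20)
)

# Row bind the per-game play stats into one long table.
playstats <- do.call(rbind, list(game_1, game_2))

# Filter to one stat ID, then summarise the same rows two ways;
# only the grouping column differs between player and team stats.
rushing <- playstats[playstats$stat_id == 10L, ]
player_stats <- aggregate(yards ~ player_id, data = rushing, FUN = sum)
team_stats <- aggregate(yards ~ team, data = rushing, FUN = sum)
```

Because both summaries start from the same long table, player and team totals stay consistent by construction.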