Model shooting as a generative process

huffyhenry / shot-generation

Bayesian modelling of shot generation and conversion in soccer

GNU Lesser General Public License v3.0


Shooting is best modelled as a two-step process: shot quality xi is first sampled from an appropriate distribution, and conversion is then a Bernoulli trial with success probability xi. The (unnormalized) likelihood of a datapoint is then the definite integral over (0, 1) of xi * pdf(xi | theta) dxi for a goal, and of (1 - xi) * pdf(xi | theta) dxi for a miss. Here pdf is the probability density function of the shot-quality distribution and theta is the parameter vector of that distribution. theta can depend on any factors of interest, especially team identities and game state.
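
As a rough illustration of the two integrals above, here is a minimal sketch that assumes shot quality follows a Beta(shape1, shape2) distribution. That choice, the function name `shot_likelihood`, and the parameter values 2 and 18 are assumptions made for the example, not something this issue fixes. Under that assumption the goal integral is simply the mean of the distribution, shape1 / (shape1 + shape2), which gives a quick sanity check.

```r
# Sketch only: assumes xi ~ Beta(shape1, shape2); the distribution is a
# hypothetical choice for illustration.
shot_likelihood <- function(goal, shape1, shape2) {
  # Weight is xi for a goal and (1 - xi) for a miss, as in the text above.
  integrand <- function(xi) {
    weight <- if (goal) xi else 1 - xi
    weight * dbeta(xi, shape1, shape2)
  }
  integrate(integrand, lower = 0, upper = 1)$value
}

p_goal <- shot_likelihood(TRUE, shape1 = 2, shape2 = 18)
p_miss <- shot_likelihood(FALSE, shape1 = 2, shape2 = 18)

# The same two-step process, simulated: draw shot quality, then convert.
set.seed(1)
xi <- rbeta(1e5, 2, 18)
goals <- rbinom(length(xi), size = 1, prob = xi)
sim_rate <- mean(goals)  # should land near p_goal
```

In a full model, shape1 and shape2 would themselves be functions of team identity and game state, and the per-shot terms would be combined into the joint likelihood.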

Cf. #6.

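    library(dplyr)
    library(MASS)

    # Log-likelihood of an exponential fit to a vector of shot xG values.
    # as.numeric() drops the logLik class so summarize() receives a plain number.
    expoL <- function(vec){
      return(as.numeric(logLik(fitdistr(vec, "exponential"))))
    }

    # Log-likelihood of a beta fit; fitdistr needs starting values for "beta".
    betaL <- function(vec){
      return(as.numeric(logLik(fitdistr(vec, "beta", start=list(shape1=0.5, shape2=0.5)))))
    }

    # Fit both distributions per team and pick the one with the higher log-likelihood.
    read.csv("../sbs-xg-review/data/sb.csv") %>%
      filter(competition_name == "Premier League") %>%
      filter(shot_set_play != "penalty") %>%
      group_by(team_name) %>%
      summarize(
        expo=expoL(shot_xg),
        beta=betaL(shot_xg)
      ) %>%
      mutate(choice=ifelse(beta > expo, "beta", "expo"))

yields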

Model shooting as a generative process #27