nflverse / nflverse-pbp

builds play by play and player stats for nflverse/nflverse-data
Creative Commons Attribution 4.0 International

something odd with WPA #36

Open ak47twq opened 4 years ago

ak47twq commented 4 years ago

I compared the difference between consecutive plays' home_wp_post values with WPA in the database. Is WPA supposed to be the difference between two plays' home_wp_post? Most numbers check out, but some don't make sense.

Why does a timeout have a different home_wp_post?

Here is what I do:

library(dplyr)

test <- pbp %>%
  # the second filter condition (starting with "!") is truncated in the post
  filter(game_id == "2009_18_GB_ARI") %>%
  select(game_id, play_id, qtr, desc, total, spread_line, home_wp_post, wpa)

test <- test %>%
     mutate(wp_diff1 = abs(wpa))

test[1,'wp_diff2'] = 0

rownum <- nrow(test)

for (i in 2:rownum){
  # (the loop body is cut off in the post)
}

mrcaseb commented 4 years ago

Here is some more efficient code to reproduce this:

pbp %>%
  # the second filter condition (starting with "!") is truncated in the post
  filter(game_id == "2009_18_GB_ARI") %>%
  select(game_id, play_id, play_type, desc, home_team, posteam, wp, home_wp, wpa, home_wp_post) %>%
  mutate(
    wp_diff1 = abs(wpa),
    wp_diff2 = abs(home_wp_post - lag(home_wp_post))
  ) %>%
  filter(wp_diff2 != wp_diff1)


# A tibble: 4 x 12
  game_id   play_id play_type desc                                home_team posteam    wp home_wp      wpa home_wp_post wp_diff1 wp_diff2
  <chr>       <dbl> <chr>     <chr>                               <chr>     <chr>   <dbl>   <dbl>    <dbl>        <dbl>    <dbl>    <dbl>
1 2009_18_~    1416 no_play   (7:42) J.Kuhn right tackle to ARI ~ ARI       GB      0.153   0.847  0.00151        0.847  0.00151  0      
2 2009_18_~    1437 run       (7:02) A.Rodgers up the middle for~ ARI       GB      0.155   0.845 -0.00730        0.852  0.00730  0.00580
3 2009_18_~    4108 no_play   Timeout #1 by ARI at 01:46.         ARI       GB      0.639   0.361  0              0.361  0        0.278  
4 2009_18_~    4125 pass      (1:46) (Shotgun) K.Warner pass sho~ ARI       ARI     0.639   0.639  0.0216         0.661  0.0216   0.300  
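
As a side note on the comparison itself: `wp_diff2 != wp_diff1` compares doubles exactly, which can flag rows that differ only by floating-point noise. A minimal sketch of a tolerance-based check follows, using `dplyr::near()`. The data frame `toy` and its values are made up for illustration and are not taken from the game above; the tolerance of 1e-6 is also just an assumption.

```r
library(dplyr)

# Hypothetical toy data, NOT real play-by-play values
toy <- data.frame(
  play_id      = 1:4,
  wpa          = c(0.02, -0.01, 0.03, 0.00),
  home_wp_post = c(0.52, 0.51, 0.54, 0.60)
)

check <- toy %>%
  mutate(
    wp_diff1 = abs(wpa),
    wp_diff2 = abs(home_wp_post - lag(home_wp_post)),
    # flag a row only if the two differences disagree beyond the tolerance;
    # the first row has no previous play, so it is never flagged
    mismatch = !is.na(wp_diff2) & !near(wp_diff1, wp_diff2, tol = 1e-6)
  ) %>%
  filter(mismatch)

print(check)
```

In this toy data only the last play is flagged, because its `home_wp_post` jumps by 0.06 while its `wpa` is 0.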

home_wp_post of play 1416 is modified in this line, where home_wp_post is set to the previous value if both the current play and the previous play are "no_play"s.

Play 4108 appears to have home_wp and away_wp switched.

Any insights, @guga31bb?
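
To make the rule described above concrete, here is a rough sketch of a carry-forward for consecutive "no_play" rows. This is only an illustration of the behaviour as described in this comment, not the actual nflfastR code; the data frame `toy`, its values, and the column `home_wp_post_adj` are hypothetical.

```r
library(dplyr)

# Hypothetical toy data, NOT real play-by-play values
toy <- data.frame(
  play_id      = c(1, 2, 3, 4),
  play_type    = c("run", "no_play", "no_play", "pass"),
  home_wp_post = c(0.50, 0.52, 0.55, 0.60)
)

carried <- toy %>%
  mutate(
    # if this play and the previous one are both "no_play",
    # take the previous play's home_wp_post; otherwise keep the value
    home_wp_post_adj = if_else(
      play_type == "no_play" & lag(play_type, default = "") == "no_play",
      lag(home_wp_post),
      home_wp_post
    )
  )

print(carried)
```

Here only play 3 changes (from 0.55 to 0.52), since it is the only "no_play" that directly follows another "no_play". A difference computed against such an adjusted column would then no longer line up with `wpa` for that play.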

guga31bb commented 4 years ago

This is the equivalent part in nflscrapR, and I guess we must have modified it at some point, though I can't remember why. I personally have never used home_wp_post or WPA, so I'm surprised we bothered to modify nflscrapR here. There must have been some bug addressed at some point?

mrcaseb commented 4 years ago

Finally found the commit, but it's not really informative lol

It's lines 766-769 in that commit.

guga31bb commented 4 years ago

That commit was mostly me just copying and pasting nflscrapR's part. But it's weird, because it doesn't look identical to nflscrapR in that section.