nflverse / nflfastR

A Set of Functions to Efficiently Scrape NFL Play by Play Data
https://www.nflfastr.com/

Series not resetting on punt turnover #144

Closed ajreinhard closed 3 years ago

ajreinhard commented 3 years ago

I noticed that the series field does not appear to reset when there is a turnover on a punt. An example is below:

library(tidyverse)

pbp_df <- readRDS(url('https://raw.githubusercontent.com/guga31bb/nflfastR-data/master/data/play_by_play_2020.rds'))

# Giants recover muffed punt, but play_id = 272 does not become series = 4
pbp_df %>% 
  filter(game_id == '2020_01_PIT_NYG' & series == 3) %>% 
  select(play_id, desc)

# other series that may include 2pt attempts or fail to reset on punts;
# unusually long series sort to the top
pbp_df %>% 
  filter(play_type != 'no_play') %>%
  group_by(game_id, series) %>% 
  summarise(
    # na.rm = TRUE so a single missing flag does not turn a whole series into NA
    plays = sum(play, na.rm = TRUE),
    turnover = sum(fumble_lost + interception, na.rm = TRUE),
    penalties = sum(penalty, na.rm = TRUE),
    two_pt_att = sum(two_point_attempt, na.rm = TRUE),
    .groups = 'drop'
  ) %>% 
  arrange(desc(plays))

EDIT: I read through some of issue #130, and it looks like this may already be referenced there. Disregard if so, sorry!

guga31bb commented 3 years ago

I was actually curious about this and never looked into it. Is there some "official" definition of drive and series, and if so, do they restart after a muffed punt?

ajreinhard commented 3 years ago

It looks like there actually is!

This situation is referenced explicitly in Rule 7 of the NFL rulebook (see 7-3-1-d). It looks like a scrimmage kick (punt) recovered by the kicking team starts a new series. This is consistent with 3-9-2, which refers to a series as "four consecutive charged scrimmage downs" (3-9-4). A down that includes a "change of possession" is not a charged down.

Changes of possession are covered in 3-36-2. Teams can switch between offense and defense mid-down, but their designations as "Team A" (the team that put the ball in play) and "Team B" do not change.

My interpretation of a drive would be the consecutive scrimmage plays in which a team is "Team A". Drives are not otherwise mentioned.

In the case of the Giants-Steelers game above, the Giants would be starting a new series of downs, but I'd argue that it is not a new drive. That said, I'm more interested in making the fields make sense for data analysis than in following the rulebook to the letter.
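
To make the series/drive distinction concrete, here is a minimal base R sketch of how the two counters could be derived under the interpretation above. It is not how nflfastR computes these fields. The toy data frame, its column names (team_a, first_down, kick_recovered_by_kicking_team), and the lag1 helper are all hypothetical, and the sketch ignores scores, penalties, and the end of a half.

```r
# Toy single-game example; every column here is hypothetical, not an nflfastR field.
plays <- data.frame(
  play_id = 1:7,
  # team that put the ball in play ("Team A") on each down
  team_a = c("NYG", "NYG", "NYG", "NYG", "NYG", "NYG", "PIT"),
  # did the down reach the line to gain?
  first_down = c(FALSE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE),
  # punt touched by the receivers and recovered by the kicking team (play 4)
  kick_recovered_by_kicking_team = c(FALSE, FALSE, FALSE, TRUE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper: shift a vector down by one, filling the first slot.
lag1 <- function(x, default) c(default, head(x, -1))

# A new drive starts only when Team A differs from the previous play.
new_team_a <- plays$team_a != lag1(plays$team_a, plays$team_a[1])

# A new series starts on a new drive, after a first down,
# or after the kicking team recovers its own punt.
starts_series <- new_team_a |
  lag1(plays$first_down, FALSE) |
  lag1(plays$kick_recovered_by_kicking_team, FALSE)

plays$drive  <- 1L + cumsum(new_team_a)
plays$series <- 1L + cumsum(starts_series)

print(plays[, c("play_id", "team_a", "drive", "series")])
```

In this toy game the recovered punt on play 4 starts series 3 on play 5 while the drive counter stays at 1, which matches the reading that the Giants begin a new series but not a new drive. Real play-by-play data would also need the counters reset per game_id.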

guga31bb commented 3 years ago

Thank you, this is extremely helpful and provides more motivation to actually fix it!