nflverse / nflfastR

A Set of Functions to Efficiently Scrape NFL Play by Play Data
https://www.nflfastr.com/

calculate_player_stats_def() def_fumble_recovery_own bug #377

Open TheMathNinja opened 2 years ago

TheMathNinja commented 2 years ago
  1. Have you installed the latest development version of the package(s) in question?

    Yes

  2. Describe the bug

    calculate_player_stats_def() returns the wrong value in def_fumble_recovery_own variable.

  3. Reprex

library(dplyr)  # provides %>% and filter()

wilkins_bug <- nflfastR::calculate_player_stats_def(nflfastR::load_pbp(2021), weekly = TRUE) %>%
  filter(player_display_name == "Christian Wilkins") %>%
  filter(week == 11)
  4. Expected Behaviour

    Wilkins is correctly credited with 1 def_fumble_recovery_opp, 9 def_fumble_recovery_yards_opp, and 1 def_fumbles, but is incorrectly credited with 1 def_fumble_recovery_own and 9 def_fumble_recovery_yards_own (both should be 0).

  5. nflverse_sitrep()

```r
── System Info ───────────────────────────────────────────────────────────────
• R version 4.2.1 (2022-06-23 ucrt)   • Running under: Windows 10 x64 (build 22000)
── nflverse Packages ─────────────────────────────────────────────────────────
• nflreadr (1.3.0.05)       • nflseedR (1.1.0)          • nflplotR (1.1.0)
• nflfastR (4.4.0.9006)     • nfl4th (1.0.2.9001)       • nflverse (1.0.2)
── nflverse Options ──────────────────────────────────────────────────────────
No options set for nflreadr, nflfastR, nflseedR, nfl4th, nflplotR, and nflverse
── nflverse Dependencies ─────────────────────────────────────────────────────
• askpass (1.1)             • gtable (0.3.1)            • progressr (0.11.0)
• bit (4.0.4)               • hms (1.1.2)               • proto (1.0.0)
• bit64 (4.0.5)             • httr (1.4.4)              • purrr (0.3.4)
• cachem (1.0.6)            • isoband (0.2.5)           • R6 (2.5.1)
• cli (3.4.0)               • janitor (2.1.0)           • rappdirs (0.3.3)
• clipr (0.8.0)             • jsonlite (1.8.0)          • RColorBrewer (1.1-3)
• codetools (0.2-18)        • labeling (0.4.2)          • Rcpp (1.0.9)
• colorspace (2.0-3)        • lattice (0.20-45)         • readr (2.1.2)
• cpp11 (0.4.2)             • lifecycle (1.0.2)         • rlang (1.0.5)
• crayon (1.5.1)            • listenv (0.8.0)           • rstudioapi (0.14)
• curl (4.3.2)              • lubridate (1.8.0)         • scales (1.2.1)
• data.table (1.14.2)       • magick (2.7.3)            • snakecase (0.11.0)
• digest (0.6.29)           • magrittr (2.0.3)          • stringi (1.7.8)
• dplyr (1.0.10)            • MASS (7.3-57)             • stringr (1.4.1)
• ellipsis (0.3.2)          • Matrix (1.5-1)            • sys (3.4)
• fansi (1.0.3)             • memoise (2.0.1)           • tibble (3.1.8)
• farver (2.1.1)            • mgcv (1.8-40)             • tidyr (1.2.1)
• fastmap (1.1.0)           • mime (0.12)               • tidyselect (1.1.2)
• fastrmodels (1.0.2.9001)  • munsell (0.5.0)           • tzdb (0.3.0)
• furrr (0.3.1)             • nlme (3.1-157)            • utf8 (1.2.2)
• future (1.28.0)           • openssl (2.0.3)           • vctrs (0.4.1)
• generics (0.1.3)          • parallelly (1.32.1)       • viridisLite (0.4.1)
• ggplot2 (3.3.6)           • pillar (1.8.1)            • vroom (1.5.7)
• globals (0.16.1)          • pkgconfig (2.0.3)         • withr (2.5.0)
• glue (1.6.2)              • prettyunits (1.1.1)       • xgboost (1.6.0.1)
• gsubfn (0.7)              • progress (1.2.2)
──────────────────────────────────────────────────────────────────────────────
```

  6. Screenshots

  7. Additional context

The problem play is

| play_id | game_id |
|---|---|
| 1227 | 2021_11_MIA_NYJ |

where Wilkins recovers the other team's fumble, fumbles the ball, and then his teammate (Jevon Holland) recovers his fumble. There was an own fumble recovery by the defense, but it is incorrectly credited to Wilkins (with yards) and also correctly credited to Holland (with no yards) on this play.
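To make the expected attribution concrete, here is a minimal sketch in base R on toy data. It does not use the real nflfastR columns: the data frame `recoveries` and its column names are hypothetical, the team abbreviations are taken from the game_id, and the assumption is that both Wilkins and Holland are on MIA. The yards (9 and 0) are the values described above.

```r
# Toy reconstruction of the two recoveries on play 1227.
# These are NOT real nflfastR columns; all names here are hypothetical.
recoveries <- data.frame(
  recoverer      = c("Christian Wilkins", "Jevon Holland"),
  recoverer_team = c("MIA", "MIA"),
  fumbler_team   = c("NYJ", "MIA"),  # Holland recovers his teammate's fumble
  yards          = c(9, 0),
  stringsAsFactors = FALSE
)

# A recovery counts as "own" when the recovering team is also the fumbling team.
own <- recoveries$recoverer_team == recoveries$fumbler_team

# One row per recovery; on this play each player has exactly one.
summary_tbl <- data.frame(
  player                        = recoveries$recoverer,
  def_fumble_recovery_opp       = as.integer(!own),
  def_fumble_recovery_yards_opp = ifelse(own, 0, recoveries$yards),
  def_fumble_recovery_own       = as.integer(own),
  def_fumble_recovery_yards_own = ifelse(own, recoveries$yards, 0),
  stringsAsFactors = FALSE
)
print(summary_tbl)
```

Under these assumptions, the Wilkins row has 0 in both "own" columns and the Holland row carries the single own recovery with 0 yards, which is the behaviour expected in this report.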

mrcaseb commented 1 year ago

This issue stems from the fact that the current nflfastR dataset cannot differentiate between recoveries of a fumble by a player's own team and recoveries of a fumble by the opposing team. The only solution is to differentiate own and opp in stat ids 55-58 and 59-62 respectively.
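As a rough illustration of that idea, the sketch below labels a stat id as an own or opponent recovery using only the two ranges mentioned above. The helper `recovery_type()` and the `play_stats` rows are hypothetical, and the specific ids assigned to each player (59 and 55) are assumptions chosen from within those ranges, not values read from the data.

```r
# Stat id ranges as described in the comment above.
own_recovery_ids <- 55:58
opp_recovery_ids <- 59:62

# Hypothetical helper: label a stat id by the kind of fumble recovery it encodes.
recovery_type <- function(stat_id) {
  ifelse(stat_id %in% own_recovery_ids, "own",
         ifelse(stat_id %in% opp_recovery_ids, "opp", NA_character_))
}

# Hypothetical player-level stat rows for play 1227; the ids are assumed.
play_stats <- data.frame(
  player  = c("Christian Wilkins", "Jevon Holland"),
  stat_id = c(59, 55),
  stringsAsFactors = FALSE
)
play_stats$recovery_type <- recovery_type(play_stats$stat_id)
print(play_stats)
```

If the own/opp distinction were carried through per player like this, Wilkins' recovery and Holland's recovery would no longer be indistinguishable in the derived stats.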

To make that possible, we would potentially have to add 8 new variables to the dataset and probably (though not necessarily) rename all current fumble recovery variables.

Adding variables is problematic because

For now, we have decided to live with this ambiguity. Next season, however, will be the first time that users can get their hands on defensive player stats. If there are any further problems or if more users encounter this problem, we will add the corresponding variables next offseason.