nflverse / nflverse-pbp

builds play by play and player stats for nflverse/nflverse-data
Creative Commons Attribution 4.0 International

[BUG] missing 3 games from participation datasets #72

Closed: numbersinfigures closed this issue 1 year ago

numbersinfigures commented 1 year ago

Is there an existing issue for this?

Have you installed the latest development version of the package(s) in question?

What version of the package do you have?

1.3.0.5

Describe the bug

I noticed that three games are missing from the participation data (2020_09_PIT_DAL, 2021_06_GB_CHI, 2021_17_NYG_CHI), in both the include_pbp = TRUE and include_pbp = FALSE outputs, after setting up the dev version of nflverse. I found this by cross-checking against nflseedR::load_sharpe_games() and the standalone pbp datasets.
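The cross-check described above amounts to a set difference on game ids. Below is a minimal sketch of that idea in base R, not the exact code used for this report: `missing_game_ids()` is a hypothetical helper (not part of nflreadr or nflseedR), and the two id vectors are made-up stand-ins for the real schedule and participation data.

```r
# Hypothetical helper (not part of any nflverse package): return the game ids
# that appear in a reference schedule but not in the participation data.
missing_game_ids <- function(expected_ids, observed_ids) {
  sort(setdiff(unique(expected_ids), unique(observed_ids)))
}

# Toy stand-ins for the real data, made up for illustration only.
schedule_ids <- c("2020_09_PIT_DAL", "2021_06_GB_CHI",
                  "2021_17_NYG_CHI", "2021_01_DAL_TB")
# Participation data has one row per play, so game ids repeat.
participation_ids <- c("2021_01_DAL_TB", "2021_01_DAL_TB")

missing <- missing_game_ids(schedule_ids, participation_ids)
print(missing)
```

With real data, `expected_ids` could plausibly come from the `game_id` column of a schedule table and `observed_ids` from the `nflverse_game_id` column of the participation output. That pairing is an assumption for illustration.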

Reprex

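library(nflreadr)  # provides load_participation()
library(dplyr)     # provides %>% and filter()

# Load participation data for all available seasons, joined to play-by-play
ppt <- load_participation(
  seasons = TRUE,
  include_pbp = TRUE)
# Keep only the three games in question
ppt_pbp_chk <- ppt %>%
  filter(nflverse_game_id %in% c("2021_17_NYG_CHI", "2021_06_GB_CHI", "2020_09_PIT_DAL"))
ppt_pbp_chk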
-- nflverse play-by-play participation ------------------
i Data updated: 2022-09-21 02:20:33 PDT
# A tibble: 0 x 383
# ... with 383 variables: nflverse_game_id <chr>,
#   play_id <int>, possession_team <chr>,
#   offense_formation <chr>, offense_personnel <chr>,
#   defenders_in_box <int>, defense_personnel <chr>,
#   number_of_pass_rushers <int>, players_on_play <chr>,
#   offense_players <chr>, defense_players <chr>,
#   n_offense <int>, n_defense <int>, ...

Expected Behavior

Participation data should be returned for games "2021_17_NYG_CHI", "2021_06_GB_CHI", and "2020_09_PIT_DAL".

nflverse_sitrep

-- System Info ------------------------------------------
* R version 4.1.3 (2022-03-10)   * Running under: Windows 10 x64 (build 19043)
-- nflverse Packages ------------------------------------
* nflreadr (1.3.0.05)    * nflseedR (1.1.0)       * nflplotR (1.1.0)  
* nflfastR (4.4.0.9010)  * nfl4th   (1.0.2.9002)  * nflverse (1.0.1)  
-- nflverse Options -------------------------------------
No options set for nflreadr, nflfastR, nflseedR, nfl4th,
nflplotR, and nflverse
-- nflverse Dependencies --------------------------------
* askpass     (1.1)     * gtable      (0.3.0)    * progressr    (0.10.0)   
* bit         (4.0.4)   * hms         (1.1.1)    * proto        (1.0.0)    
* bit64       (4.0.5)   * httr        (1.4.2)    * purrr        (0.3.4)    
* cachem      (1.0.6)   * isoband     (0.2.5)    * R6           (2.5.1)    
* cli         (3.3.0)   * janitor     (2.1.0)    * rappdirs     (0.3.3)    
* clipr       (0.8.0)   * jsonlite    (1.8.0)    * RColorBrewer (1.1-3)    
* codetools   (0.2-18)  * labeling    (0.4.2)    * Rcpp         (1.0.8.3)  
* colorspace  (2.0-3)   * lattice     (0.20-45)  * readr        (2.1.2)    
* cpp11       (0.4.2)   * lifecycle   (1.0.1)    * rlang        (1.0.3)    
* crayon      (1.5.1)   * listenv     (0.8.0)    * rstudioapi   (0.13)     
* curl        (4.3.2)   * lubridate   (1.8.0)    * scales       (1.2.0)    
* data.table  (1.14.2)  * magick      (2.7.3)    * snakecase    (0.11.0)   
* digest      (0.6.29)  * magrittr    (2.0.3)    * stringi      (1.7.6)    
* dplyr       (1.0.9)   * MASS        (7.3-55)   * stringr      (1.4.0)    
* ellipsis    (0.3.2)   * Matrix      (1.4-0)    * sys          (3.4)      
* fansi       (1.0.3)   * memoise     (2.0.1)    * tibble       (3.1.7)    
* farver      (2.1.0)   * mgcv        (1.8-39)   * tidyr        (1.2.0)    
* fastmap     (1.1.0)   * mime        (0.12)     * tidyselect   (1.1.2)    
* fastrmodels (1.0.2)   * munsell     (0.5.0)    * tzdb         (0.2.0)    
* furrr       (0.2.3)   * nlme        (3.1-155)  * utf8         (1.2.2)    
* future      (1.24.0)  * openssl     (2.0.0)    * vctrs        (0.4.1)    
* generics    (0.1.3)   * parallelly  (1.30.0)   * viridisLite  (0.4.0)    
* ggplot2     (3.3.6)   * pillar      (1.7.0)    * vroom        (1.5.7)    
* globals     (0.15.1)  * pkgconfig   (2.0.3)    * withr        (2.5.0)    
* glue        (1.6.2)   * prettyunits (1.1.1)    * xgboost      (1.5.2.1)  
* gsubfn      (0.7)     * progress    (1.2.2)      
---------------------------------------------------------

Screenshots

No response

Additional context

No response

john-b-edwards commented 1 year ago

This has been resolved:

nflreadr::load_participation(
    seasons = TRUE,
    include_pbp = TRUE) |>
    dplyr::filter(nflverse_game_id %in% c("2021_17_NYG_CHI","2021_06_GB_CHI","2020_09_PIT_DAL")) |>
    dplyr::count(nflverse_game_id)
#> ── nflverse play-by-play participation ─────────────────────────────────────────
#> ℹ Data updated: 2023-09-12 15:08:01 PDT
#> # A tibble: 3 × 2
#>   nflverse_game_id     n
#>   <chr>            <int>
#> 1 2020_09_PIT_DAL    185
#> 2 2021_06_GB_CHI     171
#> 3 2021_17_NYG_CHI    173