nflverse / nflreadr

Efficiently download nflverse data
https://nflreadr.nflverse.com/

[BUG] <Missing Rows in load_player_stats()> #204

Closed TheMathNinja closed 1 year ago

TheMathNinja commented 1 year ago

Is there an existing issue for this?

Have you installed the latest development version of the package(s) in question?

What version of the package do you have?

1.3.2.9

Describe the bug

load_player_stats() omits rows for weeks in which a player plays but records no relevant box score statistics.

Reprex

nflreadr::load_player_stats(
  seasons = 2021:2022,
  stat_type = "defense"
)

Expected Behavior

I expect this to return a row for every week a player played football. But this function entirely omits weeks in which a player plays but records no box score statistics. Some top DT examples in 2022: Jonathan Allen (Week 7), DeForest Buckner (Week 4), and Chris Jones (Week 15).

I noticed this is also an issue for Offense. Tee Higgins doesn't get a row in Week 5 or Week 14 of 2022 even though he played both weeks.

This causes issues when calculating something like Receptions Per Game across a season (the Games denominator is wrong).
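
A toy illustration of the denominator problem may help here. The numbers below are made up (they are not real stats for any player), and the `weeks_played` vector is assumed to come from some other source that knows about every appearance.

```r
# Hypothetical weekly rows for one receiver; all numbers are invented.
# Week 5 is absent because no stat was recorded, although the player
# is assumed to have appeared in that game.
stats <- data.frame(
  week = c(1, 2, 3, 4, 6),
  receptions = c(6, 4, 7, 5, 8)
)

# Assumed to be known from another source; includes week 5.
weeks_played <- 1:6

# Counting rows as games overstates the per-game rate.
rec_per_game_rows <- sum(stats$receptions) / nrow(stats)

# Counting actual appearances gives the intended rate.
rec_per_game_true <- sum(stats$receptions) / length(weeks_played)

print(c(rows = rec_per_game_rows, true = rec_per_game_true))
```

With five stat rows but six games played, the row-based rate divides by 5 instead of 6.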

nflverse_sitrep

── System Info ────────────────────────────────────────────────────────────
• R version 4.3.1 (2023-06-16 ucrt)
• Running under: Windows 11 x64 (build 22621)
── Package Status ─────────────────────────────────────────────────────────
• nflreadr (1.3.2.09)    • nflseedR (1.2.0)  • nflplotR (1.1.0.9006)  
• nflfastR (4.5.1.9004)  • nfl4th   (1.0.4)  • nflverse (1.0.3)       
── Package Options ────────────────────────────────────────────────────────
No options set for above packages
── Package Dependencies ───────────────────────────────────────────────────
• askpass     (1.1)     • hms        (1.1.3)    • proto        (1.0.0)    
• backports   (1.4.1)   • httr       (1.4.7)    • purrr        (1.0.2)    
• cachem      (1.0.8)   • isoband    (0.2.7)    • R6           (2.5.1)    
• cli         (3.6.1)   • janitor    (2.2.0)    • rappdirs     (0.3.3)    
• codetools   (0.2-19)  • jsonlite   (1.8.7)    • RColorBrewer (1.1-3)    
• colorspace  (2.1-0)   • labeling   (0.4.2)    • Rcpp         (1.0.11)   
• cpp11       (0.4.6)   • lattice    (0.21-8)   • rlang        (1.1.1)    
• crayon      (1.5.2)   • lifecycle  (1.0.3)    • rstudioapi   (0.15.0)   
• curl        (5.0.2)   • listenv    (0.9.0)    • scales       (1.2.1)    
• data.table  (1.14.8)  • lubridate  (1.9.2)    • snakecase    (0.11.1)   
• digest      (0.6.33)  • magick     (2.7.4)    • stringi      (1.7.12)   
• dplyr       (1.1.2)   • magrittr   (2.0.3)    • stringr      (1.5.0)    
• fansi       (1.0.4)   • MASS       (7.3-60)   • sys          (3.4.2)    
• farver      (2.1.1)   • Matrix     (1.6-1)    • tibble       (3.2.1)    
• fastmap     (1.1.1)   • memoise    (2.0.1)    • tidyr        (1.3.0)    
• fastrmodels (1.0.2)   • mgcv       (1.8-42)   • tidyselect   (1.2.0)    
• furrr       (0.3.1)   • mime       (0.12)     • timechange   (0.2.0)    
• future      (1.33.0)  • munsell    (0.5.0)    • utf8         (1.2.3)    
• generics    (0.1.3)   • nlme       (3.1-162)  • vctrs        (0.6.3)    
• ggplot2     (3.4.2)   • openssl    (2.1.0)    • viridisLite  (0.4.2)    
• globals     (0.16.2)  • parallelly (1.36.0)   • withr        (2.5.0)    
• glue        (1.6.2)   • pillar     (1.9.0)    • xgboost      (1.7.5.1)  
• gsubfn      (0.7)     • pkgconfig  (2.0.3)      
• gtable      (0.3.3)   • progressr  (0.13.0)     
───────────────────────────────────────────────────────────────────────────

Screenshots

No response

Additional context

I'm wondering if joining participation data might help on this (include snaps as a variable for offense and defense?) but I'm guessing that creates more dependencies which might not be desirable.

mrcaseb commented 1 year ago

This is not a bug: nflfastR can only use play-by-play (pbp) data to compute stats. If a player doesn't record any stat in the pbp data, there is no way to count their games correctly.

I suggest using participation data or maybe PFR stats to count players' games more precisely.
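
One way this suggestion could look in practice is sketched below in base R on toy data. It is only a sketch under assumptions: the helper `count_games()` and the column names `player_id`, `week`, `snaps`, and `tackles` are hypothetical and would need to be mapped onto whatever the real tables contain. In real use, the appearance table might come from `nflreadr::load_snap_counts()` (PFR snap counts) or `nflreadr::load_participation()`.

```r
# Hypothetical helper: count games from an appearance table rather than
# from the stat rows, so that stat-less weeks still count as games.
count_games <- function(stats, snaps) {
  # Keep only player-weeks with at least one snap.
  played <- snaps[snaps$snaps > 0, c("player_id", "week")]

  # Left join from appearances, so weeks without recorded stats are kept.
  merged <- merge(played, stats, by = c("player_id", "week"), all.x = TRUE)

  # A missing stat row means nothing was recorded, so treat it as zero.
  merged$tackles[is.na(merged$tackles)] <- 0
  merged$games <- 1

  out <- aggregate(cbind(games, tackles) ~ player_id, data = merged, FUN = sum)
  out$tackles_per_game <- out$tackles / out$games
  out
}

# Toy inputs with invented numbers.
stats <- data.frame(
  player_id = c("A", "A", "B"),
  week = c(1, 3, 1),
  tackles = c(2, 4, 5)
)
snaps <- data.frame(
  player_id = c("A", "A", "A", "B", "B"),
  week = c(1, 2, 3, 1, 2),
  snaps = c(40, 35, 50, 30, 0)
)

res <- count_games(stats, snaps)
print(res)
```

Here player A gets credit for three games even though only two stat rows exist, while player B's zero-snap week is not counted as a game.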