nflverse / nflreadr

Efficiently download nflverse data
https://nflreadr.nflverse.com/
Other
62 stars 12 forks source link

[BUG] Michael Carter mapping bug #251

Closed TheMathNinja closed 2 months ago

TheMathNinja commented 3 months ago

Is there an existing issue for this?

Have you installed the latest development version of the package(s) in question?

If this is a data issue, have you tried clearing your nflverse cache?

I have cleared my nflverse cache and the issue persists.

What version of the package do you have?

‘1.4.1.0’

Describe the bug

In load_ff_playerids() there is an error in Michael Carter mappings. At the very least, I know MFL and ESPN maps are reversed but I have no idea how far it goes for the other maps and which are which.

Arizona RB Michael Carter should be MFL 15258 and ESPN 4240657 Jets CB Michael Carter should be MFL 15432 and ESPN 4240456

These are currently reversed in load_ff_playerids().

Reprex

load_ff_playerids()

Expected Behavior

I was mapping and got weird reversals on these two rather than a match.

nflverse_sitrep

── System Info ────────────────────────────────────────────────────────────────
• R version 4.4.1 (2024-06-14 ucrt) • Running under: Windows 11 x64 (build 22631)
── Package Status ─────────────────────────────────────────────────────────────
   package  installed  cran        dev   behind
1   nfl4th 1.0.4.9002 1.0.4 1.0.4.9002         
2 nflfastR 4.6.1.9013 4.6.1 4.6.1.9013         
3 nflplotR 1.3.1.9001 1.4.0      1.4.0 cran;dev
4 nflreadr   1.4.1.00 1.4.1   1.4.1.00         
5 nflseedR 1.2.0.9001 1.2.0 1.2.0.9001         
6 nflverse      1.0.3 1.0.3      1.0.3         
── Package Options ────────────────────────────────────────────────────────────
• No options set for above packages
── Package Dependencies ───────────────────────────────────────────────────────
• askpass     (1.2.0)    • htmlwidgets  (1.6.4)   • snakecase   (0.11.1)    
• backports   (1.5.0)    • httr         (1.4.7)   • stringi     (1.8.4)     
• base64enc   (0.1-3)    • isoband      (0.2.7)   • stringr     (1.5.1)     
• bigD        (0.2.0)    • janitor      (2.2.0)   • sys         (3.4.2)     
• bitops      (1.0-7)    • jquerylib    (0.1.4)   • tibble      (3.2.1)     
• bslib       (0.7.0)    • jsonlite     (1.8.8)   • tidyr       (1.3.1)     
• cachem      (1.1.0)    • juicyjuice   (0.1.0)   • tidyselect  (1.2.1)     
• cli         (3.6.3)    • knitr        (1.47)    • timechange  (0.3.0)     
• colorspace  (2.1-0)    • labeling     (0.4.3)   • tinytex     (0.51)      
• commonmark  (1.9.1)    • lifecycle    (1.0.4)   • utf8        (1.2.4)     
• cpp11       (0.4.7)    • listenv      (0.9.1)   • V8          (4.4.2)     
• crayon      (1.5.3)    • lubridate    (1.9.3)   • vctrs       (0.6.5)     
• curl        (5.2.1)    • magick       (2.8.3)   • viridisLite (0.4.2)     
• data.table  (1.15.4)   • magrittr     (2.0.3)   • withr       (3.0.1)     
• digest      (0.6.36)   • markdown     (1.13)    • xfun        (0.45)      
• dplyr       (1.1.4)    • memoise      (2.0.1)   • xgboost     (1.7.7.1)   
• evaluate    (0.24.0)   • mime         (0.12)    • xml2        (1.3.6)     
• fansi       (1.0.6)    • munsell      (0.5.1)   • yaml        (2.3.8)     
• farver      (2.1.2)    • openssl      (2.2.0)   • codetools   (0.2-20)    
• fastmap     (1.2.0)    • parallelly   (1.37.1)  • compiler    (4.4.1)     
• fastrmodels (1.0.2)    • pillar       (1.9.0)   • graphics    (4.4.1)     
• fontawesome (0.5.2)    • pkgconfig    (2.0.3)   • grDevices   (4.4.1)     
• fs          (1.6.4)    • progressr    (0.14.0)  • grid        (4.4.1)     
• furrr       (0.3.1)    • proto        (1.0.0)   • lattice     (0.22-6)    
• future      (1.33.2)   • purrr        (1.0.2)   • MASS        (7.3-60.2)  
• generics    (0.1.3)    • R6           (2.5.1)   • Matrix      (1.7-0)     
• ggpath      (1.0.1)    • rappdirs     (0.3.3)   • methods     (4.4.1)     
• ggplot2     (3.5.1)    • RColorBrewer (1.1-3)   • mgcv        (1.9-1)     
• globals     (0.16.3)   • Rcpp         (1.0.12)  • nlme        (3.1-164)   
• glue        (1.7.0)    • reactable    (0.4.4)   • parallel    (4.4.1)     
• gsubfn      (0.7)      • reactR       (0.6.0)   • splines     (4.4.1)     
• gt          (0.10.1)   • rlang        (1.1.4)   • stats       (4.4.1)     
• gtable      (0.3.5)    • rmarkdown    (2.27)    • tools       (4.4.1)     
• highr       (0.11)     • rstudioapi   (0.16.0)  • utils       (4.4.1)     
• hms         (1.1.3)    • sass         (0.4.9)     
• htmltools   (0.5.8.1)  • scales       (1.3.0)     
───────────────────────────────────────────────────────────────────────────────

Screenshots

No response

Additional context

No response

TheMathNinja commented 3 months ago

I caught another espn_id issue in load_ff_playerids(): Ramiz Ahmed's espn_id should be 4266790, not 4243315. He has been wrongly assigned Salvon Ahmed's espn_id. Perhaps this can be caught with an additional check on the creation of this df will be to check for any duplicates in ID in a given platform's numbers.

I'd be glad to submit a PR for this, but I think this lives in dynastyprocess's JSON automated pipeline, so I'm guessing this isn't a simple csv fix like it was in the past with missing_ids.csv.

tanho63 commented 3 months ago

Both of these should be resolved in dynastyprocess/data's missing_ids.json: https://github.com/dynastyprocess/data/blob/master/files/missing_ids.json

You can manually add new entries at the bottom of the file, ensuring they always have mfl_id which serves as the primary key.