danmorse314 / hockeyR

Collect and Clean Hockey Stats
https://hockeyr.netlify.app/

March 2024 NHL API Update Impacting scrape_game #16

Open RealZLock opened 6 months ago

RealZLock commented 6 months ago

Hi, a few other scrapers on Twitter have discussed this: this week an update to the NHL API moved some things, such as the play-by-play period number and some boxscore information. I'm getting "WARNING: src/learner.cc:1517: Empty dataset at worker: 0" when using the scrape_game function, and the result has a ton of blank columns, the most important being game_seconds and period in the play-by-play data.

thanks!
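For anyone who wants to confirm the same symptom, here is a minimal sketch of one way to list the columns that came back entirely blank. The `pbp` data frame is a made-up stand-in for the output of scrape_game (its columns and values are assumptions for illustration, not real output).

```r
# Hypothetical stand-in for the data frame returned by scrape_game();
# the column names and values here are made up for illustration.
pbp <- data.frame(
  event_type   = c("FACEOFF", "SHOT"),
  period       = c(NA_integer_, NA_integer_),
  game_seconds = c(NA_real_, NA_real_)
)

# A column is "blank" if it has zero non-missing values.
blank_cols <- names(pbp)[colSums(!is.na(pbp)) == 0]
print(blank_cols)
```

On a real scrape, the same one-liner can be run on the returned data frame to see which columns the API change emptied out.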

RealZLock commented 6 months ago

This adjustment to the scrape_game function fixes it, as far as I can tell:

# unnest game plays
plays <- site$plays %>%
  dplyr::tibble() %>%
  tidyr::unnest_wider(1) %>%
  dplyr::select(-c(typeCode)) %>%
  tidyr::unnest_wider(details) %>%
  tidyr::unnest_wider(periodDescriptor) %>%
  dplyr::rename(period = number)
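To see what that pipeline does, here is a minimal sketch that runs the same steps on a small made-up list. `toy_plays` is a hypothetical stand-in for `site$plays`. Its field names follow the snippet above, but the extra fields (`eventId`, `timeInPeriod`, `periodType`, `xCoord`, `yCoord`) and all values are assumptions, not taken from a real API response.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for site$plays: each play carries its period number
# nested inside periodDescriptor (values are made up for illustration).
toy_plays <- list(
  list(
    eventId = 1L, typeCode = 502L, timeInPeriod = "00:00",
    periodDescriptor = list(number = 1L, periodType = "REG"),
    details = list(xCoord = 0L, yCoord = 0L)
  ),
  list(
    eventId = 2L, typeCode = 506L, timeInPeriod = "00:35",
    periodDescriptor = list(number = 1L, periodType = "REG"),
    details = list(xCoord = 55L, yCoord = -12L)
  )
)

plays <- toy_plays %>%
  dplyr::tibble() %>%                        # one list-column, one row per play
  tidyr::unnest_wider(1) %>%                 # spread each play's fields into columns
  dplyr::select(-c(typeCode)) %>%            # drop the top-level typeCode
  tidyr::unnest_wider(details) %>%           # flatten the nested details fields
  tidyr::unnest_wider(periodDescriptor) %>%  # exposes number and periodType
  dplyr::rename(period = number)             # restore the expected column name

print(names(plays))
```

The key step is the last pair of calls: unnesting `periodDescriptor` produces a `number` column, which is then renamed to `period` so that downstream code expecting `period` can find it again.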

Saarialho commented 6 months ago

This might be related: I see 53 column names that are not in the 2023 data and 28 column names from 2023 that are not in the 2024 data. The xG variable, for example, is missing entirely from the pbp data updated on Mar 9, 2024.
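A comparison like this can be reproduced with `setdiff()` on the column names. This is a minimal sketch: `pbp_2023` and `pbp_2024` are hypothetical stand-ins for the two seasons' play-by-play data, and their columns are made up for illustration.

```r
# Hypothetical stand-ins for the two seasons of play-by-play data;
# the column names and values are made up for illustration.
pbp_2023 <- data.frame(event_id = 1:2, period = c(1L, 1L), xg = c(0.05, 0.12))
pbp_2024 <- data.frame(event_id = 1:2, period_number = c(1L, 1L), sort_order = c(10L, 11L))

# Columns present in one season's data but not the other
only_in_2024 <- setdiff(names(pbp_2024), names(pbp_2023))
only_in_2023 <- setdiff(names(pbp_2023), names(pbp_2024))

print(only_in_2024)
print(only_in_2023)
```

Running `length()` on each result gives the counts of mismatched columns in each direction.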