jthomasmock / espnscrapeR

Scrapes Or Collects NFL Data From ESPN
https://jthomasmock.github.io/espnscrapeR/

2022 Pass Win Rates Update #19

Closed by becausejustyn 1 year ago

becausejustyn commented 1 year ago

The 2022 season currently does not work with the latest version of the package, so I added it here.

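# Attach the packages whose functions are called unqualified below.
library(dplyr)
library(stringr)
library(tidyr)

# Scrape ESPN's win-rate leaderboard article for a single season.
scrape_espn_win_rate <- function(season = 2022) {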
  if (!(as.numeric(season) %in% c(2019:2022))) {
    stop("Data available for 2019-2022")
  }
  pbwr_2022 <- "https://www.espn.com.au/nfl/story/_/id/34536376/2022-nfl-pass-rushing-run-stopping-blocking-leaderboard-win-rate-rankings"
  pbwr_2021 <- "https://www.espn.com/nfl/story/_/id/32176833/2021-nfl-pass-rushing-run-stopping-blocking-leaderboard-win-rate-rankings"
  pbwr_2020 <- "https://www.espn.com/nfl/story/_/id/29939464/2020-nfl-pass-rushing-run-stopping-blocking-leaderboard-win-rate-rankings"
  pbwr_2019 <- "https://www.espn.com/nfl/story/_/id/27584726/nfl-pass-blocking-pass-rushing-rankings-2019-pbwr-prwr-leaderboard#prwrteam"
  # Kept for reference only: 2018 is rejected by the season check above.
  pbwr_2018 <- "https://www.espn.com/nfl/story/_/id/25074144/nfl-pass-blocking-pass-rushing-stats-final-leaderboard-pass-block-win-rate-pass-rush-win-rate"
  stats_in <- c(
    "Pass Rush Win Rate", "Run Stop Win Rate",
    "Pass Block Win Rate", "Run Block Win Rate"
  )
  stat_2019 <- c("Pass Rush Win Rate", "Pass Block Win Rate")
  # Pick the article URL for the requested season and download the page.
  raw_html <- rvest::read_html(case_when(
    season == 2019 ~ pbwr_2019,
    season == 2020 ~ pbwr_2020,
    season == 2021 ~ pbwr_2021,
    season == 2022 ~ pbwr_2022
  ))
  date_updated <- raw_html %>%
    rvest::html_node("#article-feed > article:nth-child(1) > div > div.article-body > div.article-meta > span > span") %>%
    rvest::html_text()
  raw_text <- raw_html %>%
    rvest::html_nodes("#article-feed > article:nth-child(1) > div > div.article-body > p") %>%
    rvest::html_text()
  tibble::enframe(raw_text) %>%
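    # Keep only paragraphs that hold a ranked list; fixed() matches the
    # literal "1. " instead of treating "." as a regex wildcard.
    filter(str_detect(value, fixed("1. "))) %>%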
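    # Label each leaderboard paragraph; 2019 has only the two pass stats.
    mutate(name = if_else(season == 2019, list(stat_2019), list(stats_in))[[1]]) %>%
    # Split each paragraph into one row per "rank. team, pct%" line.
    mutate(value = str_split(value, "\n")) %>%
    unnest_longer(value) %>%
    separate(value, into = c("rank", "team", "win_pct"), sep = "\\. |, ") %>%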
    mutate(
      rank = as.integer(rank),
      win_pct = str_remove(win_pct, "%"), win_pct = as.double(win_pct),
      date_updated = date_updated, season = season
    ) %>%
    rename(
      stat = name,
      stat_rank = rank
    )
}
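
For anyone who wants to check the parsing step without hitting ESPN, here is a minimal offline sketch of the same pipeline. It assumes the article paragraphs arrive as strings shaped like "1. Team, 55%" separated by newlines; the `raw_text` values, team names, and percentages below are made up for illustration and are not real ESPN data.

```r
library(dplyr)
library(stringr)
library(tidyr)

# Hypothetical stand-in for the paragraph text returned by rvest::html_text().
raw_text <- c(
  "Intro paragraph without a leaderboard.",
  "1. Team A, 55%\n2. Team B, 51%",
  "1. Team C, 48%\n2. Team D, 45%"
)
# Two stat labels, one per leaderboard paragraph (as in the 2019 layout).
stat_names <- c("Pass Rush Win Rate", "Pass Block Win Rate")

parsed <- tibble::enframe(raw_text) %>%
  # Keep only paragraphs that contain a ranked list.
  filter(str_detect(value, fixed("1. "))) %>%
  mutate(name = stat_names) %>%
  # One row per "rank. team, pct%" line.
  mutate(value = str_split(value, "\n")) %>%
  unnest_longer(value) %>%
  separate(value, into = c("rank", "team", "win_pct"), sep = "\\. |, ") %>%
  mutate(
    rank = as.integer(rank),
    win_pct = as.double(str_remove(win_pct, "%"))
  ) %>%
  rename(stat = name, stat_rank = rank)

print(parsed)
```

Note that the `mutate(name = ...)` step only works when the number of matching paragraphs equals the number of stat labels, so a change in the article layout would likely surface as an error at that line.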
jthomasmock commented 1 year ago

I think this was solved in a fix I pushed yesterday.

https://github.com/jthomasmock/espnscrapeR/commit/697bb7512e98e58308c4d49eb26806c99f6b21c4

Can you try with the latest release?

becausejustyn commented 1 year ago

Tried on a different computer and it now works. Thanks for everything you do! :)