JaseZiv / worldfootballR

A wrapper for extracting world football (soccer) data from FBref, Transfermarkt, Understat
https://jaseziv.github.io/worldfootballR/

Unable to extract from Everton vs Chelsea #157

Closed ShironM2302 closed 1 year ago

ShironM2302 commented 1 year ago

Unable to extract from Everton vs Chelsea

Code: advanced_match_stats1 <- get_advanced_match_stats(match_url = "https://fbref.com/en/matches/3a917cee/Everton-Chelsea-August-6-2022-Premier-League", stat_type = "possession", team_or_player = "player")

Error in cbind(League_URL, Match_Date, Matchweek, Home_Team, Home_Formation, : object 'Home_Goals' not found

Thanks again for all the great work on this project!

jackgovier commented 1 year ago

Not certain if it's the same error, but believe it is.

I tried to run: get_match_summary( "https://fbref.com/en/matches/2daea068/West-Ham-United-Manchester-United-September-19-2021-Premier-League")

And I get a blank data frame (the same happens for every other PL match I tried). I stepped through the function and it looks like the issue is the 'seasons' section at the end. League URL is being returned as https://fbref.com/en/comps/9/9/9-Stats rather than any of the URLs in the CSV, so every row is filtered out.

Running trace(get_match_summary, edit=TRUE) and removing these lines:

    seasons <- read.csv("https://raw.githubusercontent.com/JaseZiv/worldfootballR_data/master/raw-data/all_leages_and_cups/all_competitions.csv", stringsAsFactors = F)
    seasons <- seasons %>%
      dplyr::filter(.data$seasons_urls %in% all_events_df$League_URL) %>%
      dplyr::select(League = .data$competition_name, Gender = .data$gender, Country = .data$country, Season = .data$seasons, League_URL = .data$seasons_urls)
    all_events_df <- seasons %>%
      dplyr::left_join(all_events_df, by = "League_URL") %>%
      dplyr::select(-.data$League_URL) %>%
      dplyr::distinct(.keep_all = T)

Gave me a short term fix.
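
To illustrate why those lines produce an empty result, here is a minimal sketch using toy data frames in place of the real CSV and the scraped events. The row values are illustrative assumptions, and base R's merge() stands in for dplyr::left_join() so the snippet runs without extra packages. Because the scraped League_URL matches nothing in seasons_urls, the filtered lookup table has zero rows, and a left join that starts from an empty table is also empty.

```r
# Toy stand-in for all_competitions.csv (values are illustrative only)
seasons <- data.frame(
  competition_name = "Premier League",
  seasons_urls = "https://fbref.com/en/comps/9/Premier-League-Stats",
  stringsAsFactors = FALSE
)

# Toy stand-in for the scraped match events, carrying the malformed league URL
all_events_df <- data.frame(
  League_URL = "https://fbref.com/en/comps/9/9/9-Stats",
  Event = "Goal",
  stringsAsFactors = FALSE
)

# Same filter as in the function: keep lookup rows whose URL appears in the events
matched <- seasons[seasons$seasons_urls %in% all_events_df$League_URL, , drop = FALSE]
names(matched)[names(matched) == "seasons_urls"] <- "League_URL"

# Left join with the (now empty) lookup table on the left-hand side
result <- merge(matched, all_events_df, by = "League_URL", all.x = TRUE)

nrow(matched)  # no lookup rows survive the filter
nrow(result)   # so the joined output is empty too
```

Removing the join, as in the workaround above, keeps the event rows but drops the league metadata columns.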

JaseZiv commented 1 year ago

Thanks all for raising this issue...

It's going to require some investigating on how to approach a fix... The issue stems from the league link on each match page (see image below) being inconsistent with the actual league URL. This becomes a problem because the function joins league information to the match data using this link, and because the link is inconsistent, the join can't find a matching row:

[Image: screenshot of the league link on a match page]

The URL should be https://fbref.com/en/comps/9/Premier-League-Stats.

Will keep you posted on progress
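
As a rough illustration of the mismatch described above, the sketch below compares the two URLs from this thread. The helper extract_comp_id() is hypothetical and is not part of worldfootballR. It only shows that the numeric competition id is still recoverable from the malformed link, which is one possible basis for a more tolerant comparison. It is not the fix adopted in the package, and the id alone may not be enough to distinguish seasons.

```r
# Hypothetical helper (not a worldfootballR function): extract the numeric
# competition id that follows "/comps/" in an FBref league URL.
extract_comp_id <- function(url) {
  sub("^.*/comps/([0-9]+)/.*$", "\\1", url)
}

scraped_url  <- "https://fbref.com/en/comps/9/9/9-Stats"             # link found on the match page
expected_url <- "https://fbref.com/en/comps/9/Premier-League-Stats"  # URL the lookup table expects

# An exact string comparison fails, which is why the join finds nothing
urls_equal <- scraped_url == expected_url

# Both URLs still carry the same competition id
ids_equal <- extract_comp_id(scraped_url) == extract_comp_id(expected_url)

urls_equal  # FALSE
ids_equal   # TRUE
```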

JaseZiv commented 1 year ago

This issue has been somewhat resolved now with the most recent dev version release (0.5.11.1000).

The get_ functions have now been replaced with fb_.

Additionally, the following functions no longer return league/season metadata, i.e. the columns League, Gender, Country and Season:

fb_match_summary()
fb_advanced_match_stats()
fb_match_report()

Please let me know if you're seeing more problems.