steodose / RStudio-Table-Contest

Repo for the 2021 RStudio Table Contest

Table does not reflect the real matches played #1

Open jesbrz opened 1 year ago

jesbrz commented 1 year ago

Hello Stephan, sorry to bother you with an old topic.

I'm trying to replicate the table with data from the Spanish second division. It works, but I have a problem with these two lines:

matchweek <- 24 # Specify how many full matchweeks have been played.
last_week <- matchweek - 1

When I enter the number of matchweeks that have already been played (24 in this division), the final table only shows 17 matches per team (Eibar, for example, now have 46 points, and so on).

[Attached image: Rplot]

I have tried to track down this behaviour, but I can't find where the problem is.

I would be grateful if you could give me a hand. In any case, congratulations on your work.

jesbrz commented 1 year ago

Hi again. I have found where the problem begins, and it also happens with the original code. In this case, when I try to make the 2021-2022 table (all the games), the code only shows 32 of 38 games. This is the 'problematic' code:

```r
# Function to extract Premier League match results data from FBREF
EPL_2022 <- get_match_results(country = "ENG", gender = "M", season_end_year = 2022, tier = "1st")

# Load team mapping file
team_mapping <- "https://raw.githubusercontent.com/steodose/Club-Soccer-Forecasts/main/team_mapping.csv" %>%
  read_csv()

matchweek <- 38 # Specify how many full matchweeks have been played
last_week <- matchweek - 1

# One row per team per match, with the result seen from that team's side
games_df <- EPL_2022 %>%
  filter(Wk <= matchweek) %>%
  mutate(Result = HomeGoals - AwayGoals) %>%
  select(Home, Away, Result, Wk, HomeGoals, AwayGoals, Home_xG, Away_xG) %>%
  pivot_longer(Home:Away, names_to = "home_away", values_to = "Team") %>%
  mutate(
    Result = ifelse(home_away == "Home", Result, -Result),
    win = ifelse(Result == 0, 0.5, ifelse(Result > 0, 1, 0))
  ) %>%
  select(Wk, Team, HomeGoals, AwayGoals, win, Result) %>%
  drop_na()

team_mapping2 <- team_mapping %>%
  select(squad_fbref, url_logo_espn)

# Summarise each team's record and attach its logo
joined_df <- games_df %>%
  group_by(Team) %>%
  summarise(
    Wins = length(win[win == 1]),
    Losses = length(win[win == 0]),
    Draws = length(win[win == 0.5]),
    MP = sum(Wins, Losses, Draws),
    Points = (Wins * 3) + (Draws * 1),
    `Points Percentage` = (100 * Points / (MP * 3)),
    GD = sum(Result),
    form = list(win),
    .groups = "drop"
  ) %>%
  left_join(team_mapping2, by = c("Team" = "squad_fbref")) %>%
  select(url_logo_espn, Team, Points, MP, Wins, Draws, Losses, GD, `Points Percentage`, form) %>%
  arrange(desc(Points), desc(GD)) %>%
  ungroup() %>%
  mutate(Rank = row_number()) %>%
  relocate(Rank) %>%
  rename(Squad = Team) %>%
  mutate(list_data = list(c(Wins, Draws, Losses)))
```

The data frame `games_df` looks OK, but the error appears when it is joined to make `joined_df`. I don't know enough to find a solution.
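
One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the `Wk` column in the scraped results is stored as text rather than as a number, then `filter(Wk <= matchweek)` compares strings, and single-digit weeks such as "4" sort after "38". The base R sketch below uses a made-up `wk` vector as a stand-in for that column (it is not the real data). On a typical locale it reproduces both counts reported in this thread: 32 of 38, and 17 when the cutoff is 24 (the 42-week season length used for that check is also an assumption).

```r
# Toy stand-in for the Wk column, ASSUMING it is stored as character.
# This is an assumption about the scraped data, not something confirmed here.
wk <- as.character(1:38)
matchweek <- 38

# Comparing character with numeric coerces the number to "38",
# so the comparison is done on strings: "4" to "9" sort after "38".
kept_as_text <- wk[wk <= matchweek]
length(kept_as_text)       # 32 on a typical locale

# Weeks dropped by the string comparison
setdiff(wk, kept_as_text)  # "4" "5" "6" "7" "8" "9"

# Converting to numeric first compares the values as numbers.
kept_as_number <- wk[as.numeric(wk) <= matchweek]
length(kept_as_number)     # 38

# Same check for a cutoff of 24 in a longer season (42 weeks assumed here)
sum(as.character(1:42) <= 24)  # 17
```

If that assumption holds for the real data, running `class(EPL_2022$Wk)` would show `"character"`, and changing the filter to `filter(as.numeric(Wk) <= matchweek)` would be one way to test the idea. Any non-numeric `Wk` values would become `NA` in that conversion and be dropped by the filter.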