probberechts / soccerdata

⛏⚽ Scrape soccer data from Club Elo, ESPN, FBref, FiveThirtyEight, Football-Data.co.uk, FotMob, Sofascore, SoFIFA, Understat and WhoScored.
https://soccerdata.readthedocs.io/en/latest/

[FBref] German League 2122: Length of values (53) does not match length of index (54) #354

Open LuisEnriqueKaiser opened 10 months ago

LuisEnriqueKaiser commented 10 months ago

Hello everyone,

I want to scrape match data with the fbref.read_team_match_stats() function. It works well for all leagues except the German league; more specifically, it fails for the 21-22 season.

My Python version is 3.11 and I am using the latest soccerdata release. I have attached screenshots of the error message and of my code.

[Attached screenshots: "Screenshot 2023-09-06 at 10 06 59", "Screenshot 2023-09-06 at 10 08 11"]

Kind regards

probberechts commented 10 months ago

This looks like an inconsistency on the FBref website: the "match_report" stat is missing for at least one game of one team in that season. You can try to make the following snippet a bit more robust:

https://github.com/probberechts/soccerdata/blob/f49cdf14fd184f3535903a1cdc0336e3098b29f0/soccerdata/fbref.py#L644-L649
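For illustration only, here is a minimal sketch of the general idea, not the actual code at the link above. It assumes the length mismatch comes from collecting the match report links across the whole table, so a row without a link shortens the list. The table markup, the `data-stat` attribute and the helper name `match_report_links` are hypothetical stand-ins. Extracting one value per row, with `None` when the link is absent, keeps the list aligned with the rows:

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for a schedule table;
# the second row has no match report link.
TABLE = """
<table>
  <tr><td data-stat="date">2021-08-13</td><td data-stat="match_report"><a href="/matches/aaa">Match Report</a></td></tr>
  <tr><td data-stat="date">2021-08-21</td><td data-stat="match_report"></td></tr>
  <tr><td data-stat="date">2021-08-28</td><td data-stat="match_report"><a href="/matches/ccc">Match Report</a></td></tr>
</table>
"""


def match_report_links(table):
    """Return one entry per table row: the link href, or None if it is missing."""
    links = []
    for row in table.iter("tr"):
        anchor = row.find("td[@data-stat='match_report']/a")
        links.append(anchor.get("href") if anchor is not None else None)
    return links


table = ET.fromstring(TABLE.strip())

# Table-wide extraction silently skips the row without a link.
naive = [a.get("href") for a in table.findall(".//td[@data-stat='match_report']/a")]
print(len(naive))  # 2 values for 3 rows, so assigning this as a column would fail

# Row-wise extraction always yields exactly one value per row.
robust = match_report_links(table)
print(robust)  # ['/matches/aaa', None, '/matches/ccc']
```

The missing entries then show up as empty values in the resulting column instead of raising an error.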

TimelessUsername commented 9 months ago

I have a similar issue with WhoScored. I'm currently trying to find out why the 21-22 season fails to match the league...

Edit:

The error is KeyError: "[('ARG-Liga Profesional', '2122')] not in index", while the input years are of the form [15, 16, ... , 20, 21, 22, 23], so the string conversion logic seems to go wrong and the year fails to match. The 2122 form needs to be converted to 2022, I reckon.

Edit2:

There are various issues with other input formats too; the logic needs a bit of work, I think.

Edit3: It appears WhoScored doesn't have some of the years during the virus...
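
To make the season-format discussion above concrete, here is a rough sketch of normalizing different inputs to the two-year "2122" form. It is not soccerdata's actual conversion logic. The helper name `to_season_code` and its rules are assumptions, in particular the choice to read a four-digit value such as "2122" as an existing season code when its second half follows its first:

```python
import re


def to_season_code(season):
    """Normalize a season given as int or str to a two-year code like '2122'.

    Hypothetical helper for illustration; the rules below are assumptions.
    """
    text = str(season).strip()

    # Split forms such as "21-22", "2021/22" or "2021-2022".
    match = re.fullmatch(r"(\d{2}|\d{4})[-/](\d{2}|\d{4})", text)
    if match:
        start = int(match.group(1)) % 100
        end = int(match.group(2)) % 100
        if end != (start + 1) % 100:
            raise ValueError(f"Seasons must span consecutive years: {season!r}")
        return f"{start:02d}{end:02d}"

    # Two digits are read as the starting year: 21 -> "2122".
    if re.fullmatch(r"\d{2}", text):
        start = int(text)
        return f"{start:02d}{(start + 1) % 100:02d}"

    if re.fullmatch(r"\d{4}", text):
        first, last = int(text[:2]), int(text[2:])
        if last == (first + 1) % 100:
            # Already looks like a season code. Note that "2021" is ambiguous:
            # this sketch reads it as season 20-21, not calendar year 2021.
            return text
        # Otherwise read it as the calendar year in which the season starts.
        return f"{last:02d}{(last + 1) % 100:02d}"

    raise ValueError(f"Unrecognized season format: {season!r}")


print(to_season_code(21))           # 2122
print(to_season_code("2021-2022"))  # 2122
print(to_season_code("2122"))       # 2122
print(to_season_code(2022))         # 2223
```

Whether a league with calendar-year seasons should map "2122" to 2021 or 2022 is a separate decision that this sketch does not make.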