probberechts / soccerdata

⛏⚽ Scrape soccer data from Club Elo, ESPN, FBref, FiveThirtyEight, Football-Data.co.uk, FotMob, Sofascore, SoFIFA, Understat and WhoScored.
https://soccerdata.readthedocs.io/en/latest/

[FBref] match-level opponent data not available for read_team_match_stats #370

Closed mhd0528 closed 11 months ago

mhd0528 commented 11 months ago

Hi, is getting opponent team info from read_team_match_stats still supported? I got the following error. Any suggestions on how to fix this? Thanks in advance!

File "/home/anaconda3/envs/soccerdata/lib/python3.10/site-packages/soccerdata/fbref.py", line 432, in read_team_match_stats
    (html_table,) = tree.xpath(f"//table[@id='matchlogs_{opp_type}']")
ValueError: not enough values to unpack (expected 1, got 0)

probberechts commented 11 months ago

Opponent stats are simply not available for the "schedule" stat type.
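
For context, the ValueError in the traceback is what Python raises when a one-element tuple unpacking receives an empty sequence, which is what happens when the XPath query finds no matching table. Below is a minimal sketch in plain Python (no lxml, no network) that reproduces the message and shows one way to get a more descriptive error. The helper `unpack_single` is hypothetical and is not part of soccerdata.

```python
def unpack_single(matches, what):
    """Hypothetical helper: return the only element of `matches`.

    Raises a descriptive LookupError instead of the bare ValueError
    that tuple unpacking produces.
    """
    if len(matches) != 1:
        raise LookupError(f"expected exactly one {what}, found {len(matches)}")
    (only,) = matches
    return only


# Stand-in for an XPath result when no table with the requested id exists.
no_tables = []

# Bare unpacking fails in the same way as the traceback in this issue.
try:
    (html_table,) = no_tables
    bare_message = None
except ValueError as err:
    bare_message = str(err)

# The helper reports what was being looked up.
try:
    unpack_single(no_tables, "opponent match log table")
    helper_message = None
except LookupError as err:
    helper_message = str(err)
```

With a check like this, the "schedule" case would fail with a message that names the missing table rather than an unpacking error.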

mhd0528 commented 11 months ago

Ok thanks! I'll look into other types of stats first then.

mhd0528 commented 11 months ago

Okay, so I tried disabling the local cache and using stat types other than schedule, but it still gives me a scraping error like the one below:

Error while scraping https://fbref.com/en/squads/18bb7c10/matchlogs/all_comps/shooting.    _common.py:351

I checked this URL, and it is not valid on FBref. The closest valid link I can find is: https://fbref.com/en/squads/18bb7c10/2023-2024/matchlogs/all_comps/shooting/Arsenal-Match-Logs-All-Competitions. Could there be an error in the URL generation? Sorry for bothering you so much recently. I'm trying to work out whether the error is on my end, but I can't solve it...

mhd0528 commented 11 months ago

I think I may have found the problem. When I scrape data from before the current season, e.g. fbref = sd.FBref(leagues='ENG-Premier League', seasons='2022-23'), I'm able to get team_match data with those stat_types. But if I use fbref = sd.FBref(leagues='ENG-Premier League', seasons='2023-24'), it gives me the error above.

---update--- I checked the team URLs for the current season and for previous seasons. The URL for the current season doesn't have the season in it. See the example below:

2022-23: Arsenal /en/squads/18bb7c10/2022-2023/Arsenal-Stats
2023-24: Arsenal /en/squads/18bb7c10/Arsenal-Stats

This leads to an invalid URL for the team match log. I'll try to add a condition to deal with it.
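
To illustrate the kind of condition meant here, this is a rough sketch and not the code from the PR. The function `matchlog_path` and its parameters are hypothetical. It assumes the team path has the two shapes shown above, and it builds the match-log path in the shape of the valid Arsenal link quoted earlier in this thread.

```python
import re

# Seasons appear in FBref team paths as e.g. "2022-2023".
SEASON_RE = re.compile(r"\d{4}-\d{4}")


def matchlog_path(team_path, stat_type, season):
    """Hypothetical helper: build a match-log path from a team page path.

    `season` (e.g. "2023-2024") is used only as a fallback when the team
    path does not embed a season, as for the current season.
    """
    parts = team_path.strip("/").split("/")
    # Expected shapes:
    #   ["en", "squads", team_id, season, slug]   (past seasons)
    #   ["en", "squads", team_id, slug]           (current season)
    team_id = parts[2]
    slug = parts[-1]
    if len(parts) == 5 and SEASON_RE.fullmatch(parts[3]):
        season = parts[3]
    team_name = slug.removesuffix("-Stats")
    return (
        f"/en/squads/{team_id}/{season}/matchlogs/all_comps/"
        f"{stat_type}/{team_name}-Match-Logs-All-Competitions"
    )


current = matchlog_path("/en/squads/18bb7c10/Arsenal-Stats", "shooting", "2023-2024")
past = matchlog_path("/en/squads/18bb7c10/2022-2023/Arsenal-Stats", "shooting", "2023-2024")
```

The point is that the season always ends up in the match-log path, whether or not the team path carried it.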

---update--- I have updated the code in my fork and I'm testing it with more historical seasons and leagues. I'll open a pull request once it's verified.

---update--- I have tested the code and it should be good now. I have created a PR for it. Hope it helps. https://github.com/probberechts/soccerdata/pull/384