probberechts / soccerdata

⛏⚽ Scrape soccer data from Club Elo, ESPN, FBref, FiveThirtyEight, Football-Data.co.uk, FotMob, Sofascore, SoFIFA, Understat and WhoScored.
https://soccerdata.readthedocs.io/en/latest/

[ESPN] Player Infos not retrievable #482

Closed JJOSHTECH closed 4 months ago

JJOSHTECH commented 4 months ago

I encountered an error in espn.py (Python version: 3.11.7; OS: Windows 11; soccerdata: 1.5.3):

<path>\.venv\Lib\site-packages\soccerdata\fbref.py:674: FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries is deprecated. In a future version, this will no longer exclude empty or all-NA columns when determining the result dtypes. To retain the old behavior, exclude the relevant entries before the concat operation.
  pd.concat(schedule)
[02/12/24 18:48:52] INFO     No lineup info found for team 1 in game with ID=671269                                                                             espn.py:262
                    INFO     No lineup info found for team 2 in game with ID=671269                                                                             espn.py:262
Traceback (most recent call last):
  File "<path>main.py", line 258, in <module>
    main()
  File "<path>main.py", line 84, in main
    lineups = espn.read_lineup(match_id=copy.values)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>.venv\Lib\site-packages\soccerdata\espn.py", line 300, in read_lineup
    ii = [i for i, x in enumerate(p["plays"]) if x["substitution"]][j]
                                  ~^^^^^^^^^
KeyError: 'plays'

I am retrieving the lineups for all matches. The issue occurs in match 643973. The problem is the structure of one player's data, which looks as follows:

{'active': True, 'starter': True, 'jersey': '13', 'athlete': {'id': '175149', 'uid': 's:600~a:175149', 'guid': 'ec62cfce-f0d0-3a90-7e94-7a2ba13a1336', 'lastName': 'Bounou', 'fullName': 'Yassine Bounou', 'displayName': 'Yassine Bounou', 'links': [{'language': 'en-US', 'rel': ['playercard', 'desktop', 'athlete'], 'href': 'http://www.espn.com/soccer/player/_/id/175149/yassine-bounou', 'text': 'Player Card', 'shortText': 'Player Card', 'isExternal': False, 'isPremium': False}], 'jersey': '13', 'position': {'$ref': 'http://sports.core.api.espn.pvt/v2/sports/soccer/leagues/esp.1/positions/1?lang=en&region=us', 'id': '1', 'name': 'Goalkeeper', 'displayName': 'Goalkeeper', 'abbreviation': 'G', 'leaf': True}}, 'position': {'id': '1', 'name': 'Goalkeeper', 'displayName': 'Goalkeeper', 'abbreviation': 'G'}, 'subbedIn': {'didSub': False}, 'subbedOut': {'didSub': False}, 'formationPlace': '1'}

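This part in particular is odd: 'subbedOut': {'didSub': False}. In other players' data this field is a plain boolean with no subfields. Since a non-empty dict is truthy in Python, the check on p["subbedOut"] passes even though didSub is False, and the code then tries to read p["plays"], which this player does not have. My workaround is to change espn.py at line 298 as follows: elif p["subbedOut"] and isinstance(p["subbedOut"], bool):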

I noticed the same error for subbedIn, so I changed line 282 as well: elif p["subbedIn"] and isinstance(p["subbedIn"], bool):
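
To illustrate the idea, here is a minimal sketch of how both shapes could be normalized in one place. It assumes the field is either a plain boolean or a dict with a 'didSub' key, as in the data above. The helper name did_sub is hypothetical and not part of soccerdata.

```python
def did_sub(value):
    """Return True if the player was substituted, for either payload shape.

    Hypothetical helper, not part of soccerdata. Assumes `value` is either
    a plain bool or a dict such as {'didSub': False}.
    """
    if isinstance(value, dict):
        # Dict shape seen in match 643973: {'didSub': False}
        return bool(value.get("didSub", False))
    # Boolean shape seen for other players (None is treated as False)
    return bool(value)


# The two shapes described in this issue
player_dict_shape = {"subbedIn": {"didSub": False}, "subbedOut": {"didSub": False}}
player_bool_shape = {"subbedIn": True, "subbedOut": False}

for p in (player_dict_shape, player_bool_shape):
    print(did_sub(p["subbedIn"]), did_sub(p["subbedOut"]))
```

Unlike the isinstance check, this would also treat {'didSub': True} as a substitution. I have not verified whether the 'plays' key is present in that case, so the lookup of p["plays"] may still need its own guard.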

Does anyone have a better solution?