probberechts / soccerdata

⛏⚽ Scrape soccer data from Club Elo, ESPN, FBref, FiveThirtyEight, Football-Data.co.uk, FotMob, Sofascore, SoFIFA, Understat and WhoScored.
https://soccerdata.readthedocs.io/en/latest/
Other
511 stars, 87 forks

No unique player identifier #597

Closed ProjectPear100 closed 2 weeks ago

ProjectPear100 commented 1 month ago

Is there any way to join player data from multiple sources (FBref, WhoScored, SoFIFA) using an identifier or unique key of some sort? I have gone through the documentation and could not find any method to uniquely identify a player. Joining on names won't work, since multiple players can share the same name and names are spelled differently across sources.

probberechts commented 1 month ago

You'll have to programmatically and/or manually map the player IDs between the different data sources. There are no shortcuts, and soccerdata does not provide any support for doing this. In the future, I may add support for replacing player names / IDs with a standardized name / ID given a mapping (similar to what is already possible for teams using the config/teamname_replacements.json file). But I consider creating the mapping itself out of scope for this package.

For some tips on how to match player IDs across multiple data sources, I can recommend this blog post: https://unravelsports.github.io/2022/07/11/player-id-matching-system.html
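
To give a rough idea of what "programmatically mapping" could look like, here is a minimal sketch. It is not part of soccerdata and not necessarily the approach from the blog post: it assumes you have already extracted, for each source, a list of records with an ID, a name and a birth date (the `id`, `name` and `born` keys, the `normalize` and `match_players` helpers, the 0.85 threshold and the sample players are all made up for illustration). It only compares players with the same birth date and then picks the most similar accent-stripped name.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation so spelling variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(cleaned.split())


def match_players(source_a, source_b, threshold=0.85):
    """Map each id in source_a to the best matching id in source_b.

    Records are hypothetical dicts with 'id', 'name' and 'born' keys.
    Candidates must share the birth date; the name similarity must reach
    `threshold`. Unmatched players map to None and need manual review.
    """
    mapping = {}
    for a in source_a:
        best_id, best_score = None, threshold
        for b in source_b:
            if a["born"] != b["born"]:
                continue  # different birth date: treat as different players
            score = SequenceMatcher(
                None, normalize(a["name"]), normalize(b["name"])
            ).ratio()
            if score >= best_score:
                best_id, best_score = b["id"], score
        mapping[a["id"]] = best_id
    return mapping


# Made-up sample records, standing in for data pulled from two sources.
source_a = [
    {"id": "a1", "name": "José Álvarez", "born": "2000-01-01"},
    {"id": "a2", "name": "John Smith", "born": "1999-09-09"},
    {"id": "a3", "name": "Nobody Here", "born": "1990-01-01"},
]
source_b = [
    {"id": "b1", "name": "Jose Alvarez", "born": "2000-01-01"},
    {"id": "b2", "name": "John Smith", "born": "1995-05-05"},
    {"id": "b3", "name": "John Smith", "born": "1999-09-09"},
]

player_map = match_players(source_a, source_b)
```

The birth date check is what separates the two players named "John Smith" here. In practice you will likely need more signals (team, nationality, position) and a manual pass over whatever stays unmatched.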