iptc / sport-schema

The next generation of sports data, based on IPTC’s SportsML and semantic web principles

Consistent example data for a single sport, league, and team #144

Open awanczowski opened 1 year ago

awanczowski commented 1 year ago

Produce samples for a single sport, league, and team. The sample items should include Team Roster, Schedule, Standings/Positions, and Event. This will enable a coherent story across multiple samples.
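One way to keep such samples coherent is to check that every team or player ID referenced in the Schedule, Standings, and Event samples is also defined in the Team Roster sample. The following is a minimal sketch under that assumption; the function name `find_inconsistent_ids` and all identifiers are hypothetical placeholders, not part of sport-schema.

```python
def find_inconsistent_ids(roster_ids, other_samples):
    """Return, per sample name, the referenced IDs that the roster does not define."""
    known = set(roster_ids)
    problems = {}
    for name, ids in other_samples.items():
        missing = sorted(set(ids) - known)
        if missing:
            problems[name] = missing
    return problems


# Placeholder identifiers, invented for illustration only.
roster = ["team-1", "player-1", "player-2", "player-3"]
samples = {
    "event": ["player-1", "player-3"],
    "standings": ["team-1"],
    "schedule": ["team-1", "team-9"],  # team-9 is not defined in the roster
}

problems = find_inconsistent_ids(roster, samples)
print(problems)  # {'schedule': ['team-9']}
```

An empty result would suggest that the samples tell one consistent story, at least at the level of shared identifiers.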

bquinn commented 5 months ago

One example that we found is:

This came up as a suggested paper from a slightly odd service that sends me random reading suggestions, but this one might be useful: https://www.academia.edu/109690627/Soccer2014DS_a_dataset_containing_player_events_from_the_2014_World_Cup?email_work_card=title The authors reverse-engineered Opta data from HuffPost to create a data set of player events for the 2014 soccer World Cup. The tool is available on GitHub, and so is the extracted data! Maybe we could convert it to sport schema…?

bquinn commented 5 months ago

Other potential data sets that we could convert:

https://footballcsv.github.io/
https://github.com/openfootball/
https://sports-statistics.com/sports-data/sports-data-sets-for-data-modeling-visualization-predictions-machine-learning/
https://github.com/openfootball/worldcup/
https://www.football-data.co.uk/
https://github.com/streampref/wcimport/tree/master/data (the data referenced in the academic article above)

Several on Kaggle:

https://www.kaggle.com/datasets/saife245/english-premier-league (data comes from https://football-data.co.uk/)

bquinn commented 5 months ago

Thought from today: if we're doing something like a recent World Cup, we could just use Wikidata IDs for all players. (We would need to check, but all players should have Wikidata IDs.)
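As a rough sketch of what that check could look like, the snippet below builds Wikidata entity URIs from item IDs and lists the players that still lack one. The helper names and the player-to-ID mapping are hypothetical; the QIDs shown are placeholders, not the real IDs of any player.

```python
import re
from typing import Dict, List, Optional

WIKIDATA_ENTITY_BASE = "http://www.wikidata.org/entity/"
# Wikidata item IDs are the letter Q followed by a positive integer.
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def wikidata_uri(qid: str) -> str:
    """Build the entity URI for a Wikidata item ID, rejecting malformed IDs."""
    if not QID_PATTERN.fullmatch(qid):
        raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return WIKIDATA_ENTITY_BASE + qid


def missing_wikidata_ids(players: Dict[str, Optional[str]]) -> List[str]:
    """Return the names of players that have no Wikidata ID recorded."""
    return sorted(name for name, qid in players.items() if not qid)


# Placeholder names and QIDs, invented for illustration only.
players = {
    "Example Player A": "Q1000001",
    "Example Player B": None,
}

for name, qid in players.items():
    if qid:
        print(name, wikidata_uri(qid))

print(missing_wikidata_ids(players))  # ['Example Player B']
```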

bquinn commented 5 months ago

Looking at the datasets:

None of the free/open data sets seem to have player lineups...?

bquinn commented 5 months ago

These people https://www.sportmonks.com/football-api/ seem to give away Danish Superliga and Scottish league data for free... could we use that?