alan-turing-institute / AIrsenal

Machine learning Fantasy Premier League team
MIT License
295 stars, 87 forks

Add command line interface for scraping Transfermarkt data #532

Open rchan26 opened 2 years ago

rchan26 commented 2 years ago

Currently there is no command line interface for scraping data from Transfermarkt. The script that scrapes the data can be found here. At the moment we just run the script directly, but it would be nice to have a command line interface that runs the script and takes the season to scrape as an argument, e.g. airsenal_scrape_tm --season=2223. This would make scraping the data much more convenient.

You'll need to edit the pyproject.toml file to include the script and add the command line interface in airsenal/scripts/scrape_transfermarkt.py.

An example of this can be found in airsenal/scripts/fill_db_init.py, which can be run on the command line using airsenal_setup_initial_db.
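
For anyone picking this up, here is a rough sketch of what the entry point could look like. It assumes argparse for the argument parsing (the existing scripts may use something else), and scrape_season is a hypothetical placeholder for the real scraping logic in scrape_transfermarkt.py, not an existing function. The pyproject.toml line in the comment is also an assumption about how the scripts table is laid out.

```python
import argparse
from typing import List, Optional

# Assumed pyproject.toml entry (the table name depends on the build backend,
# so check how airsenal_setup_initial_db is registered and mirror that):
#   airsenal_scrape_tm = "airsenal.scripts.scrape_transfermarkt:main"


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse command line arguments; argv=None means read from sys.argv."""
    parser = argparse.ArgumentParser(
        description="Scrape Transfermarkt data for a single season"
    )
    parser.add_argument(
        "--season",
        type=str,
        required=True,
        help="Season to scrape in short form, e.g. 2223 for 2022/23",
    )
    return parser.parse_args(argv)


def scrape_season(season: str) -> str:
    """Hypothetical stand-in: the real version would call the scraping code."""
    return f"would scrape Transfermarkt data for season {season}"


def main(argv: Optional[List[str]] = None) -> None:
    """Entry point that the console script would call."""
    args = parse_args(argv)
    print(scrape_season(args.season))
```

Once the script is registered and the package reinstalled, running airsenal_scrape_tm --season=2223 would call main(), which reads the season from the command line.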

BassCoder2808 commented 2 years ago

Hi @rchan26, can I contribute to this issue, if it is still available?

rchan26 commented 2 years ago

Hi @BassCoder2808 - yes it is available, it'd be great if you can help us out with this! 😄

Let me know if you have any specific questions along the way!

BassCoder2808 commented 2 years ago

Thanks a lot @rchan26. I will start working on it.

BassCoder2808 commented 2 years ago

Hi @rchan26, I have added the code for the command line scraping, but currently the only thing it does is scrape the absence data via the get_season_absences function. Which functions should the command run? Do we only want the absence data, or other data as well?