probberechts / soccerdata

⛏⚽ Scrape soccer data from Club Elo, ESPN, FBref, FiveThirtyEight, Football-Data.co.uk, FotMob, Sofascore, SoFIFA, Understat and WhoScored.
https://soccerdata.readthedocs.io/en/latest/

Add support for scraping FotMob #461

Closed marcjbaron closed 5 months ago

marcjbaron commented 5 months ago

Hello,

This is an attempt at adding FotMob as a data source to soccerdata. The nox test suite passed without errors, except for the Python 3.8 tests. The current Python requirement in pyproject.toml is >=3.9, though, so I don't know whether those failures can be safely ignored. I also added unit tests, and they pass.

The current methods are:

There are obviously other methods that could be added (e.g. season stats), but I did not want to go much further until the early issues are dealt with.

marcjbaron commented 5 months ago

I have made the requested changes (and a few additional changes for clarity).

If/when the PR is approved, I also have a "read_team_season_stats" method ready.

probberechts commented 5 months ago

Something seems to be wrong with the poetry.lock file. Which version of poetry are you currently using? I am on 1.7.1. Also, can you restore the lock file to the original one? Feature PRs should not change the versions of dependencies.

marcjbaron commented 5 months ago

Lock file should now match the original.

probberechts commented 5 months ago

Thanks. I did some refactoring, fixed a few more bugs and simplified the columns in the output dataframes. The only thing that still does not seem to work is retrieving the table for tournaments. Currently, it gets the ranking of a single group only.

marcjbaron commented 5 months ago

Thanks for cleaning up the code.

The tournament tables are potentially fixed: I added a column called "group" to each table; for leagues without a group stage, the column contains 'NaN'.
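As a rough illustration of the "group" column idea, here is a minimal pandas sketch. The helper name `combine_group_tables` and the input shape (a dict mapping a group name, or `None`, to a list of row dicts) are assumptions made for this example; they are not the actual code in soccerdata/fotmob.py or FotMob's real payload.

```python
import pandas as pd


def combine_group_tables(tables):
    """Stack per-group standings into one frame with a "group" column.

    `tables` maps a group name (or None for a league without a group
    stage) to a list of row dicts. This input shape is hypothetical.
    """
    frames = []
    for group, rows in tables.items():
        frame = pd.DataFrame(rows)
        # Leagues without groups get NaN so the column exists in every table.
        frame["group"] = group if group is not None else float("nan")
        frames.append(frame)
    return pd.concat(frames, ignore_index=True)


# Made-up standings for a tournament with two groups.
tournament = combine_group_tables({
    "Grp. A": [{"team": "Team 1", "pts": 7}, {"team": "Team 2", "pts": 4}],
    "Grp. B": [{"team": "Team 3", "pts": 9}, {"team": "Team 4", "pts": 1}],
})

# Made-up standings for a plain league: the "group" column is all NaN.
league = combine_group_tables({
    None: [{"team": "Team 5", "pts": 30}, {"team": "Team 6", "pts": 28}],
})
```

Keeping the column in both cases means callers can always filter or group on "group" without first checking whether the competition has a group stage.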

On that note, for previously completed seasons, FotMob's API gives 'null' for the round and week in the fixtures list, so these are returned as 'None' by the read_schedule method.

I'm not sure whether either of these is desirable, though.
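To make the null behaviour concrete, here is a small standard-library sketch. The JSON payload is invented for illustration and only mimics the idea of fixtures whose round and week are `null`; the field names are assumptions, not FotMob's actual schema.

```python
import json

# Hypothetical fixtures payload: the first match has no round/week.
raw = """
[
  {"home": "Team 1", "away": "Team 2", "round": null, "week": null},
  {"home": "Team 3", "away": "Team 4", "round": "1", "week": 1}
]
"""

# json.loads maps JSON null to Python None.
fixtures = json.loads(raw)

# Fixtures that a caller would see with round set to None.
without_round = [f for f in fixtures if f["round"] is None]

# One possible way for a caller to handle it: substitute a placeholder.
rounds = [f["round"] if f["round"] is not None else "unknown" for f in fixtures]
```

Whether to pass `None` through or substitute a default is a design choice; passing it through keeps "not provided by the source" distinguishable from a real value.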

codecov-commenter commented 5 months ago

Codecov Report

Attention: 158 lines in your changes are missing coverage. Please review.

Comparison is base (f0ff1c0) 65.24% compared to head (869f891) 59.33%. Report is 1 commits behind head on master.

:exclamation: Current head 869f891 differs from pull request most recent head 57cb615. Consider uploading reports for the commit 57cb615 to get more accurate results

| Files | Patch % | Lines |
|---|---|---|
| soccerdata/fotmob.py | 10.22% | 158 Missing :warning: |


Additional details and impacted files

```diff
@@            Coverage Diff             @@
##           master     #461      +/-   ##
==========================================
- Coverage   65.24%   59.33%   -5.91%
==========================================
  Files          10       11       +1
  Lines        1456     1633     +177
  Branches      301      336      +35
==========================================
+ Hits          950      969      +19
- Misses        452      610     +158
  Partials       54       54
```
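The headline percentages in the coverage diff can be reproduced from the hit and line counts. The sketch below assumes coverage is hits divided by lines, truncated (not rounded) to two decimals. That truncation is inferred from the reported numbers, not taken from Codecov's documentation, and `coverage_pct` is a name made up for this example.

```python
import math


def coverage_pct(hits, lines):
    """Covered-line percentage, truncated to two decimals (an assumption)."""
    return math.floor(hits / lines * 10000) / 100


# Counts copied from the coverage diff: (hits, misses, partials, lines).
base_counts = (950, 452, 54, 1456)
head_counts = (969, 610, 54, 1633)

# Sanity check: hits + misses + partials add up to the line total.
for hits, misses, partials, lines in (base_counts, head_counts):
    assert hits + misses + partials == lines

base = coverage_pct(base_counts[0], base_counts[3])
head = coverage_pct(head_counts[0], head_counts[3])
delta = round(head - base, 2)

print(f"base={base}% head={head}% delta={delta}%")
```

The drop comes almost entirely from the new file: 177 lines were added but only 19 additional lines are hit.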
