JaseZiv / worldfootballR

A wrapper for extracting world football (soccer) data from FBref, Transfermarkt, Understat
https://jaseziv.github.io/worldfootballR/

Add info from whoscored #394

Open jesbrz opened 2 months ago

jesbrz commented 2 months ago

I would like to know if it is possible to add the functions of the website https://www.whoscored.com to worldfootballR. It has interesting information that could be very useful for analysis.

Regards.

tonyelhabr commented 2 months ago

WhoScored has good data, I agree. However, I just don't see it as too practical. WhoScored loads webpage data on the client side, which means we'd probably need to use something like Selenium to get the data. We've avoided using Selenium in this package for at least 2 reasons:

  1. to simplify dependencies (both package and OS)
  2. to prevent having "frail" code
    • For example, Selenium with R is fairly prone to leaving open connections, which can lead to mysterious OOM errors. This can be avoided with smart error handling, but that puts a lot more responsibility on the package developers to write really robust code. @JaseZiv and I strive to do this, but we're also not spending enough time on package development to guarantee this. (Just look at the package source, and you can see lots of ugly code 😅 !)
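To make the "smart error handling" point concrete, here is a minimal sketch of the cleanup pattern, written in Python rather than R so it can run without any browser installed. `FakeBrowser`, `browser_session`, and `scrape` are hypothetical names used only for illustration; they are not part of worldfootballR or Selenium. The idea is that the browser is always closed, even when the page fails to render. In R, `on.exit()` would likely play the role of the `finally` block.

```python
from contextlib import contextmanager


class FakeBrowser:
    """Stand-in for a real browser driver (hypothetical, not Selenium's API)."""

    def __init__(self):
        self.is_open = True

    def get_page_source(self, url):
        # Simulate a client-side page that never finishes rendering.
        raise TimeoutError(f"page did not render: {url}")

    def quit(self):
        # A real driver would shut down the browser process here.
        self.is_open = False


@contextmanager
def browser_session(factory):
    """Open a browser and guarantee it is closed, even if scraping fails."""
    browser = factory()
    try:
        yield browser
    finally:
        # Runs on success and on error, so no session is left dangling.
        browser.quit()


def scrape(url, factory=FakeBrowser):
    """Fetch the rendered page source; any error propagates after cleanup."""
    with browser_session(factory) as browser:
        return browser.get_page_source(url)
```

With this shape, a failed `scrape()` call still raises its error to the caller, but only after `quit()` has run. The catch is that every code path that opens a browser has to go through something like `browser_session`, which is the extra burden on package developers described above.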

If not Selenium, other options are:

I haven't explored these. These may indeed make scraping easy.

JaseZiv commented 2 months ago

Totally echo Tony's statements... have purposefully stayed away from this site due to the somewhat flimsy nature of browser automation scraping.

Happy to leave it for future consideration, but wouldn't imagine this is something we address in the near future unfortunately.