valentinumbach / SoFIFA

SoFIFA API for R
MIT License

Scraper not returning correct info #1

Open valentinumbach opened 5 years ago

valentinumbach commented 5 years ago

get_teams(), get_players(), and get_player_scores() now return NA for most or all values instead of IDs or scores. I need to check the scraper again.

magnusroesbjerg commented 5 years ago

Hi Valentin, do you think you will have time to look at this issue in the near future? I am about to scrape the SoFIFA website as part of an assignment. If you expect to fix the scraper within a few weeks, it would be a waste of time for me to build my own from scratch (except for the learning experience, of course). I have been looking at your code, but I'm not sure my programming skills are good enough to debug it. Thanks! / Magnus

valentinumbach commented 5 years ago

Hi Magnus,

I had a quick look and found that get_teams() and get_players() are easily fixed with an updated regex, as there were just some minor formatting changes on the site. However, get_player_scores() will need a major overhaul, because sofifa.com now apparently loads that content dynamically with JavaScript ("lazy loading"), so the scores are not in the HTML the scraper downloads. This means you would likely have to drive a real browser engine, for example a headless browser such as PhantomJS or a browser automation tool such as Selenium. I probably won't be able to do this in the near future. If you can implement a solution based on my package, I would greatly appreciate a pull request :)
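To illustrate the regex part, here is a minimal sketch of an ID-extraction pattern that tolerates small formatting changes (extra attributes, single or double quotes, optional trailing slug). It is written in Python rather than the package's R, it is not the package's actual code, and the sample markup and the names `SAMPLE_HTML`, `TEAM_LINK`, and `extract_teams` are assumptions made up for illustration; the real sofifa.com HTML may look different.

```python
import re

# Hypothetical markup for illustration only; the real page may differ.
SAMPLE_HTML = """
<td><a href="/team/241/fc-barcelona/">FC Barcelona</a></td>
<td><a  class="name" href='/team/243'>Real Madrid</a></td>
<td><a href="/player/158023/lionel-messi/">L. Messi</a></td>
"""

# Match an <a> tag whose href starts with /team/<digits>, regardless of
# attribute order, quote style, or whether a name slug follows the ID.
TEAM_LINK = re.compile(
    r"""<a\b[^>]*?\bhref\s*=\s*["']/team/(\d+)[^"']*["'][^>]*>\s*([^<]+?)\s*</a>""",
    re.IGNORECASE,
)


def extract_teams(html):
    """Return a list of (team_id, team_name) tuples found in the HTML.

    Returns an empty list when nothing matches, so the caller can detect
    a layout change instead of silently filling results with missing values.
    """
    return [(int(m.group(1)), m.group(2)) for m in TEAM_LINK.finditer(html)]


if __name__ == "__main__":
    teams = extract_teams(SAMPLE_HTML)
    if not teams:
        raise RuntimeError("No team links matched; the page layout may have changed.")
    for team_id, name in teams:
        print(team_id, name)
```

Failing loudly when nothing matches is a deliberate choice here: it would have surfaced the layout change right away rather than as columns of NA. A pattern like this cannot help with get_player_scores(), since content loaded by JavaScript never appears in the downloaded HTML.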

Best, Valentin