BradleyGrantham / pl-predictions-using-fifa

Training a neural network to predict the outcome of a football match using fifa ratings
https://towardsdatascience.com/predicting-premier-league-odds-from-ea-player-bfdb52597392
Mozilla Public License 2.0
135 stars, 46 forks

fifastats spider not working properly #2

Open gerfigo opened 5 years ago

gerfigo commented 5 years ago

Dear Bradley, the fifastats spider is not working properly for me. The problem is in the parse_player method, where I get an IndexError (index out of range) when gathering the name parameter. The script works properly up to that point, so it seems the structure of the response HTML has changed since the spider was written: the name comes back empty. Could you confirm that this is the problem, please? Impressed by your work, Greg
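For anyone hitting the same error, here is a minimal sketch of the likely failure mode. It assumes the spider takes the first element of an extraction result, which the issue does not actually show. The helper name `first_or_default` and the sample values are hypothetical, and plain lists stand in for what a selector would return.

```python
def first_or_default(values, default=None):
    """Return the first non-blank string in values (stripped), else default."""
    for value in values:
        if value and value.strip():
            return value.strip()
    return default


# Hypothetical extraction results: an empty list stands in for a selector
# that no longer matches anything after the site's markup changed.
old_markup_result = ["  Example Player "]
new_markup_result = []

name = first_or_default(old_markup_result)
missing = first_or_default(new_markup_result)

# By contrast, new_markup_result[0] would raise IndexError, which matches
# the symptom reported above.
print(name, missing)
```

With a guard like this the spider can skip or log a player whose name is missing instead of crashing. In Scrapy itself, calling `.get()` on a selector result returns `None` when nothing matches, which may be enough on its own.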

BradleyGrantham commented 5 years ago

Hi Greg. Sorry about this. You are correct, the response has changed since I originally wrote the spider! Unfortunately, I'm not sure when I will be able to look at it. Bradley

randomm commented 5 years ago

The page markup must have changed completely; the scraper code no longer matches it at all. What is worse, the new markup does not have good class hooks for Scrapy to use. I'm looking into whether it's easily fixable...

javpascal commented 5 years ago

Hi @randomm and @BradleyGrantham, I updated the scraper so that it works with the new website. Happy to share the new code, which also parses the players' detailed scores (not only the overall one).

BradleyGrantham commented 5 years ago

@javpascal That sounds great, is the code on GitHub? You're also more than welcome to PR it into here if you want, completely up to you though

randomm commented 5 years ago

That sounds great @javpascal, well done! ... why don't you do a pull request here?

javpascal commented 5 years ago

Hi @BradleyGrantham, please see the attached updated crawler. It will probably be easier for you to reuse this one. Please note that the output of the "parse_player" function now includes multiple columns instead of the overall score only, so you will need to select that column specifically.
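As a rough illustration of selecting that column, here is a sketch that keeps only the overall score from multi-column output. The column names (`name`, `overall`, `pace`, `shooting`), the sample rows, and the `select_columns` helper are all assumptions made for the example. The thread does not list the real columns, and the output may not be CSV at all.

```python
import csv
import io

# Hypothetical scraper output; the real column names and format may differ.
SAMPLE_OUTPUT = """name,overall,pace,shooting
Example Player A,85,78,90
Example Player B,79,88,70
"""


def select_columns(csv_text, columns):
    """Keep only the requested columns from CSV text, as a list of dicts."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Fail early with a clear message if an expected column is absent.
    missing = [c for c in columns if c not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    return [{c: row[c] for c in columns} for row in reader]


overall_only = select_columns(SAMPLE_OUTPUT, ["name", "overall"])
print(overall_only)
```

The values stay as strings here; converting `overall` to an integer would be a separate step before feeding it to the rest of the pipeline.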

fifa_spider.py.zip

gerfigo commented 5 years ago

Dear @javpascal! Thank you for sharing. However, I couldn't manage to get it working. It says that there is a list index out of range error during scraping when dealing with the nationality_name part. Am I the only one having this issue?

travelhawk commented 5 years ago

I ran into the same issues, so I updated the scraper and opened a pull request in case anyone is still interested. Even so, it is still a lot of work to gather all the data, put it in the right folders, and get the rest of the code running.