Closed bezchristo closed 4 years ago
Hi, and thanks a lot for these awesome scripts!!
@bezchristo I am facing this very same problem, but I'm not at all familiar with scraping, and I'm afraid I could mess it all up if I try to fix it... would you be so kind as to post the corrected code?
Huge thanks in advance!!!
Hi, in the end I think I got a solution. In the function scrape_match_stats, get the following two variables:

    won_game_left = xpath_parse(match_tree, "//table[@class='scores-table']/tbody/tr[1]/td[1]/@class")[0]
    won_game_right = xpath_parse(match_tree, "//table[@class='scores-table']/tbody/tr[2]/td[1]/@class")[0]

Then use them to select the right winner and loser according to the position:

    if won_game_left == 'won-game':
        try:
            winner_slug_xpath = "//div[@class='player-left-name']/a/@href"
            winner_slug_parsed = xpath_parse(match_tree, winner_slug_xpath)
            winner_slug = winner_slug_parsed[0].split('/')[4]
        except Exception:
            winner_slug = ''
        try:
            loser_slug_xpath = "//div[@class='player-right-name']/a/@href"
            loser_slug_parsed = xpath_parse(match_tree, loser_slug_xpath)
            loser_slug = loser_slug_parsed[0].split('/')[4]
        except Exception:
            loser_slug = ''
    elif won_game_right == 'won-game':
        try:
            loser_slug_xpath = "//div[@class='player-left-name']/a/@href"
            loser_slug_parsed = xpath_parse(match_tree, loser_slug_xpath)
            loser_slug = loser_slug_parsed[0].split('/')[4]
        except Exception:
            loser_slug = ''
        try:
            winner_slug_xpath = "//div[@class='player-right-name']/a/@href"
            winner_slug_parsed = xpath_parse(match_tree, winner_slug_xpath)
            winner_slug = winner_slug_parsed[0].split('/')[4]
        except Exception:
            winner_slug = ''
    else:
        print('Error 45069')
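For anyone who finds the two mirrored branches hard to maintain, here is a minimal sketch of the same selection logic with the slug extraction factored out. It assumes the class values and href lists have already been fetched with xpath_parse as above. The helper names slug_from_href and assign_slugs, the sample hrefs, and the choice to raise ValueError instead of printing an error code are my own assumptions, not part of the original scripts.

```python
def slug_from_href(hrefs):
    """Return piece [4] of the first href split on '/', or '' if it is missing."""
    try:
        return hrefs[0].split('/')[4]
    except IndexError:
        # Covers both an empty result list and an href with too few pieces.
        return ''


def assign_slugs(won_game_left, won_game_right, left_hrefs, right_hrefs):
    """Return (winner_slug, loser_slug) based on which side has 'won-game'."""
    left_slug = slug_from_href(left_hrefs)
    right_slug = slug_from_href(right_hrefs)
    if won_game_left == 'won-game':
        return left_slug, right_slug
    if won_game_right == 'won-game':
        return right_slug, left_slug
    # Assumption: fail loudly rather than print an error code and continue.
    raise ValueError('neither side is marked won-game')


# Hypothetical hrefs, shaped only so that piece [4] is the slug.
left = ['/w/x/y/left-player/z']
right = ['/w/x/y/right-player/z']

winner_slug, loser_slug = assign_slugs('', 'won-game', left, right)
print(winner_slug, loser_slug)
```

Catching only IndexError is narrower than the `except Exception` in the post above, so unexpected failures (for example, a non-string href) would surface instead of silently becoming an empty slug.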
@bezchristo hey man, I need a favor. I am currently working on an ATP project and I need the 2018 and 2019 data, but I do not have the expertise to scrape it from the ATP Tour website. Is that something you can help me with, if it's not too big a lift?
I believe all the match data is up for those years. What are you asking for?
@bezchristo: Hi Christo, I have revised all the Python scripts and rescraped all the CSV files through the 2019 matches. In addition, I updated them for Python 3. I've addressed the "left" and "right" issue by the following lines:
You can close this issue if you have no other questions.
Hey man, great work on the Python scripts!
I have picked up an issue with the match stats though. The "scrape_match_stats" function in functions.py assumes that the winner is always on the left. This is not always the case.
Here is an example: stats
To get around this you can check which side has the "won-game" class which produces the checkmark next to their name. Here is the xpath for finding the class.
//table[@class='scores-table']/tbody/tr[1]/td[1]/@class
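As a rough illustration of that check, here is a minimal, self-contained sketch. It uses the standard library's xml.etree.ElementTree as a stand-in for the project's own parser, so the attribute is read with .get("class") rather than a trailing /@class step. The SAMPLE markup and the winner_side helper are made up for the example, and I'm assuming the second row (tr[2]) corresponds to the right-hand player.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup; the real page has more structure than this.
SAMPLE = """
<html><body>
<table class="scores-table"><tbody>
<tr><td class=""></td><td>Player A</td></tr>
<tr><td class="won-game"></td><td>Player B</td></tr>
</tbody></table>
</body></html>
""".strip()


def winner_side(root):
    """Return 'left', 'right', or None depending on which row has 'won-game'."""
    base = ".//table[@class='scores-table']/tbody"
    cells = {
        "left": root.find(base + "/tr[1]/td[1]"),
        "right": root.find(base + "/tr[2]/td[1]"),
    }
    for side, cell in cells.items():
        # A missing cell is treated the same as a cell without the class.
        if cell is not None and cell.get("class") == "won-game":
            return side
    return None


root = ET.fromstring(SAMPLE)
side = winner_side(root)
print(side)
```

With the winning side known, the winner and loser fields can be read from the matching left or right player elements instead of always taking the left one as the winner.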