vishaalagartha / basketball_reference_scraper

A python module for scraping static and dynamic content from Basketball Reference.
MIT License

can't get data on non Latin names like Nikola Jokić #29

Closed gilad235 closed 3 years ago

gilad235 commented 3 years ago

The code works perfectly fine for plain Latin names, but it fails for names containing characters such as ć. I first tried to get the data for "Nikola Jokic", which returned an empty result. I then tried again using the name exactly as it is stored in the same database, and the code failed at line 29 of utils.py. This is my code:

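from basketball_reference_scraper.box_scores import get_box_scores
from basketball_reference_scraper.players import get_game_logs

d = get_box_scores('2020-01-06', 'DEN', 'ATL')
niko = d['DEN']['PLAYER'][0]  # player name exactly as it appears in the box score
temp = get_game_logs(niko, "2019-10-10", "2020-10-10", playoffs=False)

giasemidis commented 3 years ago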

I had the same issue and found the solution. The problem is in the get_player_suffix function in utils.py, at line 30. The condition

if unicodedata.normalize('NFD', anchor.text).encode('ascii', 'ignore').decode("utf-8").lower() in name.lower():

should be

if unicodedata.normalize('NFD', anchor.text).encode('ascii', 'ignore').decode("utf-8").lower() in normalized_name.lower():

Then it works like a charm.
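
For anyone wondering why the one-word change matters, here is a minimal standalone sketch of the comparison. It is not the library's actual code: `strip_accents` is a hypothetical helper, and `anchor_text`, `name`, and `normalized_name` are stand-ins that I assume mirror the variables inside get_player_suffix. The point is that the anchor text has its accents stripped before the comparison, so it can only match a name that has been stripped the same way.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Hypothetical helper: decompose accented characters (NFD), then drop
    the combining marks by discarding everything that is not ASCII."""
    return unicodedata.normalize('NFD', text).encode('ascii', 'ignore').decode('utf-8')


# Assumed stand-ins for the values seen inside get_player_suffix.
# '\u0107' is the precomposed character "ć".
anchor_text = 'Nikola Joki\u0107'  # link text as it appears on the page
name = 'Nikola Joki\u0107'         # name passed in by the caller

normalized_anchor = strip_accents(anchor_text).lower()
normalized_name = strip_accents(name).lower()

# Original condition: accent-stripped anchor vs. raw name, so no match.
print(normalized_anchor in name.lower())             # False

# Fixed condition: both sides are accent-stripped, so they match.
print(normalized_anchor in normalized_name.lower())  # True
```

This also suggests why passing the plain "Nikola Jokic" behaves differently from passing the accented name, although the exact failure depends on the rest of the function, which is not shown in this thread.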

gilad235 commented 3 years ago

Amazing, thanks!