BruceJohnJennerLawso / scrap

Hockey stats analysis done by scraping the data to a CSV file, then processing/analyzing it with more Python.

Different Players with the same name #53

Open BruceJohnJennerLawso opened 7 years ago

BruceJohnJennerLawso commented 7 years ago

A case in point can be seen by running:

python playerSummary.py nhl "Taylor Hall"

That wouldn't be a big deal, except that the NHL player object counts the seasons of both the 1980s Taylor Hall and the 2010s Taylor Hall together in the player career summaries.

Also, I think there's a pair of Greg Adams on the 90s Canucks, and probably more cases of multiple players sharing one name.
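To hunt for the other cases, a rough heuristic could flag any name whose seasons are separated by an implausibly long gap. This is only a sketch: the `(name, season)` row shape, the `suspicious_names` helper, and the 10-year threshold are all assumptions, not anything in the repo. It also would not catch two same-named players with overlapping careers, like the Greg Adams case.

```python
from collections import defaultdict


def suspicious_names(rows, max_gap=10):
    """Return names whose seasons contain a gap larger than max_gap years.

    rows is assumed to be an iterable of (name, season) pairs; the real
    scraped CSV layout may differ.
    """
    seasons = defaultdict(set)
    for name, season in rows:
        seasons[name].add(int(season))

    flagged = []
    for name, years in seasons.items():
        ordered = sorted(years)
        # Compare each season with the next one played under the same name.
        if any(later - earlier > max_gap
               for earlier, later in zip(ordered, ordered[1:])):
            flagged.append(name)
    return sorted(flagged)
```

Anything it flags still needs checking by hand, since a long gap under one name can also be a single player's comeback.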

BruceJohnJennerLawso commented 7 years ago

A decent solution for this would be to write a script that runs right after the scraper and rewrites the ambiguous names in the scraped data, changing something like

"Taylor Hall"

to

"Taylor Hall (born 1991)"

and

"Taylor Hall (born 1964)"
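A minimal sketch of what that post-scrape script could look like, assuming each scraped row has the player name and a season year in known columns. The `RENAME_RULES` table, the helper names, the column positions, and the season ranges are all hypothetical placeholders (the ranges just mirror "80s" and "2010s" from above, not verified career spans).

```python
# Hypothetical rules: (scraped name, first season, last season, new name).
# Season ranges are illustrative placeholders, not verified career spans.
RENAME_RULES = [
    ("Taylor Hall", 1980, 1989, "Taylor Hall (born 1964)"),
    ("Taylor Hall", 2010, 2019, "Taylor Hall (born 1991)"),
]


def disambiguate(name, season):
    """Return the disambiguated name, or the name unchanged if no rule matches."""
    for old_name, first, last, new_name in RENAME_RULES:
        if name == old_name and first <= season <= last:
            return new_name
    return name


def rewrite_rows(rows, name_col=0, season_col=1):
    """Return copies of the rows with ambiguous names replaced.

    name_col and season_col are assumed column positions; adjust them to
    whatever layout the scraper actually writes.
    """
    fixed = []
    for row in rows:
        row = list(row)
        row[name_col] = disambiguate(row[name_col], int(row[season_col]))
        fixed.append(row)
    return fixed
```

Rows that match no rule pass through untouched, so the script is safe to rerun on files that were already fixed.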

BruceJohnJennerLawso commented 7 years ago

This would be a good solution because it is independent of the scraper itself, so it should keep working through any future scraper changes that require regenerating the CSV files.

BruceJohnJennerLawso commented 7 years ago

Maybe even easier: just tweak the scraper to grab the player ID that Hockey Reference uses and save it right after the player's full name, i.e.

http://www.hockey-reference.com/players/h/hallta02.html

becomes Taylor Hall,hallta02

and a different one

Taylor Hall,hallta01
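Pulling the ID out of the player page URL could be as simple as taking the file name without its extension. A sketch using only the standard library follows; `player_id_from_url` and `name_with_id` are hypothetical helper names, and it assumes every player URL ends in `<id>.html` like the one above.

```python
import posixpath
from urllib.parse import urlparse


def player_id_from_url(url):
    """Extract the ID from a player page URL, e.g. '.../h/hallta02.html' -> 'hallta02'."""
    path = urlparse(url).path
    filename = posixpath.basename(path)          # 'hallta02.html'
    player_id, _ext = posixpath.splitext(filename)
    return player_id


def name_with_id(name, url):
    """Build the two CSV fields to write for a player: full name, then ID."""
    return [name, player_id_from_url(url)]
```

For example, `",".join(name_with_id("Taylor Hall", "http://www.hockey-reference.com/players/h/hallta02.html"))` gives `Taylor Hall,hallta02`. Career summaries could then group on the ID rather than the name.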

BruceJohnJennerLawso commented 7 years ago

As mentioned in #93, #109 just made fixing this a hell of a lot harder.