BruceJohnJennerLawso / scrap

Hockey stats analysis done by scraping the data to CSV files, then processing and analyzing them with more Python.

Make the scraper(s) 100% free of manual workarounds #93

Open BruceJohnJennerLawso opened 7 years ago

BruceJohnJennerLawso commented 7 years ago

This is a big one, almost a mini-project in its own right.

Too many CSVs have been hand-edited to fix errors where the scraper fell short (e.g. the manual header row fixes for season and playoffs, the entire CSV for the 1918 Wanderers, etc.).

We really need to fix this, potentially with a full rewrite of both scrapers in Beautiful Soup, to eliminate the hacky workarounds required to deal with empty spaces in tables and similar cases.
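A rough sketch of what the Beautiful Soup approach could look like for the empty-cell problem. Everything here is hypothetical: the sample markup, the `standings` table id, and the `table_to_rows` / `rows_to_csv` helpers are made up for illustration and are not the existing scraper code. The idea is that every `<th>`/`<td>` yields exactly one CSV field, so an empty cell becomes an empty string instead of shifting the columns.

```python
import csv
import io

from bs4 import BeautifulSoup

# Made-up sample markup: one blank header cell and two blank data cells.
SAMPLE_HTML = """
<table id="standings">
  <tr><th>Team</th><th>GP</th><th>W</th><th></th></tr>
  <tr><td>Team A</td><td>6</td><td>1</td><td></td></tr>
  <tr><td>Team B</td><td>22</td><td></td><td>26</td></tr>
</table>
"""


def table_to_rows(html, table_id):
    """Return the table as a list of rows, one string per cell.

    Empty cells are kept as empty strings so column positions never shift.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id=table_id)
    if table is None:
        raise ValueError(f"no table with id {table_id!r}")
    rows = []
    for tr in table.find_all("tr"):
        cells = tr.find_all(["th", "td"])
        rows.append([cell.get_text(strip=True) for cell in cells])
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(table_to_rows(SAMPLE_HTML, "standings")))
```

With something like this, a blank header would still have to be named somewhere, but that could live in the scraper as an explicit rule rather than as a hand edit to the output CSV.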

BruceJohnJennerLawso commented 7 years ago

Also the manual header fixes for WHA guest teams.

BruceJohnJennerLawso commented 7 years ago

The simplest way of doing this is probably to get master as stable as possible, make a branch, and then repeatedly scrape the data and test until everything works in a single scrape->run step.
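One possible way to drive that scrape-and-test loop, sketched under assumptions: compare each freshly scraped CSV against the current hand-edited copy and list the rows that differ, so every remaining manual fix shows up as a concrete diff. The `diff_csv` helper and the sample data are hypothetical and not part of the repo.

```python
import csv
import io


def read_rows(csv_text):
    """Parse CSV text into a list of rows."""
    return list(csv.reader(io.StringIO(csv_text)))


def diff_csv(scraped_text, reference_text):
    """Return (line_number, scraped_row, reference_row) for each differing row.

    A row missing from one side is reported as None on that side.
    """
    scraped = read_rows(scraped_text)
    reference = read_rows(reference_text)
    diffs = []
    for i in range(max(len(scraped), len(reference))):
        s = scraped[i] if i < len(scraped) else None
        r = reference[i] if i < len(reference) else None
        if s != r:
            diffs.append((i + 1, s, r))
    return diffs


if __name__ == "__main__":
    # Made-up example: the scraper left a header blank that was fixed by hand.
    scraped = "Team,GP,\nTeam A,6,1\n"
    hand_edited = "Team,GP,W\nTeam A,6,1\n"
    for line_no, got, expected in diff_csv(scraped, hand_edited):
        print(f"line {line_no}: scraped {got} != hand-edited {expected}")
```

An empty diff across every CSV would mean the branch reproduces the hand-edited data straight from a scrape, which is the point where the manual copies could be dropped.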

BruceJohnJennerLawso commented 7 years ago

#109 just made this a hell of a lot harder.