Open steven-king opened 6 years ago
Peter solved my problem. I was targeting p to create the JSON when I could have backed up a couple of steps and just targeted th, tr, and td with Scrapy.
One problem that had me stuck for a LONG time was that one of my players (on the women's volleyball team) didn't have a stats page at all, so one of the Scrapy methods was failing because it received a null value instead of a string. I added a validation step to ensure the HTML passed to the Selector method was a string. The code that finally worked is here: https://github.com/elisabeth-parker/goHeels-scraping in the file called "GoHeels scrape EP (2).ipynb".
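For anyone hitting the same thing, here is a minimal sketch of that kind of validation step. It is not the code from the notebook above. It assumes the stats endpoint returns JSON with a "current_stats" key (as in the snippet elsewhere in this thread), and the function name `extract_stats_html` and the sample payloads are made up for illustration.

```python
import json


def extract_stats_html(raw_body):
    """Return the stats HTML as a string, or None if there is nothing usable.

    Assumption: raw_body is a JSON object with a "current_stats" key.
    """
    payload = json.loads(raw_body)
    html = payload.get("current_stats")
    # Only a non-empty string is safe to hand to scrapy.Selector(text=html).
    if isinstance(html, str) and html.strip():
        return html
    return None


# Made-up sample responses: one player with stats, one without a stats page.
with_stats = extract_stats_html('{"current_stats": "<table><tr><td>12</td></tr></table>"}')
without_stats = extract_stats_html('{"current_stats": null}')
```

The caller can then skip building a Selector whenever the function returns None, instead of letting the failure surface deeper in the parsing code.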
@cboliek
Your selector is empty.
html = json.loads(stats_sel.content.decode("utf-8"))["current_stats"]
#Remove this line:
stats_sel = scrapy.Selector(text=html)
#Change this line to reference html, not stats_sel:
# player_stats = stats_sel.css('.sidearm-table').xpath('string()').extract()
player_stats = html.css('.sidearm-table').xpath('string()').extract()
@elisabeth-parker That works. You can also use a try statement to check for stats.
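A rough sketch of the try approach, under the same assumption that each response is JSON with a "current_stats" key. The `parse_stats` helper, the player names, and the response bodies are hypothetical placeholders, not real roster data.

```python
import json


def parse_stats(raw_body):
    """Hypothetical parser: raises if the response has no usable stats."""
    html = json.loads(raw_body)["current_stats"]
    if not isinstance(html, str):
        raise TypeError("current_stats is not a string")
    return html


# Made-up responses keyed by player: one good, one null, one missing the key.
responses = {
    "player_a": '{"current_stats": "<table></table>"}',
    "player_b": '{"current_stats": null}',
    "player_c": "{}",
}

scraped, skipped = {}, []
for name, body in responses.items():
    try:
        scraped[name] = parse_stats(body)
    except (KeyError, TypeError, ValueError):
        # No stats for this player: record the name and keep going.
        skipped.append(name)
```

Catching only the specific exceptions keeps unrelated bugs visible, and the `skipped` list makes it easy to see which players had no stats page.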
I found it hard to follow the code from the example, so I tried to write my own "simpler" code. I basically wrote methods to pull out the data and then called the methods in a dictionary. However, I got my columns and data with separate methods, and I can't find a way to zip them together. Is there a way to do that at this point, or should I just change my code completely? https://github.com/aryaswanie/data/blob/master/Women's%20Volleyball.ipynb https://github.com/aryaswanie/data/blob/master/scraped_players.json
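One possible way to combine them without rewriting everything, sketched with Python's built-in `zip`. The column names and rows below are invented for illustration. It assumes one list of column headers and one list of values per player, in the same order as the headers.

```python
import json

# Hypothetical output of the "columns" method.
columns = ["name", "sets_played", "kills"]

# Hypothetical output of the "data" method: one list of values per player.
rows = [
    ["Player A", "10", "25"],
    ["Player B", "8", "14"],
]

# Pair each header with the value in the same position, one dict per player.
players = [dict(zip(columns, row)) for row in rows]

# Serialize the list of dicts to a JSON string.
as_json = json.dumps(players, indent=2)
```

Note that `zip` stops at the shorter input, so a row with missing cells silently yields a dict with fewer keys. It may be worth checking `len(row) == len(columns)` first.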
Please explain your problems and link to the repo containing your IPython Notebook in a comment below.