everypolitician-scrapers / spain_congreso_es

Details of members of the Spanish Congress from the official website congreso.es
https://morph.io/everypolitician-scrapers/spain_congreso_es

Refactor to use ScrapedPage subclasses #14

Open chrismytton opened 7 years ago

chrismytton commented 7 years ago

This changes the scraper to use ScrapedPage subclasses, which now handle stripping the session information from the URL.
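As a rough illustration of the idea (not the code in this PR), the sketch below shows a page class that strips session information from its URL before use. It relies only on Ruby's standard `uri` library. The class names `SessionlessPage` and `MemberPage`, the `SESSION_PARAMS` list, and the example URL are all hypothetical; the real scraper subclasses ScrapedPage, and the session information on congreso.es may be encoded differently.

```ruby
require 'uri'

# Hypothetical stand-in for a page base class; this is NOT the scraped_page gem's API.
class SessionlessPage
  # Query parameters assumed, for illustration only, to carry session state.
  SESSION_PARAMS = %w[jsessionid sid].freeze

  attr_reader :url

  def initialize(url)
    @url = strip_session(url)
  end

  private

  # Return the URL with any session-related query parameters removed.
  def strip_session(raw)
    uri = URI.parse(raw)
    return raw if uri.query.nil?

    kept = URI.decode_www_form(uri.query)
              .reject { |key, _value| SESSION_PARAMS.include?(key.downcase) }
    uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
    uri.to_s
  end
end

# Page-specific subclasses inherit the URL clean-up.
class MemberPage < SessionlessPage; end

page = MemberPage.new('http://example.com/member?id=171&sid=abc123')
puts page.url # prints "http://example.com/member?id=171"
```

Keeping the clean-up in the base class means every page type gets a stable URL without repeating the logic in the scraper itself.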

Notes to reviewer

To see what the scraper is doing, run it with verbose output:

VERBOSE=1 bundle exec ruby scraper.rb

The archiver is not wired in yet, because I've hit a couple of scraped_page_archive bugs relating to VCR. I think this is because the archive branch on this repo contains some non-standard cassettes, which made it a good test case.

Notes to merger

Set https://morph.io/everypolitician-scrapers/spain_congreso_es to auto-run once this is merged.

chrismytton commented 7 years ago

@tmtmtmtm 👀