ft-interactive / us-elections-polltracker

2016 US presidential election polling
https://ig.ft.com/us-elections/polls

Review apps: avoid scraping on startup every time the PR changes #177

Open callumlocke opened 8 years ago

callumlocke commented 8 years ago

Currently, a review app runs the scraper every time it starts up. This is annoying if you add another small commit to the PR (e.g. a small style tweak) and have to wait for the scraper to finish before you can see the review app again.

Short-term solution: modify the conditional 'SCRAPE_ON_STARTUP' logic in server/index.js so that it first checks the lastupdated date (and handles the case of the table not existing) before running the scraper. If the scraper has run recently, skip it.
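A rough sketch of what that check could look like. Everything here is an assumption for illustration: the function names, the one-hour freshness window, and the idea that the caller passes in whether 'SCRAPE_ON_STARTUP' is enabled and a function that looks up the lastupdated date. It is not the actual code in server/index.js.

```javascript
// Assumed freshness window: skip the scrape if data is under an hour old.
const MAX_AGE_MS = 60 * 60 * 1000;

// Hypothetical helper: decide whether the startup scrape is needed.
// `scrapeEnabled` is the result of the existing SCRAPE_ON_STARTUP check;
// `lastUpdated` is a Date, or null when the table is missing or empty.
function shouldScrapeOnStartup(scrapeEnabled, lastUpdated, now = new Date()) {
  if (!scrapeEnabled) return false;
  const isValidDate =
    lastUpdated instanceof Date && !Number.isNaN(lastUpdated.getTime());
  if (!isValidDate) return true; // never scraped, or table does not exist
  return now.getTime() - lastUpdated.getTime() > MAX_AGE_MS;
}

// Hypothetical wrapper: treat any lookup failure (e.g. a missing table)
// the same as "never scraped" by returning null.
async function getLastUpdatedOrNull(fetchLastUpdated) {
  try {
    const value = await fetchLastUpdated();
    return value ? new Date(value) : null;
  } catch (err) {
    return null;
  }
}
```

On startup, the server would call `getLastUpdatedOrNull` with whatever query reads the lastupdated date, then only run the scraper when `shouldScrapeOnStartup` returns true.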

Possibly better solution: break out the scraper into its own microservice, and change polltracker to query a JSON endpoint on that new microservice to get everything it needs. Then this app would be much faster and lighter and single-focus, and we wouldn't be migrating databases and re-scraping RCP for every little website change.

kavanagh commented 8 years ago

At the moment I think making the scraper faster is the most expedient approach.

There are costs to splitting the app too. A few off the top of my head:

kavanagh commented 8 years ago

If it really does get slow again, I think a mock data set could be an option for review apps. There could be a way to opt into a mock data set or let the scraper run.
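For illustration, the opt-in could be as small as an environment flag. This is only a sketch: `USE_MOCK_DATA`, `loadMockData` and `runScraper` are made-up names standing in for whatever the app would actually use.

```javascript
// Assumption: USE_MOCK_DATA is a hypothetical env var, not one the app defines.
function pickDataSource(env = process.env) {
  return env.USE_MOCK_DATA === 'true' ? 'mock' : 'scraper';
}

// Hypothetical startup hook: `loadMockData` and `runScraper` are placeholders
// for the functions that would populate the database.
function populateData(env, { loadMockData, runScraper }) {
  return pickDataSource(env) === 'mock' ? loadMockData() : runScraper();
}
```

A review app could then set the flag in its environment config, while production leaves it unset and keeps scraping as before.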