nprapps / elections18-general

2018 midterm election back-end: Associated Press data ETL, database, admin panel, and JSON output; iteration upon 2016 GE work
MIT License

Rearchitect race loading to improve system speed #4

Open mileswwatkins opened 6 years ago

mileswwatkins commented 6 years ago

This is the lead-in to some performance-improvement work. On my local machine, I'm seeing render times of around 18 seconds, but we could be calling the AP API as often as every 6 seconds! Closing that gap would make our graphics slightly more timely (the maximum improvement would maybe be 10 seconds), which would help us against the competition.

The first step is a quick breakdown (using debug logging) of how long each step in the process takes.
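One way to get that breakdown is a small timing helper wrapped around each stage. This is only a rough sketch: `timed_step`, `slowest_first`, and the logger name are hypothetical and don't exist in this repo, and the stage names in the usage note are placeholders for whatever the real fetch/load/render calls turn out to be.

```python
import logging
import time
from contextlib import contextmanager

# Hypothetical logger name; the project may already have its own logger.
logger = logging.getLogger("race_loader_timing")


@contextmanager
def timed_step(name, timings):
    """Log and record how long the wrapped block takes, in seconds."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        # Accumulate, so a step that runs several times per cycle is summed.
        timings[name] = timings.get(name, 0.0) + elapsed
        logger.debug("%s took %.3f s", name, elapsed)


def slowest_first(timings):
    """Return (step, seconds) pairs sorted from slowest to fastest."""
    return sorted(timings.items(), key=lambda item: item[1], reverse=True)
```

Wrapping each stage in `with timed_step("fetch", timings): ...` (and likewise for the DB load, render, and S3 upload) and then logging `slowest_first(timings)` at the end of a cycle should show where the 18 seconds go. The debug lines only appear if logging is configured at `DEBUG` level.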

cc @lindamood

mileswwatkins commented 6 years ago

I talked with the 2016 maintainers, and it seems this slow-ish pacing comes from the fact that all races are fetched, ingested, and posted to S3 at every interval; the DB load of the CSV is likely the biggest slowdown.

There are ways to handle this (for example, by working with only the subset of data that has been updated since the last call), but they'd require substantial rearchitecting. (Before rearchitecting, we'd need to implement a very good test harness.)
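To make the "only what's been updated" idea concrete, here is a minimal sketch of one possible approach, not a design anyone has agreed on: fingerprint each race record and pass along only the ones whose fingerprint changed since the previous call. The function names and the `raceid` key are assumptions for illustration; the real records and their identifiers may look different.

```python
import hashlib
import json


def fingerprint(record):
    """Return a stable hash of one race record (a JSON-serializable dict)."""
    # Sorting keys makes the hash independent of dict ordering.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_races(records, seen, key="raceid"):
    """Return the records whose content differs from the previous call.

    `seen` maps race ID -> fingerprint and is updated in place.
    `key` is a hypothetical field name for the race identifier.
    """
    changed = []
    for record in records:
        digest = fingerprint(record)
        if seen.get(record[key]) != digest:
            seen[record[key]] = digest
            changed.append(record)
    return changed
```

Note that this still fetches everything on each call; it would only let the DB load and the S3 upload skip unchanged races. The `seen` mapping would also need to survive between intervals (in memory or in the database), which is part of why a good test harness would have to come first.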