participedia / api

Website and API for Participedia V3
https://participedia.net

CSV Export rebuild #1139

Closed paninee closed 1 year ago

paninee commented 1 year ago

As our data has grown, the CSV export function has become slow enough to trigger HTTP timeouts. We added a quick fix: we upload a CSV dump to AWS and download that dump when the user clicks the "Results CSV" button. There are two obvious issues with that hack:

  1. The CSV can be out of date, since the backup dump is run manually rather than on a schedule.
  2. The user cannot filter their download: whatever filters the user applied are not applied to the CSV dump.

The scalable fix is to rebuild the CSV export so it properly handles our expanded database. The feature would work much like a regular database backup: the actual export runs in the background, and the HTTP response we return to the user only means that we received their request to generate the CSV.
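A minimal sketch of that request/acknowledge pattern, assuming an Express route and an in-memory stand-in for a real job queue (the `/api/csv-exports` path, `exportQueue`, and `exportRecords` names are hypothetical, not the existing Participedia code):

```typescript
import express from "express";

const app = express();

// Hypothetical in-memory stand-ins for a real job queue and exports table;
// the production version would use a proper queue and the database.
const exportQueue: Array<{ exportId: string; filters: unknown }> = [];
const exportRecords = new Map<
  string,
  { id: string; status: "processing" | "finished"; requestedAt: Date }
>();

// Kick off a CSV export. The response only acknowledges that the request
// was received; the actual CSV generation happens in a background worker.
app.post("/api/csv-exports", express.json(), (req, res) => {
  const exportId = `csv-${Date.now()}`;
  exportRecords.set(exportId, {
    id: exportId,
    status: "processing",
    requestedAt: new Date(),
  });

  // Queue the work together with whatever search filters the user applied,
  // so the generated CSV matches their filtered results.
  exportQueue.push({ exportId, filters: req.body?.filters });

  // 202 Accepted: "we received your request", not "the CSV is ready".
  res.status(202).json({ exportId });
});

app.listen(3000);
```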

User flow

This is the flow we want to follow: https://www.loom.com/share/d9a69fcdf2c0482a96ea5a418a3baa28?sid=a9920bb3-fe0c-4eb2-bdf8-612d10016106. For Participedia, this would be:

  1. The user filters search results
  2. The user clicks on Results CSV
  3. The system sends a request to our backend API to kick off the export, then takes the user to a CSV Exports page (new page)
  4. The CSV Exports page shows the backup ID (auto-generated), Type, Requested, Finished, and Download and Delete buttons, very similar to the Postgres backup list in the Loom video (a possible record shape is sketched after this list).
  5. Display "Processing" in the Finished column, and hide the Download/Delete buttons while the export is processing.
  6. The user clicks the Download button to download the CSV created and stored on AWS.
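One possible shape for the records behind that page, plus a simple polling fetch so the "Processing" row flips to a Download link, is sketched below (the `CsvExport` type, its field names, and the `/api/csv-exports` path are assumptions for illustration, not the final API):

```typescript
// Hypothetical record backing one row of the CSV Exports page.
interface CsvExport {
  id: string;                 // auto-generated backup ID
  type: "search-results";     // Type column
  requestedAt: string;        // Requested column (ISO timestamp)
  finishedAt: string | null;  // Finished column; null renders as "Processing"
  downloadUrl: string | null; // set once the CSV is on AWS; enables Download
}

// Poll the exports list so in-progress rows update without a page reload.
async function pollExports(render: (rows: CsvExport[]) => void): Promise<void> {
  const res = await fetch("/api/csv-exports");
  const rows: CsvExport[] = await res.json();
  render(rows);
  if (rows.some((row) => row.finishedAt === null)) {
    setTimeout(() => pollExports(render), 5000); // still processing; check again
  }
}
```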

Backend logic

Once the user clicks on "Results CSV", the frontend calls the API to generate the CSV, the API
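The rest of the backend steps are not spelled out above, but a rough sketch of the background side, assuming a worker that streams the filtered query results to a CSV file and uploads it to S3 (all function names, the bucket, and the region below are hypothetical), could look like:

```typescript
import { createWriteStream } from "node:fs";
import { readFile } from "node:fs/promises";
import { once } from "node:events";
import { S3Client, PutObjectCommand } from "@aws-sdk/client-s3";

// Placeholders for the real data-access layer (hypothetical names).
declare function queryFilteredResults(
  filters: Record<string, unknown>
): AsyncIterable<Record<string, unknown>>;
declare function markExportFinished(exportId: string, s3Key: string): Promise<void>;

const s3 = new S3Client({ region: "us-east-1" }); // assumed region

// Naive CSV line builder; a real implementation would use a CSV library.
const toCsvLine = (row: Record<string, unknown>): string =>
  Object.values(row).map((value) => JSON.stringify(value ?? "")).join(",") + "\n";

// Runs in the background, outside the HTTP request/response cycle.
async function processExport(
  exportId: string,
  filters: Record<string, unknown>
): Promise<void> {
  const path = `/tmp/${exportId}.csv`;
  const out = createWriteStream(path);

  // Stream the filtered rows to disk instead of building one huge string.
  for await (const row of queryFilteredResults(filters)) {
    out.write(toCsvLine(row));
  }
  out.end();
  await once(out, "finish");

  // Upload the finished CSV to AWS so the Download button can fetch it later.
  await s3.send(
    new PutObjectCommand({
      Bucket: "participedia-csv-exports", // assumed bucket name
      Key: `${exportId}.csv`,
      Body: await readFile(path),
    })
  );

  // Flip the record out of "Processing" so the CSV Exports page shows Download.
  await markExportFinished(exportId, `${exportId}.csv`);
}
```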