datamade / openness-project-nmid

Money Trail NM - New Mexico In Depth's Campaign Finance Explorer
https://moneytrailnm.com

Investigate Heroku connection speed #210

Open · hancush opened this issue 2 months ago

hancush commented 2 months ago

Imports periodically time out due to very slow connections to Heroku. Heroku Postgres databases are colocated with many other project databases on a shared Postgres server. Email Heroku support, and if the issue cannot be addressed on their end, consider migrating the database to RDS.

hancush commented 1 month ago

Heroku support claims no issues on the Postgres side. Opened a ticket with GitHub: https://support.github.com/ticket/personal/0/2990677

Another option to consider is making a larger runner available for imports, though the cost is higher: https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions#per-minute-rates

hancush commented 1 month ago

Quick response from GitHub support. tl;dr - Resources (and many other things) can indeed vary between runs:


Hi hannah, 

Thank you for reaching out to GitHub! Yes, the available computing resources can indeed vary between runners on GitHub Actions, which could explain the variability in job performance you're seeing. Here are a few key factors that might affect the speed of your GitHub Actions jobs: 

I hope this helps! Please let us know if you have any questions, or if we can help with anything else at the moment.

Cheers,
James

fgregg commented 1 month ago

okay, so a few thoughts to explore:

  1. we try the more powerful runners. if we did the cheapest step-up, that could be around $100 / month (assuming 3 hours per year-import)
  2. we could do a self-hosted runner option, which we've done before and which would likely be cheaper than the beefier, native github runners
  3. within github actions, we could detect that we are in a slow environment and restart the action (see the first sketch after this list): https://github.com/orgs/community/discussions/67654#discussioncomment-8038649
  4. we could split the import job into smaller chunks. right now we are splitting them into year chunks, but we could split them into 6-month or 3-month chunks (second sketch below)
  5. we could rewrite the import so there is less over-the-network communication (more batching). the import code used to be batchier, but it led to memory problems when we were running the import on a heroku instance. in a github action we could make a different memory/time tradeoff (third sketch below).
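
to make options 3-5 concrete, here are some rough, untested sketches. everything named below (DATABASE_URL, the thresholds, the model import) is an assumption to adjust, not a reference to our actual code.

for option 3, the workflow could run a pre-flight step that times a handful of trivial round trips to the database and fails fast if they're slow, so the job can be restarted per the linked discussion. the trip count and the 1-second threshold are guesses to calibrate against runs that finished in time:

```python
import os
import sys
import time

import psycopg2

# assumes DATABASE_URL points at the Heroku Postgres instance
conn = psycopg2.connect(os.environ["DATABASE_URL"])

start = time.perf_counter()
with conn.cursor() as cur:
    for _ in range(20):  # 20 trivial round trips
        cur.execute("SELECT 1")
        cur.fetchone()
elapsed = time.perf_counter() - start
conn.close()

# threshold is a guess; calibrate against runs that finished comfortably
if elapsed > 1.0:
    print(f"20 round trips took {elapsed:.2f}s; runner looks slow, bailing early")
    sys.exit(1)
```

for option 4, chunking is mostly a matter of generating smaller date ranges for the existing per-year import to iterate over, e.g. quarters:

```python
from datetime import date

def quarter_chunks(year):
    """split one year-import into four quarter-long (start, end) ranges"""
    starts = [date(year, month, 1) for month in (1, 4, 7, 10)]
    ends = starts[1:] + [date(year + 1, 1, 1)]
    return list(zip(starts, ends))

# quarter_chunks(2023)[0] == (date(2023, 1, 1), date(2023, 4, 1))
```

for option 5, the usual django-flavored shape is to build model instances lazily and flush them with bulk_create, which collapses each batch into a handful of INSERTs; batch_size is the knob for the memory/time tradeoff mentioned above. Transaction here is a stand-in for whatever model the import actually populates:

```python
from itertools import islice

from camp_fin.models import Transaction  # assumed import path; adjust to the real model

def batched(iterable, size):
    """yield lists of at most `size` items, bounding memory per batch"""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch

def import_rows(rows, batch_size=5000):
    # round trips now scale with row count / batch_size instead of row count
    row_objects = (Transaction(**row) for row in rows)
    for batch in batched(row_objects, batch_size):
        Transaction.objects.bulk_create(batch)
```
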
hancush commented 1 month ago

@fgregg I definitely think a batchier job would be the most cost-effective and least complex option in the long term.