heroku / roadmap

This is the public roadmap for Salesforce Heroku services.
[Request] Add support for pg_repack extension on Heroku Postgres #123

Open jeffblake opened 1 year ago

jeffblake commented 1 year ago

Required Terms

What service(s) is this request for?

Postgres

Tell us about what you're trying to solve. What challenges are you facing?

The title pretty much says it all. Without pg_repack, the only ways to reclaim disk space are `REINDEX INDEX CONCURRENTLY`, `VACUUM FULL`, and `pg:copy`; the latter two are pretty invasive and require downtime. `pg_repack` would be a big help! There is also `pg_squeeze`.

https://github.com/reorg/pg_repack
https://github.com/cybertec-postgresql/pg_squeeze
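Of the options listed above, `REINDEX INDEX CONCURRENTLY` is the only one that avoids downtime, and it reclaims space from indexes only, not from the table itself. A minimal sketch of generating those statements for a batch of indexes follows. The helper name `reindex_statements` and the index names are hypothetical, and the sketch assumes PostgreSQL 12 or later, where `REINDEX ... CONCURRENTLY` is available.

```python
def reindex_statements(index_names):
    """Build one REINDEX INDEX CONCURRENTLY statement per index name.

    Hypothetical helper: it only builds SQL text and does not connect
    to a database. Names are treated as unqualified identifiers.
    """
    statements = []
    for name in index_names:
        # Double any embedded double quotes so the identifier is quoted safely.
        quoted = '"' + name.replace('"', '""') + '"'
        statements.append(f"REINDEX INDEX CONCURRENTLY {quoted};")
    return statements


if __name__ == "__main__":
    # Example index names are placeholders, not taken from a real schema.
    for stmt in reindex_statements(["index_orders_on_user_id", "index_orders_on_created_at"]):
        print(stmt)
```

Each statement has to be run outside a transaction block, for example one at a time from a psql session, because `REINDEX ... CONCURRENTLY` cannot run inside a transaction.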

zmalone commented 1 year ago

This would be a huge improvement, probably at marginal cost. Right now, it's extremely hard to manage table bloat on Heroku, and other third-party Postgres extensions already exist on the platform.

72L commented 7 months ago

This is likely counter to Heroku's incentives.

For others here, what else can we do to reduce bloat? Is there a way to run a sequence of psql commands that would achieve something similar to what pg_repack does?

sasharevzin commented 7 months ago

> This is likely counter to Heroku's incentives.
>
> For others here, what else can we do to reduce bloat? Is there a way to run a sequence of psql commands that would achieve something similar to what pg_repack does?

You might look at this project: https://github.com/shayonj/pg-osc

locofocos commented 2 months ago

We're slowly facing scaling challenges with our Postgres database. Our multi-tenant software organizes our data by tenant_id, inside a single monolithic database.

Heroku supports declarative partitioning to help with this (https://devcenter.heroku.com/articles/increasing-performance-of-large-tables-using-partitioning), but the schema requirements (around unique constraints, PK constraints, FKs, etc.) presented numerous challenges when we tried it.

We've tried running `CLUSTER` (for example, `CLUSTER VERBOSE products USING index_products_on_tenant_id`) and it made a massive improvement to our query performance in a stale database. But `CLUSTER` requires long downtime, and our application can't tolerate that.

pg_repack would give us the effect of `CLUSTER` without the downtime. By not supporting it, Heroku doesn't really offer the scaling we need at a competitive price point.

As it stands, AWS offers a Postgres database with support for pg_repack (https://aws.amazon.com/blogs/database/remove-bloat-from-amazon-aurora-and-rds-for-postgresql-with-pg_repack/), so migrating off of Heroku seems like the only viable long-term option for my team.
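For context, on a platform where the pg_repack extension and its client are installed, the online equivalent of the `CLUSTER` example above might be invoked roughly as sketched below. The helper `pg_repack_command` and the database name `mydb` are hypothetical; the `--dbname`, `--table`, `--order-by`, and `--no-order` options are from the pg_repack client. pg_repack still takes brief exclusive locks at the start and end of a run, so this is a sketch of the idea rather than a guarantee of zero blocking.

```python
import shlex


def pg_repack_command(dbname, table, order_by=None):
    """Build a pg_repack argument list for a single table.

    Hypothetical helper: it only assembles the command and does not run it.
    """
    args = ["pg_repack", f"--dbname={dbname}", f"--table={table}"]
    if order_by:
        # --order-by rewrites the table sorted by these columns (online CLUSTER).
        args.append(f"--order-by={order_by}")
    else:
        # --no-order rewrites the table without sorting (online VACUUM FULL).
        args.append("--no-order")
    return args


if __name__ == "__main__":
    # "mydb" is a placeholder; products and tenant_id come from the example above.
    print(shlex.join(pg_repack_command("mydb", "products", order_by="tenant_id")))
```

Returning a list rather than a single string keeps the arguments safe to hand to `subprocess.run` without shell quoting.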

jbrown-heroku commented 2 months ago

Thank you @jeffblake for raising this.

We now have pg_repack on the roadmap for the new Heroku database platform we are currently building. The release is targeted for early 2025.