Squarespace / pgbedrock

Manage a Postgres cluster's roles, role memberships, schema ownership, and privileges
https://pgbedrock.readthedocs.io/en/latest/
Other
315 stars 35 forks source link

Execute SQL only with option --live #50

Open tolq15 opened 5 years ago

tolq15 commented 5 years ago

The reason behind this change: I have several databases with pretty big numbers of users, schemas and tables. At first, I've run pgbedrock with option 'generate' to generate configuration files. Then I've run pgbedrock with 'configure --check' parameter with output redirected to file. Practically all Postgres session hanged up. When I killed such sessions (using pg_terminate_backend), I found file generated by pgbedrock with pretty big number of SQL commands (up to 300.000 lines). It were commands like grant REFERENCES/TRIGGER/TRUNCATE ... and alter DEFAULTE PRIVILEGE ... TO I supposed that the problem was in 'rollback' all these command. To fix this, I tried to skip execution of these commands if parameter 'live' set to 'False'. My tests showed that this really fix the problem, and all output files were generated as expected. This change also make pgbedrock run faster with '--check' option because we do not need to execute generated SQL statements and then roll them back. I hope this change will make pgbedrock a little bit better.

coveralls commented 5 years ago

Pull Request Test Coverage Report for Build 183


Totals Coverage Status
Change from base Build 179: -0.002%
Covered Lines: 1265
Relevant Lines: 1301

💛 - Coveralls
zcmarine commented 5 years ago

The issue with this is that the changes that each pgbedrock module determines are influenced by the changes that the prior modules make. For example, the analyze_privileges module runs last and will grant, revoke, or do nothing on a table if you own it, and that ownership is based on the analyze_ownerships module that runs before it. It's because these depend on each other that everything is run in a transaction in check mode: so we get an accurate representation of the changes that will be needed.

To avoid things clogging up because of too many commands being rolled back, maybe you could run pgbedrock configure --check with only the first module (attributes), then if it looks good productionize it, then do the same step-by-step with the subsequent modules. Alternatively I could see supporting this no-transaction approach shown in this PR but only if we're running configure --check with only a single module, since otherwise we can't guarantee that the results are correct.

I'll leave this PR open in case you want to try going one of those routes.

tolq15 commented 5 years ago

Thank you, Zach.

On Friday, April 12, 2019, 9:29:06 a.m. MDT, Zach Marine <notifications@github.com> wrote:  

The issue with this is that the changes that each pgbedrock module determines are influenced by the changes that the prior modules make. For example, the analyze_privileges module runs last and will grant, revoke, or do nothing on a table if you own it, and that ownership is based on the analyze_ownerships module that runs before it. It's because these depend on each other that everything is run in a transaction in check mode: so we get an accurate representation of the changes that will be needed.

To avoid things clogging up because of too many commands being rolled back, maybe you could run pgbedrock configure --check with only the first module (attributes), then if it looks good productionize it, then do the same step-by-step with the subsequent modules. Alternatively I could see supporting this no-transaction approach shown in this PR but only if we're running configure --check with only a single module, since otherwise we can't guarantee that the results are correct.

I'll leave this PR open in case you want to try going one of those routes.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or mute the thread.