movetokube / postgres-operator

Postgres operator for Kubernetes
MIT License
167 stars 59 forks source link

Schema creation failing #121

Closed apeschel closed 1 year ago

apeschel commented 1 year ago

The schema creation still fails, on at least AWS, even after the latest patch to fix it. I realized I had made a mistake with the database connection that was used to issue the ALTER command - that might be the cause.

@hitman99 If you have any recommendations on how to debug the code on a live cluster, it should make it easier to figure out what is going on.

hitman99 commented 1 year ago

It's strage as the schema stuff was working before just fine. For debugging stuff: just run the operator locally (on IDE) and provide it kubeconfig to connect to the cluster

apeschel commented 1 year ago

Here's the related PR: https://github.com/movetokube/postgres-operator/pull/120

apeschel commented 1 year ago

@hitman99

It looks like there are more issues with the crippled superuser access on RDS. The schema creation is now failing on the permissions step. Here's an example:

2023-07-20T09:53:44.305-0700    INFO    controller_postgres     ALTER DEFAULT PRIVILEGES FOR ROLE "connectors_role" IN SCHEMA "service" GRANT SELECT ON TABLES TO "connectors-reader"        {"Request.Namespace": "element-platform", "Request.Name": "connectors-psql"}
2023-07-20T09:53:44.387-0700    ERROR   controller_postgres     Could not give connectors-reader permissions "SELECT"   {"Request.Namespace": "element-platform", "Request.Name": "connectors-psql", "error": "pq: must be member of role \"connectors_role\""}

Attempting to run the command directly against the RDS cluster confirms the same error:

postgres=> ALTER DEFAULT PRIVILEGES FOR ROLE "connectors_role" IN SCHEMA "service" GRANT SELECT ON TABLES TO "connectors-reader";
ERROR:  must be member of role "connectors_role"

I was able to track down some documentation for what commands the rds_superuser role allows:

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.MasterAccounts.html

CREATE ROLE, CREATE DB, PASSWORD VALID UNTIL INFINITY, CREATE EXTENSION, ALTER EXTENSION, DROP EXTENSION, CREATE TABLESPACE, ALTER <OBJECT> OWNER, CHECKPOINT, PG_CANCEL_BACKEND(), PG_TERMINATE_BACKEND(), SELECT PG_STAT_REPLICATION, EXECUTE PG_STAT_STATEMENTS_RESET(), OWN POSTGRES_FDW_HANDLER(), OWN POSTGRES_FDW_VALIDATOR(), OWN POSTGRES_FDW, EXECUTE PG_BUFFERCACHE_PAGES(), SELECT PG_BUFFERCACHE

I'm not familiar enough with PostgreSQL permissions to know if there's a way to convert the restricted command to an equivalent using the above allowed commands.

I think these are the viable options I see to choose from at the moment:

I think 2 would require adding a bunch of special case logic to handle RDS.

3 would require juggling between users to run each command.

I'm going to see if I can figure out a way to rewrite the command to get it to work as rds_superuser, but I'm not sure if it's possible. Any thoughts on which option you would prefer?

hitman99 commented 1 year ago

Are you directly using the master postgres user in the operator or some other user?

hitman99 commented 1 year ago

Could you describe your RDS and opserwtor setup? What users and roles used where? It would help me to understand what is the issue.

apeschel commented 1 year ago

@hitman99 We're just trying to use a more or less vanilla RDS deployment. We have the postgres-operator set up to use the RDS master user, which is assigned therds_superuser role.

The issue it looks like we're running into is that RDS doesn't let you access the true postgres master user with the true postgres superuser role - they only give you access to a crippled rds_superuser role instead that doesn't seem to have the ability to run some of the commands used by the postgres-operator.

The resources we have defined aren't anything special either - they're pretty basic. Here's an example:

apiVersion: db.movetokube.com/v1alpha1
kind: Postgres
metadata:
  name: connectors-psql
spec:
  database: connectors
  dropOnDelete: false
  masterRole: connectors_role
  schemas:
    - service
  # List of extensions that should be created in the database (optional)
  extensions:
    - pg_stat_statements

and the user:

apiVersion: db.movetokube.com/v1alpha1
kind: PostgresUser
metadata:
  name: connectors-user
spec:
  role: connectors
  database: connectors-psql # This references the Postgres CR
  secretName: db-psql-connectors-creds
  privileges: OWNER
apeschel commented 1 year ago

This set up works fine on Azure, but fails with AWS RDS.

hitman99 commented 1 year ago

Ok, I see what the problem is now. The last two releases were probably unnecessary as the fix was in the wrong place. With Postgres on RDS you will never get the true superuser access, as you already figured out. What is missing in the operator's code is this, when setting up owner's role: grant <ownerrole> to <adminRole>;

Could you try adding this to aws implementation of role setup and see if that helps?

apeschel commented 1 year ago

@hitman99 Will do 👍

apeschel commented 1 year ago

@hitman99 I managed to fix this issue with https://github.com/movetokube/postgres-operator/pull/122

apeschel commented 1 year ago

We can revert the other two commits, if you would like. You're right that the changes in them don't need to exist, and this is the correct fix.

hitman99 commented 1 year ago

Great to hear this. Would you mind adding reverting the changes in the same PR? Just add a separate commit undoing ehat was done since version 1.2.3

apeschel commented 1 year ago

@hitman99 updated

hitman99 commented 1 year ago

Thanks! Will release 1.2.5 today

apeschel commented 1 year ago

This is fixed with 1.2.5, thank you!