world-federation-of-advertisers / cross-media-measurement

A privacy-centric system for cross-publisher, cross-media ads measurement through secure multiparty computation.
https://halo.wfanet.org/
Apache License 2.0

EKS Duchy internal server periodically loses Postgres access #1648

Open SanjayVas opened 5 months ago

SanjayVas commented 5 months ago
io.r2dbc.postgresql.ExceptionFactory$PostgresqlAuthenticationFailure: [28P01] password authentication failed for user "postgres"

According to @YuhongWang-Amazon, this is related to a password auto-rotation feature which results in a new K8s Secret being generated. As the Halo update process assumes that all K8s objects are only updated on apply, this feature cannot be used.
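
The failure mode described above can be illustrated with a toy model. This is only a sketch, not Halo code: `FakeDatabase`, `FakeServer`, `rotate`, and the `postgres-password` key are hypothetical names, and it assumes the running server keeps whatever credential it read at startup and does not pick up a regenerated Secret until it is redeployed.

```python
from dataclasses import dataclass


class AuthenticationFailure(Exception):
    """Stands in for the Postgres 28P01 error in this toy model."""


@dataclass
class FakeDatabase:
    """Hypothetical database that accepts exactly one current password."""

    password: str

    def connect(self, user: str, password: str) -> str:
        if password != self.password:
            raise AuthenticationFailure(
                f'[28P01] password authentication failed for user "{user}"'
            )
        return "connected"


class FakeServer:
    """Hypothetical server that snapshots its credential at startup."""

    def __init__(self, secret_store: dict):
        # Assumption: the credential is read once and never refreshed.
        self._password = secret_store["postgres-password"]

    def query(self, db: FakeDatabase) -> str:
        return db.connect("postgres", self._password)


def rotate(db: FakeDatabase, secret_store: dict, new_password: str) -> None:
    """Models auto-rotation: the database and the stored secret both change."""
    db.password = new_password
    secret_store["postgres-password"] = new_password


secret_store = {"postgres-password": "old-password"}
db = FakeDatabase(password="old-password")
server = FakeServer(secret_store)

server.query(db)  # succeeds before the rotation
rotate(db, secret_store, "new-password")

try:
    server.query(db)  # the running server still holds the old password
except AuthenticationFailure as error:
    print(error)

# A server started after the rotation reads the new secret and connects.
restarted = FakeServer(secret_store)
print(restarted.query(db))
```

In this model the only remedies are restarting the server after each rotation or not rotating at all, which matches the trade-off discussed in this thread.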

SanjayVas commented 2 months ago

Reopening this as it still happens.

roaminggypsy commented 2 months ago

Bumping this, as #1785 is a pending issue from last week's rotation. Although #1785 is closed, it was closed as a duplicate of this issue.

renjiezh commented 1 month ago

Manually disabled the Postgres secret rotation in AWS Secrets Manager. Will keep monitoring to see whether this issue is gone.

renjiezh commented 1 month ago

Manually triggering the rotation in AWS Secrets Manager reproduced the error. Thus disabling the rotation will solve the issue. If the issue comes back, it could be caused by a re-deployment of RDS Postgres.

SanjayVas commented 1 month ago

I don't see any PR that updates the Terraform config here. Therefore, this issue cannot yet be marked closed. All AWS infra must be managed via Terraform.