pganalyze / collector

pganalyze statistics collector for gathering PostgreSQL metrics and log data
https://pganalyze.com
Other
337 stars 62 forks source link

Unable to DB_USE_IAM_AUTH with aws RDS instance #378

Closed carl-reverb closed 1 year ago

carl-reverb commented 1 year ago

Hi, I've attempted to use the new IAM auth with our RDS instance but it did not work. Deployed to kubernetes via custom helm chart.

The instance in question has IAM auth enabled, and I granted the rds_iam role to the pganalyze user, however the server logs:

redacted:pganalyze@redacted:[13984]:LOG:  pam_authenticate failed: Permission denied
redacted:pganalyze@redacted:[13984]:FATAL:  PAM authentication failed for user "pganalyze"
redacted:pganalyze@redacted:[13984]:DETAIL:  Connection matched pg_hba.conf line 13: "hostssl    all             +rds_iam             all            pam"

I checked that my superuser could log in (via helper script):

~ $ samiamdb redacted redacted pganalyze redacted
psql (14.7 (Homebrew), server 12.8)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

redacted=> 

So I checked IAM for the right policy...

{
    "Statement": [
        {
            "Action": [
                "rds:DescribeDBInstances",
                "rds:DescribeDBClusters",
                "cloudwatch:GetMetricStatistics"
            ],
            "Effect": "Allow",
            "Resource": "*",
            "Sid": ""
        },
        {
            "Action": "logs:GetLogEvents",
            "Effect": "Allow",
            "Resource": "arn:aws:logs:*:*:log-group:RDSOSMetrics:log-stream:*",
            "Sid": ""
        },
        {
            "Action": "rds:DescribeDBParameters",
            "Effect": "Allow",
            "Resource": "arn:aws:rds:*:*:pg:*",
            "Sid": ""
        },
        {
            "Action": [
                "rds:DownloadDBLogFilePortion",
                "rds:DescribeDBLogFiles"
            ],
            "Effect": "Allow",
            "Resource": "arn:aws:rds:*:*:db:*",
            "Sid": ""
        },
        {
            "Action": "rds-db:connect",
            "Effect": "Allow",
            "Resource": "arn:aws:rds-db:us-east-1:REDACTED:dbuser:REDACTED/pganalyze",
            "Sid": ""
        }
    ],
    "Version": "2012-10-17"
}

All seems well here, and pganalyze does seem able to assume this role via the trust relationship and IRSA; verified because when I first deployed there was a bug that resulted in errors from pganalyze about being unable to assume the role. Corrected and resolved.

Next I verify that the correct image is deployed; one which should include the IAM functionality:

<snip>
    envFrom:                                                                                                                                                                                 
    - secretRef:                                                                                                                                                                             
        name: core-pganalyze-collector                                                                                                                                                       
        optional: false                                                                                                                                                                      
    image: quay.io/pganalyze/collector:v0.48.0                                                                                                                                               
<snip>

Within that secret, DB_USE_IAM_AUTH is set to "true". The source code and #368 commentary suggest this should work; and I get no errors about obtaining the token, however the IAM console indicates that "Amazon RDS IAM Authentication" was "Not accessed in the tracking period." whereas other API calls were.

I decided to instead switch back to password auth; revoked the rds_iam role from pganalyze, and supplied DB_PASSWORD in my secret. Thereafter:

I [default] Submitted compact activity snapshot successfully                                                                                                                                 
I [default] Submitted compact activity snapshot successfully                                                                                                                                 
I [default] Submitted compact logs snapshot successfully  

I'm out of ideas here, would really like to be able to switch over to IAM auth.

lfittl commented 1 year ago

@carl-reverb Thanks for reaching out on this!

It might be easier to discuss this via a support ticket (since it contains a lot of specifics about your environment, and we may ask you to run some more debug steps, etc). Could you open one through the in-app UI?

I'd be happy to update this GitHub issue with a description of the issue/solution, once we've found it.

matthieudiehr commented 1 year ago

Same issue here, did you manage to find a solution?

carl-reverb commented 1 year ago

I built a custom helm chart to deploy this and get it functioning. It's private though, but I can share the templates. It's based on the default helm chart template.

secret.yaml

apiVersion: v1
kind: Secret
metadata:
  name: {{ include "pganalyze-collector.fullname" . }}
type: Opaque
data:
  PGA_API_KEY: {{ .Values.pgAnalyze.apiKey | b64enc }}
  DB_HOST: {{ .Values.pgAnalyze.dbHost | b64enc }}
  DB_NAME: {{ .Values.pgAnalyze.dbName | b64enc }}
  DB_USERNAME: {{ .Values.pgAnalyze.dbUsername | b64enc }}
  {{- if .Values.pgAnalyze.dbUseIamAuth }}
  DB_USE_IAM_AUTH: {{ print "true" | b64enc }}
  {{- else }}
  DB_PASSWORD: {{ .Values.pgAnalyze.dbPassword | default "none" | b64enc }}
  {{- end }}

deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "pganalyze-collector.fullname" . }}
  labels:
    {{- include "pganalyze-collector.labels" . | nindent 4 }}
spec:
  replicas: 1
  selector:
    matchLabels:
      {{- include "pganalyze-collector.selectorLabels" . | nindent 6 }}
  template:
    metadata:
      {{- with .Values.podAnnotations }}
      annotations:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      labels:
        {{- include "pganalyze-collector.labels" . | nindent 8 }}
    spec:
      {{- with .Values.imagePullSecrets }}
      imagePullSecrets:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      serviceAccountName: {{ include "pganalyze-collector.serviceAccountName" . }}
      securityContext:
        {{- toYaml .Values.podSecurityContext | nindent 8 }}
      containers:
        - name: {{ .Chart.Name }}
          securityContext:
            {{- toYaml .Values.securityContext | nindent 12 }}
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          envFrom:
          - secretRef:
              name: {{ include "pganalyze-collector.fullname" . }}
              optional: false

          resources:
            {{- toYaml .Values.resources | nindent 12 }}
      {{- with .Values.nodeSelector }}
      nodeSelector:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.affinity }}
      affinity:
        {{- toYaml . | nindent 8 }}
      {{- end }}
      {{- with .Values.tolerations }}
      tolerations:
        {{- toYaml . | nindent 8 }}
      {{- end }}

And our values.yaml file sets the role ARN annotation for IRSA

serviceAccount:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::XXXXXXX:role/XXXXXXXX
  name: production-pganalyze

I can't recall if there was anything special besides setting up OIDC/IRSA and the role trust properly.

matthieudiehr commented 1 year ago

Thanks @carl-reverb 🤩 It helped!

I found that I needed to have DB_PORT explicitly set in the collector environment in order to have the IAM auth to work.