@bwplotka can you paste the full set of command line parameters you used for the oauth-proxy instance?
- '-provider=openshift'
- '-https-address=:8443'
- '-http-address='
- '-email-domain=*'
- '-upstream=http://localhost:8080'
- '-openshift-service-account=prometheus-telemeter'
- >-
  -openshift-sar={"resource": "namespaces", "verb": "get", "name":
  "telemeter-stage", "namespace": "telemeter-stage"}
- >-
  -openshift-delegate-urls={"/": {"resource": "namespaces", "verb":
  "get", "name": "telemeter-stage", "namespace": "telemeter-stage"}}
- '-tls-cert=/etc/tls/private/tls.crt'
- '-tls-key=/etc/tls/private/tls.key'
- >-
  -client-secret-file=/var/run/secrets/kubernetes.io/serviceaccount/token
- '-cookie-secret-file=/etc/proxy/secrets/session_secret'
- '-openshift-ca=/etc/pki/tls/cert.pem'
- '-openshift-ca=/var/run/secrets/kubernetes.io/serviceaccount/ca.crt'
serviceAccount: prometheus-telemeter
As part of https://github.com/openshift/oauth-proxy/pull/220 (commit https://github.com/openshift/oauth-proxy/pull/220/commits/c1ac8c085d50ee2cce7b95a3de1c3e35a347ba2d), oauth-proxy needs to read configmaps from the openshift-config-managed namespace. The service account that runs oauth-proxy must have permission to read configmaps from it.
We reverted for now.
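For anyone hitting this before the docs are updated, here is a minimal sketch of RBAC that should grant that permission. The Role/RoleBinding name and the telemeter-stage namespace for the service account are assumptions based on the args above; adjust them to your deployment:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  # Hypothetical name; any name works.
  name: oauth-proxy-configmap-reader
  namespace: openshift-config-managed
rules:
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: oauth-proxy-configmap-reader
  namespace: openshift-config-managed
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: oauth-proxy-configmap-reader
subjects:
  - kind: ServiceAccount
    name: prometheus-telemeter
    # Assumed namespace, taken from the -openshift-sar args above.
    namespace: telemeter-stage

A namespaced Role scoped to openshift-config-managed is slightly tighter than a ClusterRole, since it grants configmap reads in that one namespace only.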
I think we at least need to update the docs about what permissions are now required to run oauth-proxy. I assume that by reverting we essentially lose a bit of security? Is that what #220 is about?
The incident started exactly when the cluster moved to a newer OpenShift version: "Heads up @app-sre-stage-01-cluster! cluster app-sre-stage-01 is currently being upgraded to version 4.8.10"
This happened again in production. I don't think anyone has those permissions: https://issues.redhat.com/browse/MON-1888
We can confirm it's flaky: sometimes 4.9 works, sometimes it doesn't. The permissions are definitely missing, so maybe the proxy only fetches this configmap from the namespace after some time?
We too are seeing reports of this problem (https://issues.redhat.com/browse/ENTMQMAAS-2766). Is the recommended resolution documented anywhere?
@k-wall as mentioned in https://github.com/openshift/oauth-proxy/issues/229#issuecomment-913707942:
The service account that runs oauth-proxy must have permission to read configmaps from the openshift-config-managed namespace.
I created a ClusterRole to read configmaps and a RoleBinding in the openshift-config-managed namespace for, in my case, grafana-operator. The next error is then:
2021/12/15 15:43:20 oauthproxy.go:445: ErrorPage 500 Internal Error configmap "oauth-serving-cert" not found
I'm assuming this is related to changes in OpenShift 4.9 (https://docs.openshift.com/container-platform/4.9/authentication/configuring-internal-oauth.html), which references the oauth-serving-cert. Reverting to openshift/origin-oauth-proxy:4.8 resolved that issue. For reference, the ClusterRole and RoleBinding I created:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: grafana-proxy-configmap
rules:
  - apiGroups:
      - ""
    resources:
      - configmaps
    verbs:
      - get
      - list
      - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: grafana-proxy-configmap
  namespace: openshift-config-managed
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: grafana-proxy-configmap
subjects:
  - kind: ServiceAccount
    name: grafana-serviceaccount
    namespace: grafana-operator
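A quick way to verify both pieces from the command line, assuming cluster access with oc and substituting your own names (shown here with the grafana service account from the snippet above):

# Can the service account read configmaps in openshift-config-managed?
oc auth can-i get configmaps -n openshift-config-managed \
  --as=system:serviceaccount:grafana-operator:grafana-serviceaccount

# Does the oauth-serving-cert configmap the proxy is looking for exist?
oc get configmap oauth-serving-cert -n openshift-config-managed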
We have a staging incident (see https://issues.redhat.com/browse/MON-1861) that makes oauth-proxy fail requests with an error, even though nothing changed other than the Kubernetes upgrade. Any pointers from the oauth-proxy community?
oauth-proxy version:
image: 'quay.io/openshift/origin-oauth-proxy:4.9.0'
K8s version:
Error: