nerc-project / operations

Issues related to the operation of the NERC OpenShift environment

Infra cluster externalsecrets are failing to sync #633

Open dystewart opened 2 days ago

dystewart commented 2 days ago

All externalsecrets in the infra cluster have error status: SecretSyncedError
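
To see exactly which resources are affected, something like the following may help. It is a minimal sketch, not a command that was run in this issue: the helper name `failing_externalsecrets` is hypothetical, and it assumes the `oc get externalsecret --all-namespaces` output puts the namespace and name in the first two columns and shows the status reason in the row.

```shell
# Hypothetical helper: print "<namespace>/<name>" for every ExternalSecret row
# that reports SecretSyncedError.
# Reads `oc get externalsecret --all-namespaces` output on stdin.
# Assumption: column 1 is the namespace and column 2 is the resource name.
failing_externalsecrets() {
  awk '/SecretSyncedError/ {print $1 "/" $2}'
}

# Usage (requires cluster access):
#   oc get externalsecret --all-namespaces | failing_externalsecrets
```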

dystewart commented 2 days ago

Checked on the secretstore resource:

$ oc get secretstore --all-namespaces
No resources found

dystewart commented 2 days ago

Looking at the clustersecretstore:

$ oc get clustersecretstore
NAME                   AGE    STATUS                  CAPABILITIES   READY
nerc-cluster-secrets   525d   InvalidProviderConfig   ReadWrite      False

dystewart commented 2 days ago

$ oc describe clustersecretstore nerc-cluster-secrets
...
Status:
  Capabilities:  ReadWrite
  Conditions:
    Last Transition Time:  2024-06-14T17:41:50Z
    Message:               unable to create client
    Reason:                InvalidProviderConfig
    Status:                False
    Type:                  Ready
Events:
  Type     Reason                 Age                  From                  Message
  ----     ------                 ----                 ----                  -------
  Warning  InvalidProviderConfig  17m (x994 over 11d)  cluster-secret-store  unable to log in to auth method: unable to log in with Kubernetes auth: Error making API request.

URL: PUT https://vault.vault.svc.cluster.local:8200/v1/auth/kubernetes/nerc-ocp-infra/login
Code: 403. Errors:

* permission denied
  Warning  InvalidProviderConfig  80s  cluster-secret-store  unable to log in to auth method: unable to log in with Kubernetes auth: read tcp 10.131.0.23:48858->172.30.157.162:8200: read: connection reset by peer
  Warning  InvalidProviderConfig  50s  cluster-secret-store  unable to log in to auth method: unable to log in with Kubernetes auth: read tcp 10.131.0.23:43228->172.30.157.162:8200: read: connection reset by peer

dystewart commented 2 days ago

Here are a couple of lines from the vault-0 pod logs:

2024-06-29T02:06:36.850Z [INFO] expiration: revoked lease: lease_id=auth/kubernetes/nerc-ocp-obs/login/h00849f8fd2d6d0eda3183a0c6393364ecd6f76e9cad0f85775647000f74f454b
2024-06-29T02:06:37.383Z [INFO] expiration: revoked lease: lease_id=auth/kubernetes/nerc-ocp-prod/login/hbec4d90d03e01457856f3d342649fb814a01cb3554df3fb52793a302a0d53346
2024-06-29T02:06:42.214Z [INFO] expiration: revoked lease: lease_id=auth/kubernetes/nerc-ocp-test/login/hf30e7433e1521929e2adfd41aaea636a564bb4fe068c14b03c0e68d37de34b3d

There are thousands of these INFO log lines for test, obs, and prod.
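
To put numbers on that, the revoked-lease lines can be tallied per auth mount. This is a rough sketch under a few assumptions: the helper name `count_revoked_by_mount` is hypothetical, the log lines follow the `lease_id=auth/kubernetes/<mount>/login/...` shape shown above, and the `vault` namespace in the usage line is inferred from the service hostname rather than confirmed.

```shell
# Hypothetical helper: count "revoked lease" log lines per Kubernetes auth mount.
# Reads Vault log text on stdin and prints one "<count> <mount>" line per mount.
count_revoked_by_mount() {
  grep 'expiration: revoked lease' \
    | sed -n 's|.*lease_id=auth/kubernetes/\([^/]*\)/login/.*|\1|p' \
    | sort | uniq -c | awk '{print $1, $2}'
}

# Usage (requires cluster access; the namespace is an assumption):
#   oc logs vault-0 -n vault | count_revoked_by_mount
```

If nerc-ocp-infra never shows up in the tally, that would fit with its logins being rejected before any lease is issued, though the count alone does not prove it.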