OpenUnison / openunison-k8s

Access portal for Kubernetes
Apache License 2.0

Required secret for oidc-proxy-orchestra found missing and responsible openunison-operator is not creating it. #61

Closed: shnigam2 closed this issue 1 year ago

shnigam2 commented 1 year ago

Please let us know what the default behaviour is, how the secret went missing, and how it gets created by the openunison-operator.

mlbiam commented 1 year ago

If the operator isn't creating the secret, please provide the logs from the operator.

shnigam2 commented 1 year ago

Hi Marc,

Thanks for replying. Please find the logs from the openunison-operator pod:

k logs openunison-operator-cfd9f7847-k6mb7 -c openunison-operator -n openunison
Using version 'openunison.tremolo.io/v6'
https://xxx.xx.x.x:443/apis/openunison.tremolo.io/v6/namespaces/openunison/openunisons?watch=true&timeoutSeconds=30&resourceVersion=38933394
Watch failed : {"type":"ERROR","object":{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"too old resource version: 38933394 (41061566)","reason":"Expired","code":410}}
https://xxx.xx.x.x:443/apis/openunison.tremolo.io/v6/namespaces/openunison/openunisons?watch=true&timeoutSeconds=30
Resource 38933394 has already been processed, skipping
https://xxx.xx.x.x:443/apis/openunison.tremolo.io/v6/namespaces/openunison/openunisons?watch=true&timeoutSeconds=30
Resource 38933394 has already been processed, skipping
https://xxx.xx.x.x:443/apis/openunison.tremolo.io/v6/namespaces/openunison/openunisons?watch=true&timeoutSeconds=30
Resource 38933394 has already been processed, skipping

mlbiam commented 1 year ago

Odd. What distro of Kubernetes are you using? Try this:

  1. Delete the openunison-operator pod
  2. Add an annotation to the orchestra openunison object in the openunison Namespace.

This will generate fresh deployment logs and should tell us why the operator isn't generating the correct secrets.
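A sketch of those two steps with kubectl; the pod name is the one from the logs above, and the annotation key/value are arbitrary placeholders (any annotation works):

```shell
# 1. Delete the operator pod so its Deployment recreates it
#    (use the actual pod name from `kubectl get pods -n openunison`)
kubectl delete pod openunison-operator-cfd9f7847-k6mb7 -n openunison

# 2. Add (or refresh) an annotation on the orchestra OpenUnison object
#    to trigger a fresh reconcile; the key and value here are hypothetical
kubectl annotate openunison orchestra -n openunison \
  tremolo.io/redeploy="$(date +%s)" --overwrite
```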

shnigam2 commented 1 year ago

Hi Marc,

We have EKS and kubeadm-based clusters running on EC2. Which annotation do you want me to add to the openUnison orchestra object?

Currently we only have the last-applied-configuration annotation:

Name:         orchestra
Namespace:    openunison
Labels:       argocd.argoproj.io/instance=orchestra
Annotations:
API Version:  openunison.tremolo.io/v6
Kind:         OpenUnison
Metadata:
  Creation Timestamp:  2022-12-06T06:11:53Z
  Generation:          1
  Managed Fields:
    API Version:  openunison.tremolo.io/v5
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
        f:labels:
          .:

mlbiam commented 1 year ago

You can add any annotation. It's just to trigger the operator to run again.

shnigam2 commented 1 year ago

Hi Marc,

I followed the steps below and the secret was created:

  1. Deleted the operator pod
  2. Edited the openUnison orchestra object by adding an annotation

k get secret -n openunison | grep -i unison-tls
k edit openUnison orchestra -n openunison
openunison.openunison.tremolo.io/orchestra edited
k get secret -n openunison | grep -i unison-tls
k get secret -n openunison | grep -i unison-tls
unison-tls   kubernetes.io/tls   2   0s

Some important picks from the operator pod logs are below:

—————————
Loading Script : '/usr/local/openunison/js/helpers.js'
Loading Script : '/usr/local/openunison/js/deploy-openshift.js'
Loading Script : '/usr/local/openunison/js/deploy-objs.js'
Loading Script : '/usr/local/openunison/js/operator.js'
Loading Script : '/usr/local/openunison/js/globals.js'
Loading Script : '/usr/local/openunison/js/deploy-upstream-k8s.js'
Invoking javascript
——————————
Processing key 'unison-tls'
Checking if kubernetes secret exists
Creating keypair
Creating secret
Posting secret
Storing to keystore
Key 'unison-tls' finished 0 1

Processing key 'kubernetes-dashboard'
Checking if kubernetes secret exists
Secret exists
Adding existing secret to keystore
Storing just the certificate
3 1 2

Processing key 'unison-saml2-rp-sig'
Checking if kubernetes secret exists
Secret exists
Adding existing secret to keystore
Storing to keystore
2
——
Problem calling '/api/v1/namespaces/openunison/secrets/amq-secrets-orchestra' - 404
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"secrets \"amq-secrets-orchestra\" not found","reason":"NotFound","details":{"name":"amq-secrets-orchestra","kind":"secrets"},"code":404}
Obj '/api/v1/namespaces/openunison/secrets/amq-secrets-orchestra' doesn't exist, skipping

Problem calling '/api/v1/namespaces/openunison/secrets/amq-env-secrets-orchestra' - 404
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"secrets \"amq-env-secrets-orchestra\" not found","reason":"NotFound","details":{"name":"amq-env-secrets-orchestra","kind":"secrets"},"code":404}
Obj '/api/v1/namespaces/openunison/secrets/amq-env-secrets-orchestra' doesn't exist, skipping

Problem calling '/api/v1/namespaces/openunison/services/amq' - 404
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"services \"amq\" not found","reason":"NotFound","details":{"name":"amq","kind":"services"},"code":404}
Obj '/api/v1/namespaces/openunison/services/amq' doesn't exist, skipping

Problem calling '/apis/apps/v1/namespaces/openunison/deployments/amq-orchestra' - 404
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"deployments.apps \"amq-orchestra\" not found","reason":"NotFound","details":{"name":"amq-orchestra","group":"apps","kind":"deployments"},"code":404}
Obj '/apis/apps/v1/namespaces/openunison/deployments/amq-orchestra' doesn't exist, skipping

Problem calling '/apis/networking.k8s.io/v1/namespaces/openunison/ingresses/openunison-orchestra' - 404
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"ingresses.networking.k8s.io \"openunison-orchestra\" not found","reason":"NotFound","details":{"name":"openunison-orchestra","group":"networking.k8s.io","kind":"ingresses"},"code":404}

unknown ingress type
Not patching the job

mlbiam commented 1 year ago

Can you post your values.yaml? It looks like provisioning is enabled?

Also, can you run helm list -n openunison?

shnigam2 commented 1 year ago

Hi Marc,

Please find the values.yaml content:

# If set to true the CRDs will be deployed. Otherwise the CRs are ignored.
crd:
  deploy: true
  betas: false
  webhooks: true

services:
  node_selectors: []
  pullSecret: ""

image: docker.io/tremolosecurity/openunison-k8s-operator:latest

As for helm list -n openunison, since we are managing it through ArgoCD we cannot see any output for that command:

helm list -n openunison
NAME  NAMESPACE  REVISION  UPDATED  STATUS  CHART  APP VERSION

shnigam2 commented 1 year ago

Hi Marc,

Can we have an update on this? We found that unison-tls is missing and is not being recreated on its own by the operator. Can you explain the flow for creating unison-tls when it is missing, and which configuration is responsible for recreating it? Is there any scenario in which unison-tls gets deleted? What is the role of the CronJobs in the openunison namespace? Also, please let us know if we can connect in some way to troubleshoot this.

Regards,
Shobhit

mlbiam commented 1 year ago

Can we have an update on this..?

I don't see your values.yaml posted, only a small snippet. Please post the entire thing.

We found unison-tls is missing and not creating on its own through operator. Can we know the flow of unison-tls creation if it is missing and what configuration is reponsible for recreating it.

I need to see your values.yaml before I can debug.

we are managing it through argocd we cannot see any output for that command

Understood. We do have a chart specific to ArgoCD that combines the charts and relies on sync waves to deploy manifests in the correct order. It's not officially "released", but I have it running at a couple of customers. I'm guessing you have multiple Application objects set up? If so, disable or delete them and instead create a single Application object with this one (updating it with your own values.yaml):

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: openunison
  namespace: argocd
spec:
  project: default
  source:
    repoURL: 'https://nexus.tremolo.io/repository/helm'
    targetRevision: 2.3.14
    helm:
      values: |-
        network:
          openunison_host: "k8s-ou.domain.dev"
          dashboard_host: "k8s-db.domain.dev"
          api_server_host: "k8s-api.domain.dev"
          session_inactivity_timeout_seconds: 900
          k8s_url: ""
          force_redirect_to_tls: false
          createIngressCertificate: false
          ingress_type: nginx
          ingress_annotations:
            cert-manager.io/cluster-issuer: ca-issuer
            kubernetes.io/ingress.class: nginx-internet

        cert_template:
          ou: "Kubernetes"
          o: "MyOrg"
          l: "My Cluster"
          st: "State of Cluster"
          c: "MyCountry"

        image: docker.io/tremolosecurity/openunison-k8s
        myvd_config_path: "WEB-INF/myvd.conf"
        k8s_cluster_name: some-cluster
        enable_impersonation: true

        impersonation:
          use_jetstack: true
          jetstack_oidc_proxy_image: docker.io/tremolosecurity/kube-oidc-proxy:latest
          explicit_certificate_trust: false

        dashboard:
          namespace: "kubernetes-dashboard"
          cert_name: "kubernetes-dashboard-certs"
          label: "k8s-app=kubernetes-dashboard"
          service_name: kubernetes-dashboard
          require_session: true
          enabled: true

        certs:
          use_k8s_cm: false

        trusted_certs: []

        monitoring:
          prometheus_service_account: system:serviceaccount:monitoring:prometheus-k8s

        oidc:
          client_id: my-client-id
          issuer: https://issuer.domain.dev
          user_in_idtoken: false
          domain: ""
          scopes: openid email profile groups
          claims:
            sub: sub
            email: email
            given_name: given_name
            family_name: family_name
            display_name: name
            groups: groups

        network_policies:
          enabled: false
          ingress:
            enabled: true
            labels:
              app.kubernetes.io/name: ingress-nginx
          monitoring:
            enabled: true
            labels:
              app.kubernetes.io/name: monitoring
          apiserver:
            enabled: false
            labels:
              app.kubernetes.io/name: kube-system

        services:
          enable_tokenrequest: false
          token_request_audience: api
          token_request_expiration_seconds: 600
          node_selectors: []

        openunison:
          replicas: 1
          non_secret_data:
            K8S_DB_SSO: saml2
            PROMETHEUS_SERVICE_ACCOUNT: system:serviceaccount:monitoring:prometheus-k8s
            SHOW_PORTAL_ORGS: "false"
            #openunison.static-secret.skip_write: "true"
            #openunison.static-secret.suffix: "-sync"

          secrets: []
          html:
            image: docker.io/tremolosecurity/openunison-k8s-html
          enable_provisioning: false
    chart: orchestra-login-portal-argocd
  destination:
    server: 'https://kubernetes.default.svc'
    namespace: openunison

And is there any scenario when this unison-tls get deleted.

If the orchestra openunison object is deleted, the operator will delete all the objects it created, including the unison-tls Secret.

And what is the role of cronjobs which is a part of openunison namespace.

The CronJob is responsible for regenerating certificates generated by the operator.
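If you want to see how close an operator-generated certificate is to expiring, a sketch like this may help; it assumes the unison-tls Secret follows the standard kubernetes.io/tls layout with a tls.crt key (as the `k get secret` output earlier suggests):

```shell
# Decode the certificate from the unison-tls Secret and print its expiry date
kubectl get secret unison-tls -n openunison -o jsonpath='{.data.tls\.crt}' \
  | base64 -d \
  | openssl x509 -noout -enddate
```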

And also please let us know if we connect by any way to troubleshoot this.

We reserve direct support for our commercial customers. We require our open source customers to request support through GitHub issues so that way if another user runs into the same problem they'll be better able to find a solution. We don't have any "open core" or "enterprise" functionality that requires a commercial support contract, so our business is built on providing direct support, SLAs, etc. If you're interested, please reach out to us at https://www.tremolosecurity.com/contact/contact-us

Nikhil-Pallavali commented 1 year ago

Hi @mlbiam

We are using multiple Application objects for openunison and orchestra. Below are the ArgoCD objects with their helm values.

Openunison

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: '-2'
  name: openunison
  namespace: argocd
spec:
  destination:
    namespace: openunison
    server: 'https://xxxxxxxx'
  project: cnt
  source:
    chart: openunison-operator
    helm:
      releaseName: openunison
      values: |-
        {
          "image": "xxxxxxxx/openunison-k8s-operator:xxxxxxxx",
          "services": {
            "pullSecret": "jfrog-auth"
          }
        }
    repoURL: 'https://nexus.tremolo.io/repository/helm/'
    targetRevision: 2.0.6
  syncPolicy:
    automated:
      prune: true
    syncOptions:
      - ApplyOutOfSyncOnly=true

Orchestra

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: orchestra
  namespace: argocd
spec:
  destination:
    namespace: openunison
    server: 'https://xxxxxxxx'
  project: cnt
  source:
    chart: openunison-k8s-login-oidc
    helm:
      releaseName: orchestra
      values: |-
        {
          "cert_template": {
            "c": "xxxxxxxx",
            "l": "xxxxxxxx",
            "o": "dev",
            "ou": "xxxxxxxx",
            "st": "xxxxxxxx"
          },
          "deployment_data": {
            "pull_secret": "jfrog-auth"
          },
          "enable_impersonation": true,
          "image": "xxxxxxxx/openunison-k8s-login-oidc:xxxxxxxx",
          "impersonation": {
            "ca_secret_name": "xxxxxxxx",
            "explicit_certificate_trust": true,
            "jetstack_oidc_proxy_image": "xxxxxxxx/kube-oidc-proxy:xxxxxxxx",
            "oidc_tls_secret_name": "tls-certificate",
            "use_jetstack": true
          },
          "k8s_cluster_name": "xxxxxxxx",
          "myvd_configmap": "",
          "network": {
            "api_server_host": "dev-ou-api.com",
            "createIngressCertificate": false,
            "dashboard_host": "dev-dashboard.com",
            "ingress_annotations": {
              "certmanager.k8s.io/cluster-issuer": "letsencrypt",
              "kubernetes.io/ingress.class": "openunison"
            },
            "ingress_certificate": "",
            "ingress_type": "none",
            "k8s_url": "",
            "openunison_host": "dev-login.com",
            "session_inactivity_timeout_seconds": xxxxxxxx
          },
          "oidc": {
            "auth_url": "https://xxxxxxxx",
            "client_id": "xxxxxxxx",
            "token_url": "https://xxxxxxxx",
            "user_in_idtoken": xxxxxxxx,
            "userinfo_url": "https://xxxxxxxx"
          },
          "openunison": {
            "replicas": 2
          },
          "services": {
            "pullSecret": "jfrog-auth",
            "resources": {
              "limits": {
                "cpu": "500m",
                "memory": "2048Mi"
              },
              "requests": {
                "cpu": "200m",
                "memory": "1024Mi"
              }
            },
            "token_request_expiration_seconds": xxxxxxxx
          },
          "trusted_certs": [
            {
              "name": "xxxxxxxx",
              "pem_b64": "xxxxxxxx"
            }
          ]
        }
    repoURL: 'https://nexus.tremolo.io/repository/helm/'
    targetRevision: 1.0.24
  syncPolicy:
    automated:
      prune: true
    syncOptions:
      - ApplyOutOfSyncOnly=true

Could you please take a look at this and debug further?

mlbiam commented 1 year ago

First, your revisions are pretty old. I would update them to the latest to make sure everything is working off of the same charts. Second, you can set impersonation.explicit_certificate_trust to false since you're using Let's Encrypt as your issuer (I don't think this is your problem, but you mentioned oidc-proxy not starting).

Once you update and get the latest containers, re-sync your operator Application, then your orchestra Application (I would also update the Application that points to the orchestra-login-portal deployment too).
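If you drive syncs from the Argo CD CLI rather than the UI, the re-sync order above might look like this (the Application names are taken from the manifests in this thread):

```shell
# Sync the operator Application first, then the orchestra Application
argocd app sync openunison
argocd app sync orchestra
```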

mlbiam commented 1 year ago

Reading through this, it looks like you're mixing charts built for openunison-k8s-login-oidc (which is going to be end-of-life at the end of the month) with the openunison-k8s container, which won't work. The older charts assumed the openunison configuration was in the container, whereas the openunison-k8s container assumes everything is loaded dynamically via CRs. That's probably why you're having so many issues.

That said, I'd delete both the openunison and openunison-orchestra Application objects so you can start fresh (since OpenUnison doesn't store any data or state, this is safe). Then:

  1. Create the openunison namespace
  2. Create the orchestra-secrets-source Secret with K8S_DB_SECRET, unisonKeystorePassword, and OIDC_CLIENT_SECRET
  3. Update the below Application object appropriately (I integrated your values as best I could, but you should be able to take it from here), deploy and sync
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: openunison
  namespace: argocd
spec:
  project: default
  ignoreDifferences:
  - group: "admissionregistration.k8s.io"
    kind: "ValidatingWebhookConfiguration"
    jsonPointers:
    - /webhooks/0/clientConfig/caBundle
    - /webhooks/1/clientConfig/caBundle
    - /webhooks/2/clientConfig/caBundle
    - /webhooks/3/clientConfig/caBundle
    - /webhooks/4/clientConfig/caBundle
  syncPolicy:
    syncOptions:
    - RespectIgnoreDifferences=true
  source:
    repoURL: 'https://nexus.tremolo.io/repository/helm-betas'
    targetRevision: 2.3.15
    helm:
      values: |-
        {
          "cert_template": {
            "c": "xxxxxxxx",
            "l": "xxxxxxxx",
            "o": "dev",
            "ou": "xxxxxxxx",
            "st": "xxxxxxxx"
          },
          "enable_impersonation": true,
          "image": "xxxxxxxx/openunison-k8s:xxxxxxxx",
          "impersonation": {
            "ca_secret_name": "xxxxxxxx",
            "explicit_certificate_trust": true,
            "jetstack_oidc_proxy_image": "xxxxxxxx/kube-oidc-proxy:xxxxxxxx",
            "oidc_tls_secret_name": "tls-certificate",
            "use_jetstack": true
          },
          "k8s_cluster_name": "xxxxxxxx",
          "myvd_configmap": "",
          "network": {
            "api_server_host": "dev-ou-api.com",
            "createIngressCertificate": false,
            "dashboard_host": "dev-dashboard.com",
            "ingress_annotations": {
              "certmanager.k8s.io/cluster-issuer": "letsencrypt",
              "kubernetes.io/ingress.class": "openunison"
            },
            "ingress_certificate": "",
            "ingress_type": "none",
            "k8s_url": "",
            "openunison_host": "dev-login.com",
            "session_inactivity_timeout_seconds": xxxxxxxx
          },
          "oidc": {
            "auth_url": "https://xxxxxxxx",
            "client_id": "xxxxxxxx",
            "token_url": "https://xxxxxxxx",
            "user_in_idtoken": xxxxxxxx,
            "userinfo_url": "https://xxxxxxxx"
          },
          "openunison": {
            "replicas": 2
          },
          "services": {
            "pullSecret": "jfrog-auth",
            "resources": {
              "limits": {
                "cpu": "500m",
                "memory": "2048Mi"
              },
              "requests": {
                "cpu": "200m",
                "memory": "1024Mi"
              }
            },
            "token_request_expiration_seconds": xxxxxxxx
          },
          "trusted_certs": [
            {
              "name": "xxxxxxxx",
              "pem_b64": "xxxxxxxx"
            }
          ],
          "operator": {
            "image":"xxxxxxxx/openunison-k8s-operator:xxxxxxxx"
          }
        }
    chart: orchestra-login-portal-argocd
  destination:
    server: 'https://kubernetes.default.svc'
    namespace: openunison

The only change from what you provided above was:

  1. added the operator section of the values to tell the charts where to pull the operator image from
  2. changed image to point to the openunison-k8s image

It looks like you're importing images before running them. Here's the link that describes which containers need to be pulled in - https://openunison.github.io/knowledgebase/airgap/ (you can ignore the AMQ images)

Once you sync the openunison Application you should have a running OpenUnison
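Step 2 above could be sketched like this; the literal values are placeholders you must replace (the key names match the secret_data list in the OpenUnison CR):

```shell
# 1. Create the openunison namespace
kubectl create namespace openunison

# 2. Create the source Secret that orchestra reads its secret data from;
#    the random values here are illustrative, and OIDC_CLIENT_SECRET must
#    come from your identity provider
kubectl create secret generic orchestra-secrets-source -n openunison \
  --from-literal=K8S_DB_SECRET="$(openssl rand -base64 32)" \
  --from-literal=unisonKeystorePassword="$(openssl rand -base64 32)" \
  --from-literal=OIDC_CLIENT_SECRET='replace-with-your-idp-client-secret'
```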

shnigam2 commented 1 year ago

Hi Marc,

Thanks for the update. Could you also check our CR for the openUnison object? In the key_pair for unison-tls we are not using tls_secret_name. We'd also like to know whether there is any scenario in which the openunison orchestra object was not deleted but only the unison-tls secret went missing. Please have a look at the OpenUnison object CR yaml:

apiVersion: openunison.tremolo.io/v6
kind: OpenUnison
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"openunison.tremolo.io/v5","kind":"OpenUnison","metadata":{"annotations":{},"labels":{"argocd.argoproj.io/instance":"orchestra"},"name":"orchestra","namespace":"openunison"},"spec":{"deployment_data":{"liveness_probe_command":["/usr/local/openunison/bin/check_alive.py"],"node_selectors":[],"pull_secret":"jfrog-auth","readiness_probe_command":["/usr/local/openunison/bin/check_alive.py","https://127.0.0.1:8443/auth/idp/k8sIdp/.well-known/openid-configuration","issuer"],"resources":{"limits":{"cpu":"500m","memory":"2048Mi"},"requests":{"cpu":"200m","memory":"1024Mi"}},"tokenrequest_api":{"audience":"api","enabled":false,"expirationSeconds":14400}},"dest_secret":"orchestra","enable_activemq":false,"hosts":[{"annotations":[{"name":"certmanager.k8s.io/cluster-issuer","value":"letsencrypt"},{"name":"kubernetes.io/ingress.class","value":"openunison"}],"ingress_name":"openunison","ingress_type":"none","names":[{"env_var":"OU_HOST","name":"login-cluster-test-us-east-1-aws.cf.platform.domain.cloud"},{"env_var":"K8S_DASHBOARD_HOST","name":"dashboard-cluster-test-us-east-1-aws.cf.platform.domain.cloud"},{"env_var":"K8S_API_HOST","name":"ou-api-cluster-test-us-east-1-aws.cf.platform.domain.cloud","service_name":"kube-oidc-proxy-orchestra"}],"secret_name":"ou-tls-certificate"}],"image":"xxxxxxxxxxx/openunison-k8s-login-oidc:6e2748ab663d4dd1a2f0039278e05decf8adea5135be16cd0dabedd1946076e4","key_store":{"key_pairs":{"create_keypair_template":[{"name":"ou","value":"CLUSTER Test"},{"name":"o","value":"Test"},{"name":"l","value":"CLUSTER"},{"name":"st","value":"North 
Virginia"},{"name":"c","value":"US"}],"keys":[{"create_data":{"ca_cert":true,"key_size":2048,"server_name":"openunison-orchestra.openunison.svc","sign_by_k8s_ca":false,"subject_alternative_names":["ou-api-cluster-test-us-east-1-aws.cf.platform.domain.cloud"]},"import_into_ks":"keypair","name":"unison-tls"},{"create_data":{"ca_cert":true,"delete_pods_labels":["k8s-app=kubernetes-dashboard"],"key_size":2048,"secret_info":{"cert_name":"dashboard.crt","key_name":"dashboard.key","type_of_secret":"Opaque"},"server_name":"kubernetes-dashboard.kubernetes-dashboard.svc","sign_by_k8s_ca":false,"subject_alternative_names":[],"target_namespace":"kubernetes-dashboard"},"import_into_ks":"certificate","name":"kubernetes-dashboard","replace_if_exists":true,"tls_secret_name":"kubernetes-dashboard-certs"},{"create_data":{"ca_cert":true,"key_size":2048,"server_name":"unison-saml2-rp-sig","sign_by_k8s_ca":false,"subject_alternative_names":[]},"import_into_ks":"keypair","name":"unison-saml2-rp-sig"}]},"static_keys":[{"name":"session-unison","version":1},{"name":"lastmile-oidc","version":1}],"trusted_certificates":[],"update_controller":{"days_to_expire":10,"image":"docker.io/tremolosecurity/kubernetes-artifact-deployment:1.1.0","schedule":"0 2 * * 
*"}},"myvd_configmap":"","non_secret_data":[{"name":"K8S_URL","value":"https://ou-api-cluster-test-us-east-1-aws.cf.platform.domain.cloud"},{"name":"SESSION_INACTIVITY_TIMEOUT_SECONDS","value":"36000"},{"name":"K8S_DASHBOARD_NAMESPACE","value":"kubernetes-dashboard"},{"name":"K8S_DASHBOARD_SERVICE","value":"kubernetes-dashboard"},{"name":"K8S_CLUSTER_NAME","value":"cluster-test-us-east-1-aws.cf.platform.domain.cloud"},{"name":"K8S_IMPERSONATION","value":"true"},{"name":"PROMETHEUS_SERVICE_ACCOUNT","value":"system:serviceaccount:monitoring:prometheus-k8s"},{"name":"OIDC_CLIENT_ID","value":"0oa7xj9mjmUzwlJ7S357"},{"name":"OIDC_IDP_AUTH_URL","value":"https://xxxxxxxxxxx-compid-us.okta.com/oauth2/v1/authorize"},{"name":"OIDC_IDP_TOKEN_URL","value":"https://xxxxxxxxxxx-compid-us.okta.com/oauth2/v1/token"},{"name":"OIDC_IDP_LIMIT_DOMAIN","value":""},{"name":"SUB_CLAIM","value":"sub"},{"name":"EMAIL_CLAIM","value":"email"},{"name":"GIVEN_NAME_CLAIM","value":"given_name"},{"name":"FAMILY_NAME_CLAIM","value":"family_name"},{"name":"DISPLAY_NAME_CLAIM","value":"name"},{"name":"GROUPS_CLAIM","value":"groups"},{"name":"OIDC_USER_IN_IDTOKEN","value":"false"},{"name":"OIDC_IDP_USER_URL","value":"https://xxxxxxxxxxx-compid-us.okta.com/oauth2/v1/userinfo"},{"name":"OIDC_SCOPES","value":"openid email profile 
groups"},{"name":"OU_SVC_NAME","value":"openunison-orchestra.openunison.svc"},{"name":"K8S_TOKEN_TYPE","value":"legacy"}],"openunison_network_configuration":{"activemq_dir":"/tmp/amq","allowed_client_names":[],"ciphers":["TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384","TLS_RSA_WITH_AES_256_GCM_SHA384","TLS_ECDH_ECDSA_WITH_AES_256_GCM_SHA384","TLS_ECDH_RSA_WITH_AES_256_GCM_SHA384","TLS_DHE_RSA_WITH_AES_256_GCM_SHA384"],"client_auth":"none","force_to_secure":true,"open_external_port":80,"open_port":8080,"path_to_deployment":"/usr/local/openunison/work","path_to_env_file":"/etc/openunison/ou.env","quartz_dir":"/tmp/quartz","secure_external_port":443,"secure_key_alias":"unison-tls","secure_port":8443},"replicas":2,"secret_data":["K8S_DB_SECRET","unisonKeystorePassword","OIDC_CLIENT_SECRET"],"source_secret":"orchestra-secrets-source"}}
    test: operator1
  creationTimestamp: "2022-12-06T06:11:53Z"
  generation: 1
  labels:
    argocd.argoproj.io/instance: orchestra
  name: orchestra
  namespace: openunison
  resourceVersion: "46186801"
  uid: 7444e418-4ba2-4247-836b-67da2c194b88
spec:
  deployment_data:
    liveness_probe_command:
    - /usr/local/openunison/bin/check_alive.py
    node_selectors: []
    pull_secret: jfrog-auth
    readiness_probe_command:
    - /usr/local/openunison/bin/check_alive.py
    - https://127.0.0.1:8443/auth/idp/k8sIdp/.well-known/openid-configuration
    - issuer
    resources:
      limits:
        cpu: 500m
        memory: 2048Mi
      requests:
        cpu: 200m
        memory: 1024Mi
    tokenrequest_api:
      audience: api
      enabled: false
      expirationSeconds: 14400
  dest_secret: orchestra
  enable_activemq: false
  hosts:
  - annotations:
    - name: certmanager.k8s.io/cluster-issuer
      value: letsencrypt
    - name: kubernetes.io/ingress.class
      value: openunison
    ingress_name: openunison
    ingress_type: none
    names:
    - env_var: OU_HOST
      name: login-cluster-test-us-east-1-aws.cf.platform.domain.cloud
    - env_var: K8S_DASHBOARD_HOST
      name: dashboard-cluster-test-us-east-1-aws.cf.platform.domain.cloud
    - env_var: K8S_API_HOST
      name: ou-api-cluster-test-us-east-1-aws.cf.platform.domain.cloud
      service_name: kube-oidc-proxy-orchestra
    secret_name: ou-tls-certificate
  image: xxxxxxxxxxx/openunison-k8s-login-oidc:6e2748ab663d4dd1a2f0039278e05decf8adea5135be16cd0dabedd1946076e4
  key_store:
    key_pairs:
      create_keypair_template:
      - name: ou
        value: CLUSTER Test
      - name: o
        value: Test
      - name: l
        value: CLUSTER
      - name: st
        value: North Virginia
      - name: c
        value: US
      keys:
      - create_data:
          ca_cert: true
          key_size: 2048
          server_name: openunison-orchestra.openunison.svc
          sign_by_k8s_ca: false
          subject_alternative_names:
          - ou-api-cluster-test-us-east-1-aws.cf.platform.domain.cloud
        import_into_ks: keypair
        name: unison-tls
      - create_data:
          ca_cert: true
          delete_pods_labels:
          - k8s-app=kubernetes-dashboard
          key_size: 2048
          secret_info:
            cert_name: dashboard.crt
            key_name: dashboard.key
            type_of_secret: Opaque
          server_name: kubernetes-dashboard.kubernetes-dashboard.svc
          sign_by_k8s_ca: false
          subject_alternative_names: []
          target_namespace: kubernetes-dashboard
        import_into_ks: certificate
        name: kubernetes-dashboard
        replace_if_exists: true
        tls_secret_name: kubernetes-dashboard-certs
      - create_data:
          ca_cert: true
          key_size: 2048
          server_name: unison-saml2-rp-sig
          sign_by_k8s_ca: false
          subject_alternative_names: []
        import_into_ks: keypair
        name: unison-saml2-rp-sig
    static_keys:
    - name: session-unison
      version: 1
    - name: lastmile-oidc
      version: 1
    trusted_certificates: []
    update_controller:
      days_to_expire: 10
      image: docker.io/tremolosecurity/kubernetes-artifact-deployment:1.1.0
      schedule: 0 2 * * *
  myvd_configmap: ""
  non_secret_data:
  - name: K8S_URL
    value: https://ou-api-cluster-test-us-east-1-aws.cf.platform.domain.cloud
  - name: SESSION_INACTIVITY_TIMEOUT_SECONDS
    value: "36000"
  - name: K8S_DASHBOARD_NAMESPACE
    value: kubernetes-dashboard
  - name: K8S_DASHBOARD_SERVICE
    value: kubernetes-dashboard
  - name: K8S_CLUSTER_NAME
    value: cluster-test-us-east-1-aws.cf.platform.domain.cloud
  - name: K8S_IMPERSONATION
    value: "true"
  - name: PROMETHEUS_SERVICE_ACCOUNT
    value: system:serviceaccount:monitoring:prometheus-k8s
  - name: OIDC_CLIENT_ID
    value: 0oa7xj9mjmUzwlJ7S357
  - name: OIDC_IDP_AUTH_URL
    value: https://xxxxxxxxxxx-compid-us.okta.com/oauth2/v1/authorize
  - name: OIDC_IDP_TOKEN_URL
    value: https://xxxxxxxxxxx-compid-us.okta.com/oauth2/v1/token
  - name: OIDC_IDP_LIMIT_DOMAIN
    value: ""
  - name: SUB_CLAIM
    value: sub
  - name: EMAIL_CLAIM
    value: email
  - name: GIVEN_NAME_CLAIM
    value: given_name
  - name: FAMILY_NAME_CLAIM
    value: family_name
  - name: DISPLAY_NAME_CLAIM
    value: name
  - name: GROUPS_CLAIM
    value: groups
  - name: OIDC_USER_IN_IDTOKEN
    value: "false"
  - name: OIDC_IDP_USER_URL
    value: https://xxxxxxxxxxx-compid-us.okta.com/oauth2/v1/userinfo
  - name: OIDC_SCOPES
    value: openid email profile groups
  - name: OU_SVC_NAME
    value: openunison-orchestra.openunison.svc
  - name: K8S_TOKEN_TYPE
    value: legacy
  openunison_network_configuration:
    activemq_dir: /tmp/amq
    allowed_client_names: []
    ciphers:
    - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
    - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
    - TLS_RSA_WITH_AES_256_GCM_SHA384
    - TLS_ECDH_ECDSA_WITH_AES_256_GCM_SHA384
    - TLS_ECDH_RSA_WITH_AES_256_GCM_SHA384
    - TLS_DHE_RSA_WITH_AES_256_GCM_SHA384
    client_auth: none
    force_to_secure: true
    open_external_port: 80
    open_port: 8080
    path_to_deployment: /usr/local/openunison/work
    path_to_env_file: /etc/openunison/ou.env
    quartz_dir: /tmp/quartz
    secure_external_port: 443
    secure_key_alias: unison-tls
    secure_port: 8443
  replicas: 2
  secret_data:
  - K8S_DB_SECRET
  - unisonKeystorePassword
  - OIDC_CLIENT_SECRET
  source_secret: orchestra-secrets-source
status:
  conditions:
    lastTransitionTime: 2022-12-18 04:14:34GMT
    status: "True"
    type: Completed
  digest: xxxxxxxxxxxxxx8f9hDKdTvxxxxxxxxxeMf689oe79PPVfLzs=

mlbiam commented 1 year ago

It looks like you have prune set to true; is Argo CD deleting the secret? The logs show the unison-tls Secret being created.

shnigam2 commented 1 year ago

Hi Marc,

The logs were captured after adding an annotation as a test, and that is what caused the secret to be created; otherwise the operator does not create the secret on its own. As for Argo CD, we are not sure whether it is deleting the secret, but it is certainly not being recreated on its own.

Regards

mlbiam commented 1 year ago

Let's take a step back, because i think i lost the thread somewhere. What is the actual error you are seeing?

That said, there are two ways a Secret is deleted:

  1. When the openunison object that was used to create it is deleted - This is done by the operator as part of the cleanup process
  2. The check-certs CronJob will delete a TLS Secret 10 days before it expires so that the next time the operator runs it will generate a new one.

The operator recreated the unison-tls Secret as expected looking at your logs, but the kube-oidc-proxy Deployment doesn't rely on the unison-tls Secret so I'm not really sure what issue you are seeing. Please provide a specific error message or log.
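The expiry check in item 2 can be reproduced locally with openssl. This is a sketch, not OpenUnison's actual code path: it generates a throwaway self-signed certificate valid for 365 days (the unison-tls lifetime discussed later in this thread) and asks whether it expires within the next 10 days. The file paths and CN below are placeholders.

```shell
# Create a throwaway self-signed cert valid for 365 days (placeholder paths/CN).
openssl req -x509 -newkey rsa:2048 -nodes \
  -keyout /tmp/unison-tls.key -out /tmp/unison-tls.crt \
  -days 365 -subj "/CN=openunison-orchestra.openunison.svc" 2>/dev/null

# -checkend exits 0 (and prints "Certificate will not expire") if the cert
# is still valid 10 days (864000 seconds) from now.
if openssl x509 -in /tmp/unison-tls.crt -noout -checkend 864000; then
  echo "unison-tls: not yet due for rotation"
fi
```

Against a live cluster, the same check can be run on the real Secret by piping `kubectl get secret unison-tls -n openunison -o jsonpath='{.data.tls\.crt}' | base64 -d` into `openssl x509 -noout -checkend 864000`.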

shnigam2 commented 1 year ago

Hi Marc,

We have found the kube-oidc-proxy pod stuck in ContainerCreating state, with the reason below:

"108s  Warning FailedMount pod/kube-oidc-proxy-orchestra-7f9cb569c5-phpk5 MountVolume.SetUp failed for volume "kube-oidc-proxy-tls" : secret "unison-tls" not found"

As you mentioned, the check-certs CronJob deletes a TLS Secret 10 days before it expires so that the next time the operator runs it will generate a new one.

So when will the operator run again to recreate the missing secret on its own? What is happening is that the unison-tls secret goes missing (possibly due to the cert check CronJob), and if the kube-oidc-proxy container restarts it looks for unison-tls, which the pod mounts as a volume at /etc/oidc/tls.

mlbiam commented 1 year ago

So when will the operator run again to recreate the missing secret on its own? What is happening is that the unison-tls secret goes missing (possibly due to the cert check CronJob)

The operator gets triggered by the cert check CronJob adding an annotation tremolo.io/cert-manager to the orchestra openunison object. That doesn't look like it happened. I'll look into that. It looks like you have an older version of the operator. There was a bug that was fixed where a long running instance of the operator stopped receiving updates on the watch. That was fixed, so pull in the latest operator image.

and if the kube-oidc-proxy container restarts it looks for unison-tls, which the pod mounts as a volume at /etc/oidc/tls.

You're right. Sorry, forgot that the proxy uses it for its own TLS cert. That said, you should now have the unison-tls Secret? According to the logs you posted from the operator when you added the annotation to the orchestra OpenUnison object the secret was created:

Invoking javascript
——————————
Processing key 'unison-tls'
Checking if kubernetes secret exists
Creating keypair
Creating secret
Posting secret
Storing to keystore
Key 'unison-tls' finished
0
1

Once the Secret was created the proxy should have been able to start?

shnigam2 commented 1 year ago

Hi Marc,

Yes, once the secret exists the oidc-proxy pod starts running. So, to fix auto-creation of a secret deleted by the cronjob, do we need to update the operator to the latest version, which has the fix for this bug?

Also, we would like to know: for how many days is the unison-tls certificate valid?

And is there any dependency between the operator version and the revision of the openunison orchestra object, or can we go directly to the latest version of the operator?

Regards

mlbiam commented 1 year ago

Yes, once the secret exists the oidc-proxy pod starts running. So, to fix auto-creation of a secret deleted by the cronjob, do we need to update the operator to the latest version, which has the fix for this bug?

Correct

Also, we would like to know: for how many days is the unison-tls certificate valid?

365 days

And is there any dependency between the operator version and the revision of the openunison orchestra object, or can we go directly to the latest version of the operator?

No dependency; however, the openunison deployment you are using is EOL at the end of the year (in about 6 days), so I would recommend upgrading.

shnigam2 commented 1 year ago

Hi Marc,

But we have checked, and we are already pulling the latest image of the operator; see the first values.yaml I provided. Also, could you let us know how to check the running version of the operator on the pod to cross-check? We are not deploying components separately; we use the Helm chart for the complete set of OpenUnison components.

Regards

mlbiam commented 1 year ago

You might have had the latest tag, but it was probably a long time since the pod had restarted. Looking at the logs you provided:

https://xxx.xx.x.x:443/apis/openunison.tremolo.io/v6/namespaces/openunison/openunisons?watch=true&timeoutSeconds=30&resourceVersion=38933394

There's no allowWatchBookmarks=true (which is what fixed the issue with long-running operators not picking up changes). The fix went into the javascript operator (which the openunison operator is built on) on October 12.

shnigam2 commented 1 year ago

Hi Marc,

I tried restarting the operator pod, but the logs still do not show allowWatchBookmarks=true. Please let me know how to apply the latest version of the operator so that it picks up the image that has the fix for this issue.

>> k get po -n openunison|grep -i operator
openunison-operator-cfd9f7847-cfsll                    1/1     Running                  0          4m30s

>> k logs openunison-operator-cfd9f7847-cfsll -c openunison-operator  -n openunison                      
Using version 'openunison.tremolo.io/v6'
https://xxx.xx0.1:443/apis/openunison.tremolo.io/v6/namespaces/openunison/openunisons?watch=true&timeoutSeconds=30&resourceVersion=46867043
https://xxx.xx.0.1:443/apis/openunison.tremolo.io/v6/namespaces/openunison/openunisons?watch=true&timeoutSeconds=30&resourceVersion=46867043
shnigam2 commented 1 year ago

Hi Marc,

We were able to deploy an operator version that has allowWatchBookmarks=true, but the secrets are still not being created:

>> k logs openunison-operator-6d8d7fcf57-vzsks    -c openunison-operator -n openunison
Using version 'openunison.tremolo.io/v6'
https://xxx.xx.0.1:443/apis/openunison.tremolo.io/v6/namespaces/openunison/openunisons?watch=true&timeoutSeconds=30&allowWatchBookmarks=true&resourceVersion=132866827
Watch failed : {"type":"ERROR","object":{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"too old resource version: 132866827 (233690612)","reason":"Expired","code":410}}
https://172.20.0.1:443/apis/openunison.tremolo.io/v6/namespaces/openunison/openunisons?watch=true&timeoutSeconds=30&allowWatchBookmarks=true
Resource 132866827  has already been processed, skipping
Warning: Nashorn engine is planned to be removed from a future JDK release
Loading Script : '/usr/local/openunison/js/deploy-objs.js'
Loading Script : '/usr/local/openunison/js/deploy-openshift.js'
Loading Script : '/usr/local/openunison/js/deploy-upstream-k8s.js'
Loading Script : '/usr/local/openunison/js/globals.js'
Loading Script : '/usr/local/openunison/js/helpers.js'
Loading Script : '/usr/local/openunison/js/operator.js'
Invoking javascript
in js : {"type":"BOOKMARK","object":{"metadata":{"resourceVersion":"238673341"},"apiVersion":"openunison.tremolo.io\/v6","kind":"OpenUnison"}}
Done invoking javascript
Checking if need to create a status for : 'BOOKMARK'
https://xxx.xx.0.1:443/apis/openunison.tremolo.io/v6/namespaces/openunison/openunisons?watch=true&timeoutSeconds=30&allowWatchBookmarks=true&resourceVersion=238673341

Steps taken:

  1. Updated the image in the operator deployment.
  2. Re-ran the cron job manually to check whether it sends an update to the operator about the missing secret, but that is not happening.

Please let us know how to proceed further.

shnigam2 commented 1 year ago

Hi Marc,

Can we have an update? We tested by upgrading the operator to the latest image and running the cert job manually.

The job went into an error state, showing the error below after it deleted the secret at 355 days:


expiring
cert needs to make it into the openunison keystore, deleting
Key 'unison-saml2-rp-sig' finished
Restarting OpenUnison
Exception in thread "main" javax.script.ScriptException: ReferenceError: "selfLink" is not defined in <eval> at line number 273
    at jdk.nashorn.api.scripting.NashornScriptEngine.throwAsScriptException(NashornScriptEngine.java:470)
    at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:454)
    at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:406)
    at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:402)
    at jdk.nashorn.api.scripting.NashornScriptEngine.eval(NashornScriptEngine.java:150)
    at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:249)
    at com.tremolosecurity.kubernetes.artifacts.run.RunDeployment.main(RunDeployment.java:123)
Caused by: <eval>:273 ReferenceError: "selfLink" is not defined
    at jdk.nashorn.internal.runtime.ECMAErrors.error(ECMAErrors.java:57)
    at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:319)
    at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:291)
    at jdk.nashorn.internal.objects.Global.__noSuchProperty__(Global.java:1442)
    at jdk.nashorn.internal.scripts.Script$\^eval\_.:program(<eval>:273)
    at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:637)
    at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:494)
    at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:393)
    at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:449)
    ... 5 more

After this, the cron job ran again and completed without doing anything:

d email profile groups"},{"name":"OU_SVC_NAME","value":"openunison-orchestra.openunison.svc"},{"name":"K8S_TOKEN_TYPE","value":"legacy"}],"openunison_network_configuration":{"activemq_dir":"/tmp/amq","allowed_client_names":[],"ciphers":["TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384","TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384","TLS_RSA_WITH_AES_256_GCM_SHA384","TLS_ECDH_ECDSA_WITH_AES_256_GCM_SHA384","TLS_ECDH_RSA_WITH_AES_256_GCM_SHA384","TLS_DHE_RSA_WITH_AES_256_GCM_SHA384"],"client_auth":"none","force_to_secure":true,"open_external_port":80,"open_port":8080,"path_to_deployment":"/usr/local/openunison/work","path_to_env_file":"/etc/openunison/ou.env","quartz_dir":"/tmp/quartz","secure_external_port":443,"secure_key_alias":"unison-tls","secure_port":8443},"replicas":2,"secret_data":["K8S_DB_SECRET","unisonKeystorePassword","OIDC_CLIENT_SECRET"],"source_secret":"orchestra-secrets-source"},"status":{"conditions":{"lastTransitionTime":"2022-07-27 09:18:33GMT","status":"True","type":"Completed"},"digest":"Ttg9THkR3aGB2ka/qKR1SfrSmRxqCgGPeulcSc26Poc="}}],"kind":"OpenUnisonList","metadata":{"continue":"","resourceVersion":"377094519"}}
}
openunisons found

Processing key 'unison-tls'
Checking if kubernetes secret exists
Key 'unison-tls' finished

Processing key 'kubernetes-dashboard'
Checking if kubernetes secret exists
Key 'kubernetes-dashboard' finished

Processing key 'unison-saml2-rp-sig'
Checking if kubernetes secret exists
Key 'unison-saml2-rp-sig' finished

Request you to please update about the next steps.

mlbiam commented 1 year ago

Try changing imagePullPolicy on the check-certs-orchestra CronJob from IfNotPresent to Always and run it manually (you can do it in the dashboard or using the exec-cronjob kubectl plugin). Do you still get the "selfLink" is not defined error?
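The two changes above can be sketched as kubectl commands. Resource names are taken from the YAML later in this thread, and `kubectl create job --from=cronjob/...` is used in place of the exec-cronjob plugin; this is a cluster-side fragment, so it assumes a live cluster and cannot be verified offline.

```shell
# Force the CronJob to always pull its image (sketch; names from this thread).
kubectl patch cronjob check-certs-orchestra -n openunison --type=json -p='[
  {"op": "replace",
   "path": "/spec/jobTemplate/spec/template/spec/containers/0/imagePullPolicy",
   "value": "Always"}]'

# Trigger a one-off run of the CronJob without waiting for the schedule.
kubectl create job check-certs-manual -n openunison \
  --from=cronjob/check-certs-orchestra
kubectl logs -n openunison job/check-certs-manual -f
```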

shnigam2 commented 1 year ago

Hi Marc,

Last time we ran the cronjob manually, the job went into an error state with the selfLink errors. We are now waiting for the cronjob to run on its own tonight.

Just one question: if it runs on its own per the schedule, will it recreate the secret? Could you please let us know.

mlbiam commented 1 year ago

I asked you to make the change because it appears that it's using an old container. Changing imagePullPolicy to Always makes sure you're pulling the latest artifact-deployment image, which has the fix to avoid the selfLink error.

shnigam2 commented 1 year ago

If the certs got deleted by a job that then hit the selfLink error while restarting OpenUnison, will only that same job trigger OpenUnison to re-run when a secret is missing? Or, if I create the job manually and run it, will that also restart OpenUnison when secrets are missing?

shnigam2 commented 1 year ago

We have tried the following: 1) the latest image of the operator, and 2) the 1.1.0 image of kubernetes-artifact-deployment.

We identified one cluster with a secret that was 327 days old, edited the cert CronJob with imagePullPolicy: Always, and set the expiry threshold to 38 days so the secret would be deleted. The secret was deleted, and the job went into an error state again with the selfLink errors in the logs:

Exception in thread "main" javax.script.ScriptException: ReferenceError: "selfLink" is not defined in <eval> at line number 273
    at jdk.nashorn.api.scripting.NashornScriptEngine.throwAsScriptException(NashornScriptEngine.java:470)
    at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:454)
    at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:406)
    at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:402)
    at jdk.nashorn.api.scripting.NashornScriptEngine.eval(NashornScriptEngine.java:150)
    at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:249)
    at com.tremolosecurity.kubernetes.artifacts.run.RunDeployment.main(RunDeployment.java:123)
Caused by: <eval>:273 ReferenceError: "selfLink" is not defined
    at jdk.nashorn.internal.runtime.ECMAErrors.error(ECMAErrors.java:57)
    at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:319)
    at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:291)
    at jdk.nashorn.internal.objects.Global.__noSuchProperty__(Global.java:1442)
    at jdk.nashorn.internal.scripts.Script$\^eval\_.:program(<eval>:273)
    at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:637)
    at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:494)
    at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:393)
    at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:449)
    ... 5 more

Cronjob Yaml :

apiVersion: batch/v1
kind: CronJob
metadata:
  creationTimestamp: "2022-02-17T13:38:29Z"
  generation: 5
  labels:
    app: openunison-orchestra
    operated-by: openunison-operator
  name: check-certs-orchestra
  namespace: openunison
  resourceVersion: "186489759"
  uid: b299d0f7-9065-4d4a-95f0-ecaa160bf9e2
spec:
  concurrencyPolicy: Allow
  failedJobsHistoryLimit: 1
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      backoffLimit: 1
      template:
        metadata:
          creationTimestamp: null
        spec:
          containers:
          - command:
            - java
            - -jar
            - /usr/local/artifactdeploy/artifact-deploy.jar
            - -extraCertsPath
            - /etc/extracerts
            - -installScriptURL
            - file:///etc/input-maps/cert-check.js
            - -kubernetesURL
            - https://kubernetes.default.svc.cluster.local
            - -rootCaPath
            - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            - -secretsPath
            - /etc/input-maps/input.props
            - -tokenPath
            - /var/run/secrets/kubernetes.io/serviceaccount/token
            - -deploymentTemplate
            - file:///etc/input-maps/deployment.yaml
            env:
            - name: CERT_DAYS_EXPIRE
              value: "38"
            image: docker.io/tremolosecurity/kubernetes-artifact-deployment:1.1.0
            imagePullPolicy: Always
            name: check-certs-orchestra
            resources: {}
            terminationMessagePath: /dev/termination-log
            terminationMessagePolicy: File
            volumeMounts:
            - mountPath: /etc/extracerts
              name: extra-certs-dir
              readOnly: true
            - mountPath: /etc/input-maps
              name: input-maps
              readOnly: true
          dnsPolicy: ClusterFirst
          restartPolicy: Never
          schedulerName: default-scheduler
          securityContext: {}
          serviceAccount: openunison-operator
          serviceAccountName: openunison-operator
          terminationGracePeriodSeconds: 30
          volumes:
          - configMap:
              defaultMode: 420
              name: cert-controller-js-orchestra
            name: extra-certs-dir
          - configMap:
              defaultMode: 420
              name: cert-controller-js-orchestra
            name: input-maps
  schedule: 0 2 * * *
  successfulJobsHistoryLimit: 3
  suspend: false
status:
  lastScheduleTime: "2023-01-11T02:00:00Z"
  lastSuccessfulTime: "2023-01-11T03:19:08Z"
mlbiam commented 1 year ago

Strange, I just checked and that code is gone. Try changing the image to docker.io/tremolosecurity/betas:kad110. It's the same image, but for some reason your cluster isn't pulling the latest image.

shnigam2 commented 1 year ago

Hi Marc,

Is this the image we need to use for the check-certs CronJob?

mlbiam commented 1 year ago

Yes

shnigam2 commented 1 year ago

Hi Marc,

Same error again. Please find the cron pod YAML below; we edited the cert expiry days for testing. The result is the same: the secret got deleted, but restarting OpenUnison gives the same error.

apiVersion: v1
kind: Pod
metadata:
  annotations:
    cni.projectcalico.org/containerID: 0edc67caa236c433da8d1c0f2f9e5a54ca4ddfb5978aaa8ed7df631bade51ba7
    cni.projectcalico.org/podIP: ""
    cni.projectcalico.org/podIPs: ""
    kubernetes.io/psp: eks.privileged
  creationTimestamp: "2023-01-11T12:07:16Z"
  generateName: check-certs-orchestra-27890050-
  labels:
    controller-uid: 3dbbe738-65e5-4283-88ef-3c963775686f
    job-name: check-certs-orchestra-27890050
  name: check-certs-orchestra-27890050-dkl4l
  namespace: openunison
  ownerReferences:
  - apiVersion: batch/v1
    blockOwnerDeletion: true
    controller: true
    kind: Job
    name: check-certs-orchestra-27890050
    uid: 3dbbe738-65e5-4283-88ef-3c963775686f
  resourceVersion: "181461100"
  uid: d878abaf-6c7a-45b4-9ab0-085cbc39718d
spec:
  containers:
  - command:
    - java
    - -jar
    - /usr/local/artifactdeploy/artifact-deploy.jar
    - -extraCertsPath
    - /etc/extracerts
    - -installScriptURL
    - file:///etc/input-maps/cert-check.js
    - -kubernetesURL
    - https://kubernetes.default.svc.cluster.local
    - -rootCaPath
    - /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    - -secretsPath
    - /etc/input-maps/input.props
    - -tokenPath
    - /var/run/secrets/kubernetes.io/serviceaccount/token
    - -deploymentTemplate
    - file:///etc/input-maps/deployment.yaml
    env:
    - name: CERT_DAYS_EXPIRE
      value: "57"
    image: docker.io/tremolosecurity/betas:kad110
    imagePullPolicy: IfNotPresent
    name: check-certs-orchestra
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/extracerts
      name: extra-certs-dir
      readOnly: true
    - mountPath: /etc/input-maps
      name: input-maps
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-89kb8
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  nodeName: ip-10-132-160-223.ec2.internal
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: openunison-operator
  serviceAccountName: openunison-operator
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - configMap:
      defaultMode: 420
      name: cert-controller-js-orchestra
    name: extra-certs-dir
  - configMap:
      defaultMode: 420
      name: cert-controller-js-orchestra
    name: input-maps
  - name: kube-api-access-89kb8
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-01-11T12:07:16Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-01-11T12:08:09Z"
    message: 'containers with unready status: [check-certs-orchestra]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-01-11T12:08:09Z"
    message: 'containers with unready status: [check-certs-orchestra]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-01-11T12:07:16Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: docker://a8819118450159615ce65d20964437d0c03373158481a059318aaafe40e79708
    image: tremolosecurity/betas:kad110
    imageID: docker-pullable://tremolosecurity/betas@sha256:7c83e9b6ee0918d3db0fda9f4a0dd09cf1983d0896dee58cf8a318953b40f6e1
    lastState: {}
    name: check-certs-orchestra
    ready: false
    restartCount: 0
    started: false
    state:
      terminated:
        containerID: docker://a8819118450159615ce65d20964437d0c03373158481a059318aaafe40e79708
        exitCode: 1
        finishedAt: "2023-01-11T12:08:08Z"
        reason: Error
        startedAt: "2023-01-11T12:08:06Z"

The error we receive again:

expiring
cert needs to make it into the openunison keystore, deleting
Key 'unison-saml2-rp-sig' finished
Restarting OpenUnison
Exception in thread "main" javax.script.ScriptException: ReferenceError: "selfLink" is not defined in <eval> at line number 273
    at jdk.nashorn.api.scripting.NashornScriptEngine.throwAsScriptException(NashornScriptEngine.java:470)
    at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:454)
    at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:406)
    at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:402)
    at jdk.nashorn.api.scripting.NashornScriptEngine.eval(NashornScriptEngine.java:150)
    at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:249)
    at com.tremolosecurity.kubernetes.artifacts.run.RunDeployment.main(RunDeployment.java:123)
Caused by: <eval>:273 ReferenceError: "selfLink" is not defined
    at jdk.nashorn.internal.runtime.ECMAErrors.error(ECMAErrors.java:57)
    at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:319)
    at jdk.nashorn.internal.runtime.ECMAErrors.referenceError(ECMAErrors.java:291)
    at jdk.nashorn.internal.objects.Global.__noSuchProperty__(Global.java:1442)
    at jdk.nashorn.internal.scripts.Script$\^eval\_.:program(<eval>:273)
    at jdk.nashorn.internal.runtime.ScriptFunctionData.invoke(ScriptFunctionData.java:637)
    at jdk.nashorn.internal.runtime.ScriptFunction.invoke(ScriptFunction.java:494)
    at jdk.nashorn.internal.runtime.ScriptRuntime.apply(ScriptRuntime.java:393)
    at jdk.nashorn.api.scripting.NashornScriptEngine.evalImpl(NashornScriptEngine.java:449)
    ... 5 more
mlbiam commented 1 year ago

Which version of the openunison-operator helm chart are you referencing in your argocd project?

shnigam2 commented 1 year ago

Hi @mlbiam Below are the Argo CD Application objects with Helm values. We also tried upgrading the operator image to one with the allowWatchBookmarks=true parameter, but that produced the same error.

Openunison

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  annotations:
    argocd.argoproj.io/sync-wave: '-2'
  name: openunison
  namespace: argocd
spec:
  destination:
    namespace: openunison
    server: 'https://xxxxxxxx'
  project: cnt
  source:
    chart: openunison-operator
    helm:
      releaseName: openunison
      values: |-
        {
          "image": "xxxxxxxx/openunison-k8s-operator:xxxxxxxx",
          "services": {
            "pullSecret": "jfrog-auth"
          }
        }
    repoURL: 'https://nexus.tremolo.io/repository/helm/'
    targetRevision: 2.0.6
  syncPolicy:
    automated:
      prune: true
    syncOptions:
      - ApplyOutOfSyncOnly=true
mlbiam commented 1 year ago

Going back through the code, I think there's a chicken-and-egg problem. The operator creates and updates the check-certs configmap, which has the code that is broken. So if you're running the latest operator, "touching" the orchestra openunison object will cause the check-certs ConfigMap to get updated. The problem is that has to happen before your CronJob runs. So to get a cluster updated:

  1. Update to the latest openunison operator image (this contains the fixed javascript)
  2. "touch" your orchestra openunison object by adding an annotation, forcing the operator to fix the ConfigMap (openunison will redeploy)

At that point you should have a cluster that will automatically fix its certificates. We're going to move that ConfigMap out of the operator; it doesn't really need to be there anymore.
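Step 2's "touch" can be as simple as adding a throwaway annotation. This is a sketch: the annotation key and value are arbitrary placeholders (any metadata change works as a nudge), and a live cluster is assumed.

```shell
# A timestamp annotation is an easy no-op "touch" that makes the operator
# re-process the orchestra openunison object (key/value are placeholders).
kubectl annotate openunison orchestra -n openunison \
  last-touched="$(date +%s)" --overwrite
```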

shnigam2 commented 1 year ago

Hi @mlbiam ,

Yes, we have verified that adding an annotation to the openunison orchestra object creates the missing secrets.

So, with the configmap moved out of the operator, will recreation of the secret happen without manually touching the openunison orchestra object?

Regards Shobhit

mlbiam commented 1 year ago

Yes we have tried that doing annotation on openunison orchestra object will create missing secrets.

Right, but I'm talking about the cert-controller-js-orchestra ConfigMap. This ConfigMap stores the javascript that checks the certificates, and it contains the bug that keeps the cronjob from patching the orchestra openunison object to trigger the redeployment. The ConfigMap is generated and updated by the operator (a remnant of when all of OpenUnison's manifests were managed by the operator). So, by updating to the latest image of the operator and adding an annotation to the orchestra openunison object, the operator will see that there's a new version of the ConfigMap and update what's in the API server.

So by moving configmap out of operator will fix recreation of secret will happen without manual touching openunison orchestra object?

It will eliminate the need for the operator to create and manage the configmap that has the code in it. The combination of updating the configmap and getting the latest operator will ultimately be the final fix so that the certificate update happens in an automated way.

That said, if you follow the above instructions (get the latest operator image and add the annotation to the orchestra object) you'll see the cert-controller-js-orchestra is updated. At that point, the next time the cert-check job runs the process will be automated and you won't need to intervene.

I'll point out that the repository you are deploying from (https://github.com/openunison/openunison-k8s-login-oidc) is officially End Of Life (EOL). So the update will go into the orchestra chart, but it won't be in the openunison-k8s-login-oidc repository.

shnigam2 commented 1 year ago

Hi @mlbiam,

Is there a specific annotation we should add to orchestra, or will any annotation do, like the one we added earlier to create the secret? In one cluster I followed these steps:

  1. Deleted the unison-tls secret manually.
  2. Updated the operator image to the latest, which has allowWatchBookmarks=true.
  3. Annotated orchestra to recreate the secret (the secret got created).
  4. Request you to have a look at the ConfigMap we have after the steps above, and advise whether, if we delete the secret again, the check-certs job will recreate it on its own through the ConfigMap.
    
    >> k edit openunison orchestra -n openunison
    openunison.openunison.tremolo.io/orchestra edited
    >> k get secret -n openunison               
    NAME                               TYPE                                  DATA   AGE
    default-token-gfbzn                kubernetes.io/service-account-token   3      356d
    dockerhub-auth                     kubernetes.io/dockerconfigjson        1      356d
    jfrog-auth                         kubernetes.io/dockerconfigjson        1      356d
    openunison-operator-token-lw4g7    kubernetes.io/service-account-token   3      356d
    openunison-orchestra-token-fqf8v   kubernetes.io/service-account-token   3      356d
    orchestra                          Opaque                                4      356d
    orchestra-secrets-source           Opaque                                3      356d
    orchestra-static-keys              Opaque                                2      5s
    ou-tls-certificate                 kubernetes.io/tls                     3      351d
    ou-tls-main-certificate            kubernetes.io/tls                     3      351d
    root-ca                            Opaque                                1      356d
    unison-saml2-rp-sig                kubernetes.io/tls                     2      43h
    unison-tls                         kubernetes.io/tls                     2      5s
    venafi-token                       Opaque                                1      356d

>> k get cm -n openunison
NAME                           DATA   AGE
api-server-config              1      356d
cert-controller-js-orchestra   4      356d
kube-root-ca.crt               1      356d

ConfigMap YAML:

```
k get cm cert-controller-js-orchestra -n openunison -o yaml
```

```yaml
apiVersion: v1
data:
  cert-check.js: |
    var CertUtils = Java.type("com.tremolosecurity.kubernetes.artifacts.util.CertUtils");
    var NetUtil = Java.type("com.tremolosecurity.kubernetes.artifacts.util.NetUtil");
    var k8s_namespace = 'openunison';
    var redploy_openunison = false;
    var System = Java.type("java.lang.System");
    var Integer = Java.type("java.lang.Integer")


    function process_key_pair_config(cfg_obj,key_config) {
      print("\n\nProcessing key '" + key_config.name + "'");
      create_keypair_template = cfg_obj.key_store.key_pairs.create_keypair_template;

      secret_info = key_config.create_data.secret_info;

      if (secret_info == null) {
        secret_info = {};
        secret_info['type_of_secret'] = 'kubernetes.io/tls';
        secret_info['cert_name'] = 'tls.crt';
        secret_info['key_name'] = 'tls.key';
      }

      //determine the namespace of the secret
      target_ns = k8s_namespace;
      if (key_config.create_data.target_namespace != null && key_config.create_data.target_namespace !== "") {
        target_ns = key_config.create_data.target_namespace;
      }

      var secret_name = "";
      if (key_config.tls_secret_name != null && key_config.tls_secret_name !== "") {
        secret_name = key_config.tls_secret_name;
      } else {
        secret_name = key_config.name;
      }

      //check if the secret already exists
      print("Checking if kubernetes secret exists")
      secret_response = k8s.callWS("/api/v1/namespaces/" + target_ns + "/secrets/" + secret_name,"",-1);
      secret_exists = false;

      if (secret_response.code == 200) {
        print("Secret exists")
        secret_json = JSON.parse(secret_response.data);

        if (secret_json.metadata != null && secret_json.metadata.labels != null && secret_json.metadata.labels["operated-by"] != null && secret_json.metadata.labels["operated-by"] == "openunison-operator") {

          //Managed by the operator, lets see if it needs to be rebuilt

          //first, check to see if the cert is going to expire
          var cert_from_secret = new java.lang.String(java.util.Base64.getDecoder().decode(secret_json.data[secret_info.cert_name]));
          print(cert_from_secret);
          if (CertUtils.isCertExpiring(CertUtils.string2cert(secret_json.data[secret_info.cert_name]),Integer.parseInt(System.getenv("CERT_DAYS_EXPIRE")))) {
            print("expiring");

            if (key_config.import_into_ks === "keypair" || key_config.import_into_ks === "certificate") {
              print("cert needs to make it into the openunison keystore, deleting");
              k8s.deleteWS("/api/v1/namespaces/" + target_ns + "/secrets/" + secret_name);
              redploy_openunison = true;
            } else {
              print("secret needs to be recreated");
              create_certificate(target_ns,cfg_obj,key_config,secret_info,secret_name);
            }
          } else {
            print("not expiring");
          }
        }
      }

      /*
      */

      print("Key '" + key_config.name + "' finished");
    }


    function create_certificate(target_ns,cfg_obj,key_config,secret_info,secret_name) {
      print("Creating keypair");

      //time to create the keypair
      //process the create template and the ca cert flag
      certInfo = {};
      for (var i=0;i<create_keypair_template.length;i++) {
        certInfo[create_keypair_template[i].name] = create_keypair_template[i].value;
      }
      certInfo["caCert"] = key_config.create_data.ca_cert;
      certInfo["size"] = key_config.create_data.key_size;

      //figure out the server name/cn and subject alternative names
      server_name = key_config.create_data.server_name;
      certInfo["serverName"] = server_name;

      if (key_config.create_data.subject_alternative_names != null && key_config.create_data.subject_alternative_names.length > 0) {
        certInfo["subjectAlternativeNames"] = [];
        for (i=0;i<key_config.create_data.subject_alternative_names.length;i++) {
          certInfo["subjectAlternativeNames"].push(script_val(key_config.create_data.subject_alternative_names[i]));
        }
      }

      x509data = CertUtils.createCertificate(certInfo);

      if (key_config.create_data.sign_by_k8s_ca) {
        print("Signing by Kubernetes' CA");
        csrReq = {
          "apiVersion": "certificates.k8s.io/v1beta1",
          "kind": "CertificateSigningRequest",
          "metadata": {
            "name": server_name,
          },
          "spec": {
            "request": java.util.Base64.getEncoder().encodeToString(CertUtils.generateCSR(x509data).getBytes("utf-8")),
            "usages": [
              "digital signature",
              "key encipherment",
              "server auth"
            ]
          }
        };

        print("Posting CSR");
        apiResp = k8s.postWS('/apis/certificates.k8s.io/v1beta1/certificatesigningrequests',JSON.stringify(csrReq));

        if (apiResp.code == 409) {
          print("Existing CSR, deleting");
          k8s.deleteWS('/apis/certificates.k8s.io/v1beta1/certificatesigningrequests/' + server_name);
          apiResp = k8s.postWS('/apis/certificates.k8s.io/v1beta1/certificatesigningrequests',JSON.stringify(csrReq));
        }

        approveReq = JSON.parse(apiResp.data);
        approveReq.status.conditions = [
          {
            "type":"Approved",
            "reason":"OpenUnison Deployment",
            "message":"This CSR was approved by the OpenUnison operator"
          }
        ];

        print("Approving CSR");
        apiResp = k8s.putWS('/apis/certificates.k8s.io/v1beta1/certificatesigningrequests/' + server_name + '/approval',JSON.stringify(approveReq));

        print("Retrieving signed certificate");
        apiResp = k8s.callWS('/apis/certificates.k8s.io/v1beta1/certificatesigningrequests/' + server_name);

        certResp = JSON.parse(apiResp.data);
        b64cert = certResp.status.certificate;

        if (b64cert == null || b64cert === "") {
          print("CertManager is not enabled on this cluster. Change sign_by_k8s_cluster to false");
          exit(1);
        }

        CertUtils.importSignedCert(x509data,b64cert);
      }

      //create tls secret
      print("Creating secret");

      secret_to_create = {
        "apiVersion":"v1",
        "kind":"Secret",
        "type":secret_info.type_of_secret,
        "metadata": {
          "name": secret_name,
          "namespace": target_ns,
          "labels": {
            "tremolo_operator_created":"true",
            "operated-by": "openunison-operator"
          }
        },
        "data":{
        }
      };

      secret_to_create.data[ secret_info.cert_name ] = java.util.Base64.getEncoder().encodeToString(CertUtils.exportCert(x509data.getCertificate()).getBytes("UTF-8"));
      secret_to_create.data[ secret_info.key_name ] = java.util.Base64.getEncoder().encodeToString(CertUtils.exportKey(x509data.getKeyData().getPrivate()).getBytes("UTF-8"));

      print("Deleting existing secret");
      k8s.deleteWS("/api/v1/namespaces/" + target_ns + "/secrets/" + secret_name);

      print("Posting secret");
      k8s.postWS('/api/v1/namespaces/' + target_ns + '/secrets',JSON.stringify(secret_to_create));

      if (! isEmpty(key_config.create_data.patch_info)) {
        print("Patching to push updates");
        var annotation_value = "";
        var patch = {"metadata":{"annotations" : {}}};

        patch.metadata.annotations[key_config.create_data.patch_info.annotation_name] = annotation_value;
        k8s.patchWS(key_config.create_data.patch_info.obj_url,JSON.stringify(patch));

      } else if (key_config.create_data.delete_pods_labels != null && key_config.create_data.delete_pods_labels.length > 0) {
        print("Deleting pods per labels");
        var label_selectors = '';
        for (var ii = 0;ii < key_config.create_data.delete_pods_labels.length;ii++) {
          if (ii > 0) {
            label_selectors = label_selectors + '&';
          }

          label_selectors = label_selectors + key_config.create_data.delete_pods_labels[ii];
        }
        pods_list_response = k8s.deleteWS('/api/v1/namespaces/' + target_ns + '/pods?labelSelector=' + label_selectors);
        print("Pods deleted");
      }
    }


    print("Loading openunisons");
    uriBase = '/apis/openunison.tremolo.io/v1/namespaces/openunison/openunisons';
    search_res = k8s.callWS(uriBase);
    print(search_res);
    if (search_res.code == 200) {
      print("openunisons found");
      openunisons = JSON.parse(search_res.data)["items"];
      for (var i = 0;i<openunisons.length;i++) {
        var openunison = openunisons[i];

        var keys = openunison.spec.key_store.key_pairs.keys;
        for (var j = 0;j<keys.length;j++) {
          var key = keys[j];
          process_key_pair_config(openunison.spec,key);
        }
      }

      if (redploy_openunison) {
        print("Restarting OpenUnison");
        patch = {
          "metadata": {
            "annotations": {
              "tremolo.io/cert-manager": (new org.joda.time.DateTime().toString())
            }
          }
        };

        selfLink = uriBase + "/" + openunison.metadata.name;

        k8s.patchWS(selfLink,JSON.stringify(patch));
      }
    } else {
      print("Error - could not load openunisons - " + JSON.stringify(search_res));
    }
  deployment.yaml: ""
  diget: hDklvin6qlBt8RYTrF3Cm7tkykcF+J5l2gOu7fnHS90=
  input.props: ""
kind: ConfigMap
metadata:
  creationTimestamp: "2022-01-20T08:12:00Z"
  labels:
    app: openunison-orchestra
    operated-by: openunison-operator
  name: cert-controller-js-orchestra
  namespace: openunison
  resourceVersion: "377145264"
  uid: 468888fd-79ab-40f7-aaf8-10844f816557
```
shnigam2 commented 1 year ago

Hi @mlbiam,

The updated ConfigMap worked, thanks.

shnigam2 commented 1 year ago

Hi @mlbiam ,

Also, we are facing an issue where the orchestra pods are logging the error below, which is causing the URLs to not work properly.

[2023-01-10 14:59:21,971][Thread-11] ERROR K8sWatcher - Could not run watch, waiting 10 seconds

Will the operator image upgrade and ConfigMap update fix this error as well? Please let us know.

Regards Shobhit
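The "Could not run watch" error above usually goes together with the 410 Gone ("too old resource version") responses shown earlier in the operator log: the API server expires old `resourceVersion`s, so a watcher must drop its stale version and relist before watching again. A minimal sketch of that retry decision (illustrative only, not the operator's actual code):

```javascript
// Decide the next watch URL after a watch attempt ends.
// On HTTP 410 (Gone) the cached resourceVersion is too old and must be
// dropped so the client relists from the API server's current state.
function nextWatchUrl(base, resourceVersion, statusCode) {
  if (statusCode === 410 || resourceVersion == null) {
    return base + "?watch=true&timeoutSeconds=30";
  }
  return base + "?watch=true&timeoutSeconds=30&resourceVersion=" + resourceVersion;
}

const base = "/apis/openunison.tremolo.io/v6/namespaces/openunison/openunisons";
console.log(nextWatchUrl(base, "38933394", 200)); // resumes from cached version
console.log(nextWatchUrl(base, "38933394", 410)); // relists without a version
```

If the watcher keeps retrying with the expired version instead of relisting, it loops on the same error, which matches the repeated "has already been processed, skipping" lines in the operator log.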

mlbiam commented 1 year ago

Please open a new issue and include the full stack trace.

mlbiam commented 1 year ago

This issue appears to be resolved, closing