Soluto / kamus

An open source, git-ops, zero-trust secret encryption and decryption solution for Kubernetes applications
https://kamus.soluto.io
Apache License 2.0

kamus-controller not working and no log output #577

Closed t0st closed 3 years ago

t0st commented 4 years ago

Describe the bug

I updated Kamus via Helm template from 0.4.9 to 0.4.12. Since then, the kamus-controller has stopped working: no Secret is created when I create a KamusSecret, and the kamus-controller produces no log output. If I change the image back from "soluto/kamus:controller-0.7.0.0" to "soluto/kamus:controller-0.6.7.0", I get the normal Kestrel log output as expected.

I also tried to enable debug logging by adding "Serilog__MinimumLevel: Debug" to the kamus-controller ConfigMap, but the log remains empty.

Versions used

Kamus (API images): 0.7.0.0
Chart version: 0.4.12
KMS provider: AWS KMS
Kubernetes flavour and version:

oc version
Client Version: 4.5.0-0.okd-2020-07-14-153706-ga
Server Version: 4.5.0-0.okd-2020-07-14-153706-ga
Kubernetes Version: v1.18.3

To Reproduce

Steps to reproduce the behavior:

  1. helm fetch soluto/kamus --version 0.4.12 --untar --untardir ./charts
  2. helm template kamus --values ./values/kamus/values.yaml --output-dir ./manifests ./charts/kamus
  3. kubectl apply -f manifests/kamus/templates/
  4. kubectl apply -f kamus-secret.yaml -n kamus
  5. kubectl logs -l "app=kamus,component=controller"

cat kamus-secret.yaml
apiVersion: soluto.com/v1alpha2
kind: KamusSecret
metadata:
  name: yet-another-test
type: kubernetes.io/dockerconfigjson
data:
  .dockerconfigjson: >-
    env$SomeEncryptedString

Expected behavior

The kamus-controller should react to the create event of the KamusSecret and create a Secret with the decrypted content.

shaikatz commented 4 years ago

We've just released version 0.7.1.0 (chart 0.4.13), which is supposed to fix a regression that was introduced in AWS KMS usage. Let me know if it helps.

t0st commented 4 years ago

Sadly not :-/

That's the pod:

oc get pod kamus-controller-679cf9fb5-4vgm5 -o yaml

kind: Pod
apiVersion: v1
metadata:
  generateName: kamus-controller-679cf9fb5-
  annotations:
    checksum/config: cb46ab176776148b126d60587d40f547807c44cac055862466d1854e97826e5a
    checksum/secret: a8611f673ed15ddcd8f37b4ea632c9003ba544e27c001caeb1e290c50563e1fe
    checksum/tls-secret: f0d3e8d9e9fd55cad23deab43c5823790547ded75baa18360990d66af9aea8ea
    k8s.v1.cni.cncf.io/network-status:
    k8s.v1.cni.cncf.io/networks-status:
    openshift.io/scc: restricted
  selfLink: /api/v1/namespaces/kamus/pods/kamus-controller-679cf9fb5-4vgm5
  resourceVersion: '64390504'
  name: kamus-controller-679cf9fb5-4vgm5
  uid:
  creationTimestamp: '2020-08-26T09:55:02Z'
  namespace: kamus
  ownerReferences:
    - apiVersion: apps/v1
      kind: ReplicaSet
      name: kamus-controller-679cf9fb5
      uid: 
      controller: true
      blockOwnerDeletion: true
  labels:
    app: kamus
    component: controller
    pod-template-hash: 679cf9fb5
    release: kamus
spec:
  restartPolicy: Always
  serviceAccountName: kamus-controller
  imagePullSecrets:
    - name: kamus-controller-dockercfg-59q84
  priority: 0
  schedulerName: default-scheduler
  enableServiceLinks: true
  terminationGracePeriodSeconds: 30
  nodeName: 
  securityContext:
    seLinuxOptions:
      level: 's0:c28,c27'
    fsGroup: 1000810000
  containers:
    - resources:
        limits:
          cpu: 100m
          memory: 128Mi
        requests:
          cpu: 100m
          memory: 128Mi
      readinessProbe:
        httpGet:
          path: /healthz
          port: 9999
          scheme: HTTP
        timeoutSeconds: 1
        periodSeconds: 10
        successThreshold: 1
        failureThreshold: 3
      terminationMessagePath: /dev/termination-log
      name: controller
      livenessProbe:
        httpGet:
          path: /healthz
          port: 9999
          scheme: HTTP
        timeoutSeconds: 1
        periodSeconds: 10
        successThreshold: 1
        failureThreshold: 3
      securityContext:
        capabilities:
          drop:
            - KILL
            - MKNOD
            - SETGID
            - SETUID
        runAsUser: 1000810000
      imagePullPolicy: IfNotPresent
      volumeMounts:
        - name: secret-volume
          mountPath: /home/dotnet/app/secrets
        - name: tls-secret-volume
          mountPath: /home/dotnet/app/tls
        - name: kamus-controller-token-nnj4w
          readOnly: true
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      terminationMessagePolicy: File
      envFrom:
        - configMapRef:
            name: kamus-controller
      image: 'soluto/kamus:controller-0.7.1.0'
  automountServiceAccountToken: true
  serviceAccount: kamus-controller
  volumes:
    - name: secret-volume
      secret:
        secretName: kamus
        defaultMode: 420
    - name: tls-secret-volume
      secret:
        secretName: kamus-controller
        defaultMode: 420
    - name: kamus-controller-token-nnj4w
      secret:
        secretName: kamus-controller-token-nnj4w
        defaultMode: 420
  dnsPolicy: ClusterFirst
  tolerations:
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
    - key: node.kubernetes.io/memory-pressure
      operator: Exists
      effect: NoSchedule
status:
  phase: Running
  conditions:
    - type: Initialized
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2020-08-26T09:55:02Z'
    - type: Ready
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2020-08-26T09:55:37Z'

Still no log output and no secret.

t0st commented 4 years ago

If I switch back to the image "soluto/kamus:controller-0.6.7.0", I get this log output immediately:

oc logs -l "app=kamus,component=controller"
Hosting environment: Production
Content root path: /home/dotnet/app
Now listening on: https://0.0.0.0:8888
Now listening on: http://0.0.0.0:9999
Application started. Press Ctrl+C to shut down.
{"Timestamp":"2020-08-26T11:26:16.0643159+00:00","Level":"Information","MessageTemplate":"Handling event of type {type}. KamusSecret {name} in namespace {namespace}","Properties":{"type":"Added","name":"yet-another-test","namespace":"kamus","SourceContext":"CustomResourceDescriptorController.HostedServices.V1Alpha2Controller"}}
{"Timestamp":"2020-08-26T11:26:16.3583423+00:00","Level":"Debug","MessageTemplate":"Starting decrypting KamusSecret items. KamusSecret {name} in namespace {namespace}","Properties":{"name":"yet-another-test","namespace":"kamus","SourceContext":"CustomResourceDescriptorController.HostedServices.V1Alpha2Controller"}}
{"Timestamp":"2020-08-26T11:26:18.4612527+00:00","Level":"Debug","MessageTemplate":"KamusSecret items decrypted successfully. KamusSecret {name} in namespace {namespace}","Properties":{"name":"yet-another-test","namespace":"kamus","SourceContext":"CustomResourceDescriptorController.HostedServices.V1Alpha2Controller"}}
{"Timestamp":"2020-08-26T11:26:19.4608032+00:00","Level":"Error","MessageTemplate":"Error while handling KamusSecret event of type {eventType}, for KamusSecret {name} on namespace {namespace}","Exception":"Microsoft.Rest.HttpOperationException: Operation returned an invalid status code 'Forbidden'\n   at k8s.Kubernetes.CreateNamespacedSecretWithHttpMessagesAsync(V1Secret body, String namespaceParameter, String dryRun, String fieldManager, String pretty, Dictionary`2 customHeaders, CancellationToken cancellationToken)\n   at k8s.KubernetesExtensions.CreateNamespacedSecretAsync(IKubernetes operations, V1Secret body, String namespaceParameter, String dryRun, String fieldManager, String pretty, CancellationToken cancellationToken)\n   at CustomResourceDescriptorController.HostedServices.V1Alpha2Controller.HandleAdd(KamusSecret kamusSecret, Boolean isUpdate) in /app/crd-controller/HostedServices/V1Alpha2Controller.cs:line 167\n   at CustomResourceDescriptorController.HostedServices.V1Alpha2Controller.HandleEvent(WatchEventType event, KamusSecret kamusSecret) in /app/crd-controller/HostedServices/V1Alpha2Controller.cs:line 84","Properties":{"eventType":"Added","name":"yet-another-test","namespace":"kamus","SourceContext":"CustomResourceDescriptorController.HostedServices.V1Alpha2Controller"}}

shaikatz commented 4 years ago

Maybe you can join the Slack and send me a direct message so we can solve it faster?

shaikatz commented 4 years ago

Hi @t0st, is this still relevant? Did you find a solution for your issue?

t0st commented 4 years ago

Hi @shaikatz! Sorry for the late reply. The issue is still present. I'm currently debugging the crd-controller (Ver. 0.8.0.0). The Secret gets decrypted but the process hangs on

var createdSecret =
                await mKubernetes.CreateNamespacedSecretAsync(secret, secret.Metadata.NamespaceProperty);

And I don't know why it can't create the secret. The service account has the cluster role with these rules:

rules:
  - verbs:
      - watch
    apiGroups:
      - soluto.com
    resources:
      - kamussecrets
  - verbs:
      - create
      - delete
      - patch
    apiGroups:
      - ''
    resources:
      - secrets

Can you give me a hint on how to configure Serilog so that the output of mLogger.Debug(...) goes to my stdout/console?

shaikatz commented 4 years ago

You can run the container with the env var Serilog__MinimumLevel set to the value Debug. Or, if you are debugging the code locally, you can just edit the appsettings.Development.json file and change the MinimumLevel key.
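
For readers unfamiliar with the double underscore: in .NET configuration, an environment variable name containing "__" is read as a hierarchical key, with "__" standing in for the ":" separator. The sketch below only illustrates that name mapping in Python. The function names are hypothetical, it is not Kamus or Serilog code, and it says nothing about whether the controller image actually honours the setting.

```python
def env_to_config_key(name: str) -> str:
    """Map an environment variable name to a .NET-style configuration key.

    The .NET environment variable provider treats "__" as the ":" separator.
    """
    return name.replace("__", ":")


def config_from_env(environ: dict) -> dict:
    """Build a flat {config key: value} view of the given environment (illustrative only)."""
    return {env_to_config_key(k): v for k, v in environ.items()}


if __name__ == "__main__":
    print(config_from_env({"Serilog__MinimumLevel": "Debug"}))
```

So Serilog__MinimumLevel=Debug is expected to surface as the MinimumLevel key under the Serilog section, the same key that the appsettings file carries.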

ianjuma commented 4 years ago

@shaikatz setting this env Serilog__MinimumLevel on the controller does not appear to work.

      serviceAccountName: kamus-controller
      automountServiceAccountToken: true
      containers:
        - name: controller
          image: soluto/kamus:controller-0.8.0.0
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: secret-volume
              mountPath: /home/dotnet/app/secrets
            - name: tls-secret-volume
              mountPath: /home/dotnet/app/tls
          livenessProbe:
            httpGet:
              path: /healthz
              port: 9999
          readinessProbe:
            httpGet:
              path: /healthz
              port: 9999
          resources:
            limits:
              cpu: 500m
              memory: 600Mi
            requests:
              cpu: 100m
              memory: 128Mi
          envFrom:
           - configMapRef:
              name: kamus-controller
          env:
          - name: Serilog__MinimumLevel
            value: Debug
      volumes:
        - name: secret-volume
          secret:
            secretName: kamus
        - name: tls-secret-volume
          secret:
            secretName: kamus-controller

t0st commented 3 years ago

The root cause of my problem was the exception

Operation returned an invalid status code 'Forbidden'\n   at k8s.Kubernetes.CreateNamespacedSecretWithHttpMessagesAsync(V1Secret body

during secret creation.

This is caused by the service account 'kamus-controller' lacking permission to update the finalizers subresource of kamussecrets.

The controller can create the secret successfully after adding this to the role 'kamus-controller':

  - verbs:
      - update
    apiGroups:
      - soluto.com
    resources:
      - kamussecrets/finalizers

I opened Soluto/helm-charts#48 for this fix.
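
To make the gap visible, here is a minimal sketch of how the role's rules match a request. It is a simplified, hypothetical matcher written for illustration (exact match or "*" only), not the real Kubernetes RBAC authorizer. The rule values are copied from this thread. Why the finalizers permission is needed at all is my assumption, not something confirmed here: a likely explanation is the OwnerReferencesPermissionEnforcement admission plugin, which requires update rights on the owner's finalizers subresource when an object is created with an owner reference that has blockOwnerDeletion set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """One RBAC policy rule, reduced to the three fields used in this thread."""
    verbs: tuple
    api_groups: tuple
    resources: tuple


def allows(rules, verb, api_group, resource):
    """Return True if any rule grants `verb` on `resource` in `api_group`.

    Simplified on purpose: exact match or "*" only.
    """
    return any(
        (verb in r.verbs or "*" in r.verbs)
        and (api_group in r.api_groups or "*" in r.api_groups)
        and (resource in r.resources or "*" in r.resources)
        for r in rules
    )


# Rules the service account had before the fix (copied from the thread).
ORIGINAL_RULES = [
    Rule(("watch",), ("soluto.com",), ("kamussecrets",)),
    Rule(("create", "delete", "patch"), ("",), ("secrets",)),
]

# Rule added by the fix (copied from the thread).
FINALIZERS_RULE = Rule(("update",), ("soluto.com",), ("kamussecrets/finalizers",))

if __name__ == "__main__":
    print(allows(ORIGINAL_RULES, "create", "", "secrets"))
    print(allows(ORIGINAL_RULES, "update", "soluto.com", "kamussecrets/finalizers"))
    print(allows(ORIGINAL_RULES + [FINALIZERS_RULE], "update", "soluto.com", "kamussecrets/finalizers"))
```

Under this model the original role already allows creating secrets, which is why the Forbidden error was confusing: the missing piece is the separate update permission on kamussecrets/finalizers.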

shaikatz commented 3 years ago

Hi, thanks for submitting the fix. This is actually an issue only with OpenShift, which is why we never encountered it before.

shaikatz commented 3 years ago

Chart fix is done.