aquasecurity / trivy-operator

Kubernetes-native security toolkit
https://aquasecurity.github.io/trivy-operator/latest
Apache License 2.0

Node-collector stays in Pending state #1610

Closed truongnht closed 11 months ago

truongnht commented 11 months ago

We upgraded trivy-operator from v16.0 to v16.4 (chart version 18.4) using ArgoCD. After the upgrade, trivy-operator spins up node-collector jobs. A few went fine, however one job was scheduled on a cordoned node. Because of that, the node-collector stays in Pending until we terminate it ourselves. I have shared the logs here for reference. A quick question: do you exclude cordoned nodes from scheduling, or is this a race condition that leads to this situation?

truongnht commented 11 months ago

After manually deleting the job, I noticed that the node-collector job spins up again, targeting a node which is no longer available.

chen-keinan commented 11 months ago

@truongnht thanks for reporting, I'll review it and update you.

truongnht commented 11 months ago

Hi @chen-keinan, I am curious whether you have any results from the investigation?

chen-keinan commented 11 months ago

> Hi @chen-keinan, I am curious whether you have any results from the investigation?

@truongnht due to KubeCon (North America) prep and attendance it took longer; I'll get to it next week.

chen-keinan commented 11 months ago

@truongnht are you using Karpenter?

truongnht commented 11 months ago

Yes, we are using Karpenter.

chen-keinan commented 11 months ago

I'm not a Karpenter expert; my guess is that it could be related to the topologyKey podAntiAffinity.

I'm not sure how to fix it when using Karpenter, but you can disable infraassessment as a workaround.

I also suggest using a toleration if possible.
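A minimal sketch of the disable-infraassessment workaround via Helm values, assuming the chart exposes the toggle as operator.infraAssessmentScannerEnabled (the exact key may differ between chart versions, so check your chart's values.yaml):

operator:
  # turn off the infra assessment scanner, which is what schedules node-collector jobs
  infraAssessmentScannerEnabled: false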

cwrau commented 11 months ago

We're running into a similar problem, without using Karpenter.

But trivy-operator starts a node-collector job that selects a control-plane node without the necessary toleration.

truongnht commented 11 months ago

> I'm not a Karpenter expert; my guess is that it could be related to the topologyKey podAntiAffinity.
>
> I'm not sure how to fix it when using Karpenter, but you can disable infraassessment as a workaround.
>
> I also suggest using a toleration if possible.

@chen-keinan, indeed we took the workaround of disabling infraassessment, but it would be better if this ticket were fixed.

chen-keinan commented 11 months ago

> I'm not a Karpenter expert; my guess is that it could be related to the topologyKey podAntiAffinity. I'm not sure how to fix it when using Karpenter, but you can disable infraassessment as a workaround. I also suggest using a toleration if possible.
>
> @chen-keinan, indeed we took the workaround of disabling infraassessment, but it would be better if this ticket were fixed.

Sure, one does not depend on the other.

cwrau commented 7 months ago

I see this issue has been closed as completed, but it is still happening to us, and I don't see how #1644 was supposed to fix it.

We had to re-enable the node-selector, but trivy still creates jobs without tolerations for control-plane nodes, and I assume for nodes tainted for other reasons as well?

Should I open a new issue?

chen-keinan commented 7 months ago

@cwrau sure, please open a new issue and add the details. Note: you can choose whether to use the node selector; by default the node-collector runs on every deployed node.

cwrau commented 7 months ago

> @cwrau sure, please open a new issue and add the details. Note: you can choose whether to use the node selector; by default the node-collector runs on every deployed node.

OK, but how is trivy supposed to collect node info while running on another node? 😅 I see the node-collector job is mounting stuff from the host?

That's why we enabled the node-selector, but with the node-selector the job won't be scheduled on the control-plane, due to taints.

chen-keinan commented 7 months ago

@cwrau

> That's why we enabled the node-selector, but with the node-selector the job won't be scheduled on the control-plane, due to taints.

Not sure what your use case is; this ability has been requested by other community members. Are you setting a toleration? Can you provide more info on your use case?

cwrau commented 7 months ago

> this ability has been requested by other community members.

Yeah, and I don't know how they expect this feature to work without the node-selector 😅 Or, is it working without the node-selector? I can't imagine how, as it's mounting stuff from the host and all InfraAssessmentReports are the same.

> Are you setting a toleration?

No, is that required? I can't find anything helpful about the node-selector in the docs. And since trivy decides to launch the jobs, I would have assumed it would also figure out the tolerations to set, or skip those nodes.

> Can you provide more info on your use case?

I don't know how to explain it in more detail; I just want trivy to scan the nodes 😅

chen-keinan commented 7 months ago

> this ability has been requested by other community members.
>
> Yeah, and I don't know how they expect this feature to work without the node-selector 😅 Or, is it working without the node-selector? I can't imagine how, as it's mounting stuff from the host and all InfraAssessmentReports are the same.
>
> Are you setting a toleration?
>
> No, is that required? I can't find anything helpful about the node-selector in the docs. And since trivy decides to launch the jobs, I would have assumed it would also figure out the tolerations to set, or skip those nodes.
>
> Can you provide more info on your use case?
>
> I don't know how to explain it in more detail; I just want trivy to scan the nodes 😅

Providing more info means adding your configuration (ConfigMap), logs, environment type (cloud, on-prem), trivy-operator version, a screenshot from the stuck pod, or anything else that could help me understand what the problem is.

But first I suggest you set a toleration if you want the pod to be scheduled on a tainted node.
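For example, a minimal sketch of such a toleration for control-plane nodes via the trivy-operator ConfigMap, assuming the scanJob.tolerations key (exposed as trivyOperator.scanJobTolerations in the Helm chart) is available in your version; the exact key may differ:

  scanJob.tolerations: '[{"key":"node-role.kubernetes.io/control-plane","operator":"Exists","effect":"NoSchedule"}]'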

cwrau commented 7 months ago

> this ability has been requested by other community members.
>
> Yeah, and I don't know how they expect this feature to work without the node-selector 😅 Or, is it working without the node-selector? I can't imagine how, as it's mounting stuff from the host and all InfraAssessmentReports are the same.
>
> Are you setting a toleration?
>
> No, is that required? I can't find anything helpful about the node-selector in the docs. And since trivy decides to launch the jobs, I would have assumed it would also figure out the tolerations to set, or skip those nodes.
>
> Can you provide more info on your use case?
>
> I don't know how to explain it in more detail; I just want trivy to scan the nodes 😅

> Providing more info means adding your configuration (ConfigMap)

  compliance.failEntriesLimit: "10"
  configAuditReports.scanner: Trivy
  node.collector.imageRef: ghcr.io/aquasecurity/node-collector:0.1.1
  node.collector.nodeSelector: "true"
  nodeCollector.volumeMounts: '[{"mountPath":"/var/lib/etcd","name":"var-lib-etcd","readOnly":true},{"mountPath":"/var/lib/kubelet","name":"var-lib-kubelet","readOnly":true},{"mountPath":"/var/lib/kube-scheduler","name":"var-lib-kube-scheduler","readOnly":true},{"mountPath":"/var/lib/kube-controller-manager","name":"var-lib-kube-controller-manager","readOnly":true},{"mountPath":"/etc/systemd","name":"etc-systemd","readOnly":true},{"mountPath":"/lib/systemd/","name":"lib-systemd","readOnly":true},{"mountPath":"/etc/kubernetes","name":"etc-kubernetes","readOnly":true},{"mountPath":"/etc/cni/net.d/","name":"etc-cni-netd","readOnly":true}]'
  nodeCollector.volumes: '[{"hostPath":{"path":"/var/lib/etcd"},"name":"var-lib-etcd"},{"hostPath":{"path":"/var/lib/kubelet"},"name":"var-lib-kubelet"},{"hostPath":{"path":"/var/lib/kube-scheduler"},"name":"var-lib-kube-scheduler"},{"hostPath":{"path":"/var/lib/kube-controller-manager"},"name":"var-lib-kube-controller-manager"},{"hostPath":{"path":"/etc/systemd"},"name":"etc-systemd"},{"hostPath":{"path":"/lib/systemd"},"name":"lib-systemd"},{"hostPath":{"path":"/etc/kubernetes"},"name":"etc-kubernetes"},{"hostPath":{"path":"/etc/cni/net.d/"},"name":"etc-cni-netd"}]'
  report.recordFailedChecksOnly: "true"
  scanJob.podTemplateContainerSecurityContext: '{"allowPrivilegeEscalation":false,"capabilities":{"drop":["ALL"]},"privileged":false,"readOnlyRootFilesystem":true,"runAsGroup":10000,"runAsNonRoot":true,"runAsUser":10000}'
  scanJob.podTemplatePodSecurityContext: '{"seccompProfile":{"type":"RuntimeDefault"}}'
  vulnerabilityReports.scanner: Trivy

> logs

No helpful logs, only that the node is found and the job is getting scheduled:

DEBUG   node-reconciler Getting node from cache {"node": {"name":"1111-teuto-scan-2207-control-plane-l42ss-gj8w9"}}                                        
DEBUG   node-reconciler Checking whether cluster Infra assessments report exists        {"node": {"name":"1111-teuto-scan-2207-control-plane-l42ss-gj8w9"}}
DEBUG   node-reconciler Checking whether Node info collector job have been scheduled    {"node": {"name":"1111-teuto-scan-2207-control-plane-l42ss-gj8w9"}}
DEBUG   node-reconciler Checking node collector jobs limit      {"node": {"name":"1111-teuto-scan-2207-control-plane-l42ss-gj8w9"}, "count": 0, "limit": 3}
DEBUG   node-reconciler Scheduling Node collector job   {"node": {"name":"1111-teuto-scan-2207-control-plane-l42ss-gj8w9"}}                                

> environment type (cloud, on-prem)

private cloud -> on-prem

> trivy-operator version

0.18.5

> a screenshot from the stuck pod or anything else that could help me understand what the problem is

The Job:

apiVersion: batch/v1
kind: Job
metadata:
  annotations:
    batch.kubernetes.io/job-tracking: ""
  creationTimestamp: "2024-03-12T09:34:41Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: trivy-operator
    node-info.collector: Trivy
    trivy-operator.resource.kind: Node
    trivy-operator.resource.name: 1111-teuto-scan-2207-control-plane-l42ss-gj8w9
  name: node-collector-756ffb6f47
  namespace: trivy
  resourceVersion: "487568794"
  uid: 13954e28-513e-47d9-b563-4ca968cc06b0
spec:
  activeDeadlineSeconds: 900
  backoffLimit: 0
  completionMode: NonIndexed
  completions: 1
  parallelism: 1
  selector:
    matchLabels:
      batch.kubernetes.io/controller-uid: 13954e28-513e-47d9-b563-4ca968cc06b0
  suspend: false
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: node-collector
        batch.kubernetes.io/controller-uid: 13954e28-513e-47d9-b563-4ca968cc06b0
        batch.kubernetes.io/job-name: node-collector-756ffb6f47
        controller-uid: 13954e28-513e-47d9-b563-4ca968cc06b0
        job-name: node-collector-756ffb6f47
    spec:
      automountServiceAccountToken: true
      containers:
      - args:
        - k8s
        - --node
        - 1111-teuto-scan-2207-control-plane-l42ss-gj8w9
        command:
        - node-collector
        image: ghcr.io/aquasecurity/node-collector:0.1.1
        imagePullPolicy: IfNotPresent
        name: node-collector
        resources:
          limits:
            cpu: 100m
            memory: 100M
          requests:
            cpu: 50m
            memory: 50M
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          privileged: false
          readOnlyRootFilesystem: true
          runAsGroup: 10000
          runAsNonRoot: true
          runAsUser: 10000
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /var/lib/etcd
          name: var-lib-etcd
          readOnly: true
        - mountPath: /var/lib/kubelet
          name: var-lib-kubelet
          readOnly: true
        - mountPath: /var/lib/kube-scheduler
          name: var-lib-kube-scheduler
          readOnly: true
        - mountPath: /var/lib/kube-controller-manager
          name: var-lib-kube-controller-manager
          readOnly: true
        - mountPath: /etc/systemd
          name: etc-systemd
          readOnly: true
        - mountPath: /lib/systemd/
          name: lib-systemd
          readOnly: true
        - mountPath: /etc/kubernetes
          name: etc-kubernetes
          readOnly: true
        - mountPath: /etc/cni/net.d/
          name: etc-cni-netd
          readOnly: true
      dnsPolicy: ClusterFirst
      hostPID: true
      nodeSelector:
        kubernetes.io/hostname: 1111-teuto-scan-2207-control-plane-l42ss-gj8w9
      restartPolicy: Never
      schedulerName: default-scheduler
      securityContext:
        seccompProfile:
          type: RuntimeDefault
      serviceAccount: trivy-trivy-operator
      serviceAccountName: trivy-trivy-operator
      terminationGracePeriodSeconds: 30
      volumes:
      - hostPath:
          path: /var/lib/etcd
          type: ""
        name: var-lib-etcd
      - hostPath:
          path: /var/lib/kubelet
          type: ""
        name: var-lib-kubelet
      - hostPath:
          path: /var/lib/kube-scheduler
          type: ""
        name: var-lib-kube-scheduler
      - hostPath:
          path: /var/lib/kube-controller-manager
          type: ""
        name: var-lib-kube-controller-manager
      - hostPath:
          path: /etc/systemd
          type: ""
        name: etc-systemd
      - hostPath:
          path: /lib/systemd
          type: ""
        name: lib-systemd
      - hostPath:
          path: /etc/kubernetes
          type: ""
        name: etc-kubernetes
      - hostPath:
          path: /etc/cni/net.d/
          type: ""
        name: etc-cni-netd
status:
  active: 1
  ready: 0
  startTime: "2024-03-12T09:34:41Z"
  uncountedTerminatedPods: {}

The Pod:

apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2024-03-12T09:34:41Z"
  finalizers:
  - batch.kubernetes.io/job-tracking
  generateName: node-collector-756ffb6f47-
  labels:
    app: node-collector
    batch.kubernetes.io/controller-uid: 13954e28-513e-47d9-b563-4ca968cc06b0
    batch.kubernetes.io/job-name: node-collector-756ffb6f47
    controller-uid: 13954e28-513e-47d9-b563-4ca968cc06b0
    job-name: node-collector-756ffb6f47
  name: node-collector-756ffb6f47-jpvvl
  namespace: trivy
  ownerReferences:
  - apiVersion: batch/v1
    blockOwnerDeletion: true
    controller: true
    kind: Job
    name: node-collector-756ffb6f47
    uid: 13954e28-513e-47d9-b563-4ca968cc06b0
  resourceVersion: "487568797"
  uid: addfcdb8-0182-4b5b-ad96-4b7cb2933494
spec:
  automountServiceAccountToken: true
  containers:
  - args:
    - k8s
    - --node
    - 1111-teuto-scan-2207-control-plane-l42ss-gj8w9
    command:
    - node-collector
    image: ghcr.io/aquasecurity/node-collector:0.1.1
    imagePullPolicy: Always
    name: node-collector
    resources:
      limits:
        cpu: 100m
        memory: 100M
      requests:
        cpu: 50m
        memory: 50M
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - ALL
      privileged: false
      readOnlyRootFilesystem: true
      runAsGroup: 10000
      runAsNonRoot: true
      runAsUser: 10000
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: var-lib-etcd
      readOnly: true
    - mountPath: /var/lib/kubelet
      name: var-lib-kubelet
      readOnly: true
    - mountPath: /var/lib/kube-scheduler
      name: var-lib-kube-scheduler
      readOnly: true
    - mountPath: /var/lib/kube-controller-manager
      name: var-lib-kube-controller-manager
      readOnly: true
    - mountPath: /etc/systemd
      name: etc-systemd
      readOnly: true
    - mountPath: /lib/systemd/
      name: lib-systemd
      readOnly: true
    - mountPath: /etc/kubernetes
      name: etc-kubernetes
      readOnly: true
    - mountPath: /etc/cni/net.d/
      name: etc-cni-netd
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-nn2j7
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostPID: true
  nodeSelector:
    kubernetes.io/hostname: 1111-teuto-scan-2207-control-plane-l42ss-gj8w9
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  serviceAccount: trivy-trivy-operator
  serviceAccountName: trivy-trivy-operator
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - hostPath:
      path: /var/lib/etcd
      type: ""
    name: var-lib-etcd
  - hostPath:
      path: /var/lib/kubelet
      type: ""
    name: var-lib-kubelet
  - hostPath:
      path: /var/lib/kube-scheduler
      type: ""
    name: var-lib-kube-scheduler
  - hostPath:
      path: /var/lib/kube-controller-manager
      type: ""
    name: var-lib-kube-controller-manager
  - hostPath:
      path: /etc/systemd
      type: ""
    name: etc-systemd
  - hostPath:
      path: /lib/systemd
      type: ""
    name: lib-systemd
  - hostPath:
      path: /etc/kubernetes
      type: ""
    name: etc-kubernetes
  - hostPath:
      path: /etc/cni/net.d/
      type: ""
    name: etc-cni-netd
  - name: kube-api-access-nn2j7
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2024-03-12T09:34:41Z"
    message: '0/6 nodes are available: 3 node(s) didn''t match Pod''s node affinity/selector,
      3 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption:
      0/6 nodes are available: 6 Preemption is not helpful for scheduling..'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: Burstable

> But first I suggest you set a toleration if you want the pod to be scheduled on a tainted node.

Yeah, that would be a short-term solution.

But I think this is a bigger problem. You removed the need for node-selectors for the ClusterInfraAssessmentReports, but that just doesn't make sense: trivy can't scan node A while being scheduled on node B.

Not enabling the node-selector completely invalidates the ClusterInfraAssessmentReports for nodes and gives a false sense of security/problems.

I might see some problems on "node A", spend hours searching for where they come from, how to fix them, and why trivy thinks that's the case even though I see it differently on the server, only to realize that trivy actually scanned node B and saved the result under node A.

The real, long-term solution should be to always enable the node-selector, read the taints of each node, check whether they should be tolerated (e.g. control-plane taints and other "static" taints, but not "real" taints like the one added by cordon), and create tolerations from that.
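For illustration, a sketch of that mapping, assuming a control-plane node; the derived tolerations shown here are hypothetical, not current trivy-operator behaviour:

# taints reported on the target node
taints:
- key: node-role.kubernetes.io/control-plane
  effect: NoSchedule
# tolerations the node-collector job would need to carry to land on it
tolerations:
- key: node-role.kubernetes.io/control-plane
  operator: Exists
  effect: NoSchedule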

chen-keinan commented 7 months ago

@cwrau the node-selector parameter is enabled by default. You can choose not to use it (by configuration); in terms of scanning, every node will still be collected by the node-collector.
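For reference, the toggle is the node.collector.nodeSelector entry already visible in the ConfigMap dump above; setting it to "false" turns the per-node selector off (the exact semantics may differ between operator versions):

  node.collector.nodeSelector: "false"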

cwrau commented 7 months ago

> @cwrau the node-selector parameter is enabled by default. You can choose not to use it (by configuration); in terms of scanning, every node will still be collected by the node-collector.

Ah, perfect, then the only missing part would be the tolerations.

Or, if trivy doesn't want to add tolerations by itself, it shouldn't try to schedule jobs for nodes with taints (that aren't covered by the tolerations)

chen-keinan commented 7 months ago

> @cwrau the node-selector parameter is enabled by default. You can choose not to use it (by configuration); in terms of scanning, every node will still be collected by the node-collector.
>
> Ah, perfect, then the only missing part would be the tolerations.
>
> Or, if trivy doesn't want to add tolerations by itself, it shouldn't try to schedule jobs for nodes with taints (that aren't covered by the tolerations)

This could be an enhancement.

ltdeoliveira commented 5 months ago

Any updates on this?

chen-keinan commented 5 months ago

@ltdeoliveira this can be easily fixed by adding a toleration to the node-collector scan job.

ltdeoliveira commented 5 months ago

> @ltdeoliveira this can be easily fixed by adding a toleration to the node-collector scan job.

@chen-keinan Could you please provide an example? I'm installing the operator with the Helm chart.

chen-keinan commented 5 months ago

@ltdeoliveira with the latest trivy-operator version you need to define the tolerations here.

If you are using an older version of trivy-operator, it should be configured here.
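A minimal Helm values sketch, assuming the chart exposes scan-job/node-collector tolerations as trivyOperator.scanJobTolerations (the exact key may differ between chart versions, so check the chart's values.yaml):

trivyOperator:
  scanJobTolerations:
    # allow the node-collector pod onto tainted control-plane nodes
    - key: node-role.kubernetes.io/control-plane
      operator: Exists
      effect: NoSchedule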

saarw-opti commented 4 months ago

> @ltdeoliveira with the latest trivy-operator version you need to define the tolerations here.
>
> If you are using an older version of trivy-operator, it should be configured here.

It's not working for me even after adding the tolerations; should I add anything else?

chen-keinan commented 4 months ago

@ltdeoliveira make sure a nodeAffinity is not interfering with the tolerations.
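In other words, a toleration only lifts the taint restriction; the pod's node affinity/selector still has to match the target node. A sketch of the two pod spec fields that have to agree (the node name here is illustrative):

nodeSelector:
  kubernetes.io/hostname: example-control-plane-node
tolerations:
- key: node-role.kubernetes.io/control-plane
  operator: Exists
  effect: NoSchedule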

Shirueopseo commented 2 months ago

I'm also having this problem with Karpenter nodes. Even after setting the tolerations and making sure there are no affinities, the node-collector job and pod try to deploy on Fargate profiles (even when forcing a nodeSelector for a Karpenter node, which seems to work for trivy-operator but not for the node-collector). I had to disable infraassessment. Any update on this?