Identify image vulnerabilities in Kubernetes pods
Use scanning from a different registry #28

aetomala commented 4 years ago

I have installed the CSO through OperatorHub on an OpenShift 4.3 instance. I also have a private Quay 3.2 instance configured to use Clair for scanning. I noticed that this operator only reports vulnerabilities for images from the quay.io registry and not for images from my private Quay repository. Is there something I need to change in the operator config or my registry so that the CSO also reports scan issues for images from my Quay repo?

BillDett commented 4 years ago

The operator should detect images from either quay.io or an on-premise Quay installation. We designed it so that no user configuration is needed. Can you share your Pod manifest here so we can see what's going on?

aetomala commented 4 years ago

baddocker manifest

kind: Pod
apiVersion: v1
metadata:
  name: baddocker
  namespace: default

baddocker2 manifest

kind: Pod
apiVersion: v1
metadata:
  name: baddocker2
  namespace: default

aetomala commented 4 years ago

Also, here is the CSO log from the point at which I deploy an image from our private repo. Notice the "No manifest security capabilities" error.

level=debug msg="Pod added" key=default/baddocker
E0129 18:01:06.195474       1 labeller.go:191] default/baddocker failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/baddocker
E0129 18:01:06.204532       1 labeller.go:191] default/baddocker failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/baddocker
level=debug msg="Pod updated" key=default/baddocker
E0129 18:01:06.208097       1 labeller.go:191] default/baddocker failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/baddocker
E0129 18:01:06.215373       1 labeller.go:191] default/baddocker failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/baddocker
level=debug msg="Pod updated" key=default/baddocker
E0129 18:01:06.242979       1 labeller.go:191] default/baddocker failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/baddocker
E0129 18:01:06.255677       1 labeller.go:191] default/baddocker failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/baddocker
E0129 18:01:06.416010       1 labeller.go:191] default/baddocker failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/baddocker
E0129 18:01:06.736529       1 labeller.go:191] default/baddocker failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/baddocker
level=debug msg="Pod updated" key=default/baddocker
E0129 18:01:07.142644       1 labeller.go:191] default/baddocker failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/baddocker
E0129 18:01:07.379413       1 labeller.go:191] default/baddocker failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/baddocker
level=debug msg="Pod updated" key=default/baddocker
level=info msg="Garbage collecting unreferenced ImageManifestVulns" key=default/baddocker
level=error msg="Failed to sync layer data" key=default/baddocker err="No manifest security capabilities"
level=info msg="Garbage collecting unreferenced ImageManifestVulns" key=default/baddocker
level=error msg="Failed to sync layer data" key=default/baddocker err="No manifest security capabilities"

kleesc commented 4 years ago

@aetomala can you make sure that the pod in which the CSO is running has access to your Quay instance? Specifically, the CSO uses Quay's /.well-known/app-capabilities endpoint for discovery.

aetomala commented 4 years ago

@kleesc The pod was deployed in OpenShift 4.3 using the OperatorHub catalog. When I hit install, there is no option to change anything about the container that is about to be deployed, other than which namespaces I want to monitor. When I look at the YAML for the deployed operator, I don't see how the configuration you are suggesting can be injected. I am not sure if this is the behavior Red Hat intended, but would you take a look at the YAML for the deployment and the pod and suggest what changes I need to make?

kind: Deployment
apiVersion: apps/v1
    deployment.kubernetes.io/revision: '1'
  selfLink: >-
  resourceVersion: '365892'
  name: container-security-operator
  uid: e46dea8f-cf9d-4c7c-b484-a25ca4c35865
  creationTimestamp: '2020-01-29T16:14:42Z'
  generation: 1
  namespace: openshift-operators
    - apiVersion: operators.coreos.com/v1alpha1
      kind: ClusterServiceVersion
      name: container-security-operator.v1.0.1
      uid: 6c1c2e9c-7168-48b4-9d33-407c245a4009
      controller: false
      blockOwnerDeletion: false
    olm.owner: container-security-operator.v1.0.1
    olm.owner.kind: ClusterServiceVersion
    olm.owner.namespace: openshift-operators
  replicas: 1
      name: container-security-operator-alm-owned
      name: container-security-operator-alm-owned
      creationTimestamp: null
        name: container-security-operator-alm-owned
        tectonic-visibility: ocs
        olm.targetNamespaces: ''
        repository: 'https://github.com/quay/container-security-operator'
        alm-examples: |-
              "apiVersion": "secscan.quay.redhat.com/v1alpha1",
              "kind": "ImageManifestVuln",
              "metadata": {
                "name": "example"
              "spec": {}
        capabilities: Full Lifecycle
        olm.operatorNamespace: openshift-operators
        containerImage: >-
        createdAt: '2019-11-16 01:03:00'
        categories: Security
        description: Identify image vulnerabilities in Kubernetes pods
        olm.operatorGroup: global-operators
        - name: container-security-operator
          image: >-
            - /bin/security-labeller
            - '--namespaces=$(WATCH_NAMESPACE)'
            - name: MY_POD_NAMESPACE
                  apiVersion: v1
                  fieldPath: metadata.namespace
            - name: MY_POD_NAME
                  apiVersion: v1
                  fieldPath: metadata.name
            - name: WATCH_NAMESPACE
                  apiVersion: v1
                  fieldPath: 'metadata.annotations[''olm.targetNamespaces'']'
          resources: {}
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: IfNotPresent
      restartPolicy: Always
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      serviceAccountName: container-security-operator
      serviceAccount: container-security-operator
      securityContext: {}
      schedulerName: default-scheduler
    type: RollingUpdate
      maxUnavailable: 25%
      maxSurge: 25%
  revisionHistoryLimit: 10
  progressDeadlineSeconds: 600
  observedGeneration: 1
  replicas: 1
  updatedReplicas: 1
  readyReplicas: 1
  availableReplicas: 1
    - type: Available
      status: 'True'
      lastUpdateTime: '2020-01-29T16:14:46Z'
      lastTransitionTime: '2020-01-29T16:14:46Z'
      reason: MinimumReplicasAvailable
      message: Deployment has minimum availability.
    - type: Progressing
      status: 'True'
      lastUpdateTime: '2020-01-29T16:14:46Z'
      lastTransitionTime: '2020-01-29T16:14:42Z'
      reason: NewReplicaSetAvailable
      message: >-
        ReplicaSet "container-security-operator-7b8fbb6f6" has successfully

The pod YAML:

kind: Pod
apiVersion: v1
  generateName: container-security-operator-7b8fbb6f6-
    tectonic-visibility: ocs
    olm.targetNamespaces: ''
    repository: 'https://github.com/quay/container-security-operator'
    alm-examples: |-
          "apiVersion": "secscan.quay.redhat.com/v1alpha1",
          "kind": "ImageManifestVuln",
          "metadata": {
            "name": "example"
          "spec": {}
    capabilities: Full Lifecycle
    olm.operatorNamespace: openshift-operators
    containerImage: >-
    createdAt: '2019-11-16 01:03:00'
    categories: Security
    description: Identify image vulnerabilities in Kubernetes pods
    olm.operatorGroup: global-operators
  selfLink: >-
  resourceVersion: '365889'
  name: container-security-operator-7b8fbb6f6-lvs5w
  uid: ffd5e9ca-1108-455f-a7b7-3b53f1c51733
  creationTimestamp: '2020-01-29T16:14:42Z'
  namespace: openshift-operators
    - apiVersion: apps/v1
      kind: ReplicaSet
      name: container-security-operator-7b8fbb6f6
      uid: e78279c5-4925-4270-a247-6c94797e4695
      controller: true
      blockOwnerDeletion: true
    name: container-security-operator-alm-owned
    pod-template-hash: 7b8fbb6f6
  restartPolicy: Always
  serviceAccountName: container-security-operator
    - name: container-security-operator-dockercfg-7bh4t
  priority: 0
  schedulerName: default-scheduler
  enableServiceLinks: true
  terminationGracePeriodSeconds: 30
  securityContext: {}
    - resources: {}
      terminationMessagePath: /dev/termination-log
      name: container-security-operator
        - /bin/security-labeller
        - '--namespaces=$(WATCH_NAMESPACE)'
        - name: MY_POD_NAMESPACE
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: MY_POD_NAME
              apiVersion: v1
              fieldPath: metadata.name
        - name: WATCH_NAMESPACE
              apiVersion: v1
              fieldPath: 'metadata.annotations[''olm.targetNamespaces'']'
      imagePullPolicy: IfNotPresent
        - name: container-security-operator-token-6jr9p
          readOnly: true
          mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      terminationMessagePolicy: File
      image: >-
  serviceAccount: container-security-operator
    - name: container-security-operator-token-6jr9p
        secretName: container-security-operator-token-6jr9p
        defaultMode: 420
  dnsPolicy: ClusterFirst
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 300
  phase: Running
    - type: Initialized
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2020-01-29T16:14:42Z'
    - type: Ready
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2020-01-29T16:14:46Z'
    - type: ContainersReady
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2020-01-29T16:14:46Z'
    - type: PodScheduled
      status: 'True'
      lastProbeTime: null
      lastTransitionTime: '2020-01-29T16:14:42Z'
    - ip:
  startTime: '2020-01-29T16:14:42Z'
    - restartCount: 0
      started: true
      ready: true
      name: container-security-operator
          startedAt: '2020-01-29T16:14:46Z'
      imageID: >-
      image: >-
      lastState: {}
      containerID: 'cri-o://ceeda131092cd7a01bb983c0d5b93a9f61edf0433e1eac54193bc69258984a6c'
  qosClass: BestEffort
aetomala commented 4 years ago

@kleesc If I read the code correctly (1.0.1 release), /.well-known/app-capabilities is already the default value for wellknownEndpoint. What the documentation does not make clear is what that file, /.well-known/app-capabilities, is supposed to look like, or whether that value can be set to the actual hostname. When I SSH into the pod I cannot locate /.well-known/app-capabilities. This led me to think that the file needs to be mounted into the pod. Would you clarify those two points?

Lastly, I noticed that in the master branch for this project, the security-labeller now makes use of another attribute, a scanner host, which makes more sense. https://github.com/jjmengze/container-security-operator/blob/master/cmd/security-labeller/main.go

kleesc commented 4 years ago

@aetomala /.well-known/app-capabilities is the discovery endpoint on Quay itself, not a file, e.g. https://some-quay-host/.well-known/app-capabilities. The CSO infers the registry an image was pulled from using the ImageID of the pod it is trying to scan.
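To illustrate the idea of inferring the registry from a pod's ImageID, here is a minimal Python sketch. It is not the CSO's own parsing code (the CSO is written in Go and its logic is not shown in this thread). The `ImageRef` type, the `parse_image_id` helper, and the example hostname are hypothetical, and the sketch assumes an ImageID of the form `[scheme://]host/namespace/repo@sha256:<hex>`.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    """Hypothetical container for the parts of a digest-pinned image reference."""
    host: str
    namespace: str
    repo: str
    digest: str


def parse_image_id(image_id: str) -> ImageRef:
    """Split an image ID into registry host, namespace, repository and digest."""
    # Some runtimes prefix the reference with a scheme such as
    # "docker-pullable://"; drop it if present.
    if "://" in image_id:
        image_id = image_id.split("://", 1)[1]

    # Everything after "@" is the content digest.
    name, sep, digest = image_id.partition("@")
    if not sep or not digest:
        raise ValueError(f"no digest in image ID: {image_id!r}")

    # The first path segment is the registry host; the rest is namespace/repo.
    host, _, path = name.partition("/")
    namespace, _, repo = path.partition("/")
    if not (host and namespace and repo):
        raise ValueError(f"expected host/namespace/repo in: {name!r}")

    return ImageRef(host, namespace, repo, digest)


if __name__ == "__main__":
    # quay.example.com is a placeholder, not a host from this thread.
    print(parse_image_id("docker-pullable://quay.example.com/myorg/myapp@sha256:abc123"))
```

The host extracted this way is what a discovery request would then be sent to, so an on-premise Quay hostname should appear here rather than quay.io.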

The pod in which the CSO is running needs access to the endpoint mentioned above. One way to check would be to SSH into the CSO's pod and try curling that endpoint from there. My guess is that since it's working on quay.io images and not on images from your private Quay instance, the problem is with the CSO being able to reach your private registry.
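As an alternative to curl, the check can be sketched in Python using only the standard library. This is a rough sketch under assumptions: the response is assumed to be JSON with a top-level `capabilities` object, and the key `io.quay.manifest-security` is an assumption that should be verified against what your own Quay instance returns. The function names are hypothetical.

```python
import json
import urllib.request

WELL_KNOWN_PATH = "/.well-known/app-capabilities"

# Assumed capability key; confirm it against your registry's actual response.
MANIFEST_SECURITY = "io.quay.manifest-security"


def capabilities_url(host: str) -> str:
    """Build the discovery URL for a Quay host."""
    return f"https://{host}{WELL_KNOWN_PATH}"


def fetch_capabilities(host: str, timeout: float = 5.0) -> dict:
    """Fetch and decode the discovery document (requires network access)."""
    with urllib.request.urlopen(capabilities_url(host), timeout=timeout) as resp:
        return json.load(resp)


def supports_manifest_security(doc: dict, key: str = MANIFEST_SECURITY) -> bool:
    """Return True if the discovery document lists the given capability key."""
    return key in doc.get("capabilities", {})


if __name__ == "__main__":
    import sys

    # Usage: python check_caps.py <quay-host>
    doc = fetch_capabilities(sys.argv[1])
    print("capabilities:", sorted(doc.get("capabilities", {})))
    print("manifest security advertised:", supports_manifest_security(doc))
```

Running this from a pod that shares the CSO's network path shows both whether the endpoint is reachable and which capabilities the registry advertises. A reachable registry that does not advertise a manifest security capability would be consistent with the "No manifest security capabilities" error, although the thread does not confirm that as the cause.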

gorantornqvist commented 4 years ago

Sorry for hijacking, but I am having the same issue. I looked at the "Example config" and it is a bit unclear how and where this should be configured.

I would like to get results from both quay.io and my local quay registry if possible ...

aetomala commented 4 years ago

@kleesc I SSH'ed into the CSO pod and was able to ping and wget the host https://quay-enterprise-quay-enterprise.quay-enterprise-stg-7d4bdc08e7ddc90fa89b373d95c240eb-0001.us-east.containers.appdomain.cloud. I then deleted all pods and added a bad pod from my private registry, and no ImageManifestVuln was created or reported. See the errors below, and notice that they say my registry does not support that capability.

However, if I delete all of the pods and then deploy your example pod, the CSO creates an ImageManifestVuln (remember, this is exactly the same image as the one in your example, only hosted in a different registry). If I then deploy the image from my registry, after the one from your example has been deployed, the CSO correctly matches the SHA and annotates the vulnerability record with my newly created pod (from my repo).

These are the logs when I deploy only the image from the private repo:

level=debug msg="Pod added" key=default/badpod
E0131 16:49:58.835722       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
E0131 16:49:58.841706       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
level=debug msg="Pod updated" key=default/badpod
E0131 16:49:58.842855       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
E0131 16:49:58.851941       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
level=debug msg="Pod updated" key=default/badpod
E0131 16:49:58.872767       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
E0131 16:49:58.892308       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
E0131 16:49:59.052634       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
E0131 16:49:59.373750       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
level=debug msg="Pod updated" key=default/badpod
E0131 16:49:59.723908       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
E0131 16:50:00.014145       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
level=debug msg="Pod updated" key=default/badpod
level=info msg="Garbage collecting unreferenced ImageManifestVulns" key=default/badpod
>>>>>level=error msg="Failed to sync layer data" key=default/badpod err="No manifest security capabilities"
level=info msg="Garbage collecting unreferenced ImageManifestVulns" key=default/badpod
>>>>>level=error msg="Failed to sync layer data" key=default/badpod err="No manifest security capabilities"
level=info msg="Removing deleted pod from ImageManifestVulns" key=default/badpodtest
level=info msg="Garbage collecting unreferenced ImageManifestVulns" key=default/badpodtest

These are the logs when I deploy the images in the sequence described above:

level=debug msg="Pod added" key=default/high
E0131 19:24:12.111854       1 labeller.go:191] default/high failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/high
E0131 19:24:12.117057       1 labeller.go:191] default/high failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/high
level=debug msg="Pod updated" key=default/high
E0131 19:24:12.121607       1 labeller.go:191] default/high failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/high
E0131 19:24:12.127339       1 labeller.go:191] default/high failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/high
level=debug msg="Pod updated" key=default/high
E0131 19:24:12.147678       1 labeller.go:191] default/high failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/high
E0131 19:24:12.167683       1 labeller.go:191] default/high failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/high
E0131 19:24:12.328715       1 labeller.go:191] default/high failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/high
E0131 19:24:12.648973       1 labeller.go:191] default/high failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/high
level=debug msg="Pod updated" key=default/high
E0131 19:24:12.985389       1 labeller.go:191] default/high failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/high
E0131 19:24:13.289254       1 labeller.go:191] default/high failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/high
level=debug msg="Pod updated" key=default/high
level=info msg="Garbage collecting unreferenced ImageManifestVulns" key=default/high
level=info msg="Created ImageManifestVuln" manifestKey=default/sha256.ca908f415a15fdba408f82537d295350772afa985112ee62db6709fea994a682 key=default/high
level=debug msg="ImageManifestVuln added" key=default/sha256.ca908f415a15fdba408f82537d295350772afa985112ee62db6709fea994a682
level=debug msg="ImageManifestVuln updated" key=default/sha256.ca908f415a15fdba408f82537d295350772afa985112ee62db6709fea994a682
level=info msg="Garbage collecting unreferenced ImageManifestVulns" key=default/high
level=debug msg="Pod added" key=default/badpod
E0131 19:26:31.150294       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
E0131 19:26:31.155528       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
level=debug msg="Pod updated" key=default/badpod
E0131 19:26:31.162084       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
E0131 19:26:31.165837       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
level=debug msg="Pod updated" key=default/badpod
E0131 19:26:31.189909       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
E0131 19:26:31.206051       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
E0131 19:26:31.366321       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
E0131 19:26:31.686625       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
level=debug msg="Pod updated" key=default/badpod
E0131 19:26:32.070128       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
E0131 19:26:32.330925       1 labeller.go:191] default/badpod failed with : &{%!w(string=Pod phase not running: Pending)}
level=info msg="Requeued item" key=default/badpod
level=debug msg="Pod updated" key=default/badpod
level=info msg="Garbage collecting unreferenced ImageManifestVulns" key=default/badpod
level=debug msg="ImageManifestVuln updated" key=default/sha256.ca908f415a15fdba408f82537d295350772afa985112ee62db6709fea994a682
level=debug msg="ImageManifestVuln updated" key=default/sha256.ca908f415a15fdba408f82537d295350772afa985112ee62db6709fea994a682
level=info msg="Garbage collecting unreferenced ImageManifestVulns" key=default/badpod
level=debug msg="Pod updated" key=default/badpod
level=debug msg="Pod updated" key=openshift-console/downloads-75b97dcb56-h8sr4
level=info msg="Garbage collecting unreferenced ImageManifestVulns" key=default/badpod
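The behavior in these logs, where the pod from the private registry is attached to a record that was created for the quay.io copy of the same image, can be modeled with a small Python sketch. This is a toy model of what the logs suggest (records keyed by image digest rather than by registry), not the CSO's actual code. The `VulnIndex` class and its methods are hypothetical.

```python
from collections import defaultdict


def manifest_name(digest: str) -> str:
    """Turn 'sha256:<hex>' into 'sha256.<hex>', the key form seen in the logs."""
    return digest.replace(":", ".", 1)


class VulnIndex:
    """Toy index of vulnerability records keyed by manifest digest only."""

    def __init__(self) -> None:
        self._pods = defaultdict(set)

    def create(self, digest: str, pod_key: str) -> None:
        """Create a record after a successful scan lookup and link the pod."""
        self._pods[manifest_name(digest)].add(pod_key)

    def attach(self, digest: str, pod_key: str) -> bool:
        """Link a pod to an existing record; return False if none exists."""
        name = manifest_name(digest)
        if name not in self._pods:
            return False
        self._pods[name].add(pod_key)
        return True

    def pods(self, digest: str) -> set:
        """Return the pods currently linked to the record for this digest."""
        return set(self._pods.get(manifest_name(digest), set()))


if __name__ == "__main__":
    digest = "sha256:ca908f415a15fdba408f82537d295350772afa985112ee62db6709fea994a682"
    index = VulnIndex()
    # Private-registry pod alone: no record exists yet, so nothing to attach to.
    print(index.attach(digest, "default/badpod"))
    # The quay.io pod creates the record; the private-registry pod then matches it.
    index.create(digest, "default/high")
    print(index.attach(digest, "default/badpod"))
    print(sorted(index.pods(digest)))
```

Because the key contains only the digest, the registry a pod pulled from plays no role once a record exists, which matches the sequence described in this comment.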
mransonwang commented 4 years ago

After researching the source code, I found that quay.io is hardcoded, so it is not possible to point the operator at another private registry. I look forward to an enhancement that supports on-premise Quay registries.