aquasecurity / fanal

Static Analysis Library for Containers
Apache License 2.0

Config History Empty Layer not correct #423

Open mcgrawia opened 2 years ago

mcgrawia commented 2 years ago

Hi fanal team,

I'm new to the project, but I came across a recent issue with Trivy that I traced to this library. In short, I am scanning quay.io/argoproj/argocli:v3.1.10 for vulnerabilities, and the history that fanal returns does not match the image's actual history. Specifically, the empty_layer flags differ. Tracing through fanal, the problematic line is here: https://github.com/aquasecurity/fanal/blob/d775d7b8618aa50e5b3c7904ad12730893b7b8ce/image/daemon/image.go#L156

It appears that my image has a layer with size == 0 that Docker nevertheless does not consider an empty_layer.

For example, here is the history from the image's config JSON:

  "history": [
    {
      "created": "2021-09-07T15:07:21.592405476Z",
      "created_by": "USER 8737",
      "comment": "buildkit.dockerfile.v0",
      "empty_layer": true
    },
    {
      "created": "2021-09-07T15:07:21.592405476Z",
      "created_by": "WORKDIR /home/argo",
      "comment": "buildkit.dockerfile.v0"
    },
    {
      "created": "2021-09-07T15:07:23.236899848Z",
      "created_by": "COPY hack/ssh_known_hosts /etc/ssh/ # buildkit",
      "comment": "buildkit.dockerfile.v0"
    },
    {
      "created": "2021-09-07T15:07:23.694532894Z",
      "created_by": "COPY hack/nsswitch.conf /etc/ # buildkit",
      "comment": "buildkit.dockerfile.v0"
    },
    {
      "created": "2021-09-10T18:03:19.388604664Z",
      "created_by": "COPY /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ # buildkit",
      "comment": "buildkit.dockerfile.v0"
    },
    {
      "created": "2021-09-10T18:03:19.907283987Z",
      "created_by": "COPY /go/src/github.com/argoproj/argo-workflows/dist/argo /bin/ # buildkit",
      "comment": "buildkit.dockerfile.v0"
    },
    {
      "created": "2021-09-10T18:03:19.907283987Z",
      "created_by": "ENTRYPOINT [\"argo\"]",
      "comment": "buildkit.dockerfile.v0",
      "empty_layer": true
    }
  ],

and here is the history returned by fanal:

      "history": [
        {
          "created": "2021-09-07T15:07:21Z",
          "created_by": "USER 8737",
          "comment": "buildkit.dockerfile.v0",
          "empty_layer": true
        },
        {
          "created": "2021-09-07T15:07:21Z",
          "created_by": "WORKDIR /home/argo",
          "comment": "buildkit.dockerfile.v0",
          "empty_layer": true
        },
        {
          "created": "2021-09-07T15:07:23Z",
          "created_by": "COPY hack/ssh_known_hosts /etc/ssh/ # buildkit",
          "comment": "buildkit.dockerfile.v0"
        },
        {
          "created": "2021-09-07T15:07:23Z",
          "created_by": "COPY hack/nsswitch.conf /etc/ # buildkit",
          "comment": "buildkit.dockerfile.v0"
        },
        {
          "created": "2021-09-10T18:03:19Z",
          "created_by": "COPY /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ # buildkit",
          "comment": "buildkit.dockerfile.v0"
        },
        {
          "created": "2021-09-10T18:03:19Z",
          "created_by": "COPY /go/src/github.com/argoproj/argo-workflows/dist/argo /bin/ # buildkit",
          "comment": "buildkit.dockerfile.v0"
        },
        {
          "created": "2021-09-10T18:03:19Z",
          "created_by": "ENTRYPOINT [\"argo\"]",
          "comment": "buildkit.dockerfile.v0",
          "empty_layer": true
        }
      ],

The difference is that fanal marks the second entry, the "WORKDIR /home/argo" command, as empty when it should not be. This causes issues downstream when trying to attach layer diff IDs back to the history entries that created them: this image has 5 diff IDs, but fanal reports only 4 non-empty history entries.
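To illustrate why a single wrong flag breaks the mapping, here is a minimal Python sketch of the downstream step. It is not fanal or Trivy code: `attach_diff_ids` and `make_history` are hypothetical helpers, the commands are abbreviated from the two histories above, and the diff IDs are placeholders rather than the image's real digests.

```python
def attach_diff_ids(history, diff_ids):
    """Pair every history entry with the diff ID of the layer it created.

    Entries flagged empty_layer get None. Hypothetical helper, not fanal code.
    """
    non_empty = sum(1 for h in history if not h.get("empty_layer", False))
    if non_empty != len(diff_ids):
        raise ValueError(
            f"{non_empty} non-empty history entries but {len(diff_ids)} diff IDs"
        )
    remaining = iter(diff_ids)
    return [
        (h["created_by"], None if h.get("empty_layer", False) else next(remaining))
        for h in history
    ]


def make_history(empty_flags):
    """Build an abbreviated history from the commands shown in this issue."""
    commands = ["USER", "WORKDIR", "COPY", "COPY", "COPY", "COPY", "ENTRYPOINT"]
    return [
        {"created_by": cmd, "empty_layer": True} if empty else {"created_by": cmd}
        for cmd, empty in zip(commands, empty_flags)
    ]


# Placeholder diff IDs; the real image has five sha256 digests.
DIFF_IDS = [f"diff-id-{i}" for i in range(1, 6)]

# Flags as recorded in the image config vs. as returned by fanal.
config_history = make_history([True, False, False, False, False, False, True])
fanal_history = make_history([True, True, False, False, False, False, True])

paired = attach_diff_ids(config_history, DIFF_IDS)
print(paired[1])  # the WORKDIR entry receives the first diff ID

try:
    attach_diff_ids(fanal_history, DIFF_IDS)
except ValueError as err:
    print("fanal history:", err)
```

With the flags from the image config, every diff ID finds its history entry. With the flags fanal returns, the counts disagree (4 versus 5), so the pairing cannot be completed.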

I'm not sure what a proper solution would be here, but I did come across the dive project, which appears to read the image's config JSON directly and produces the correct results: https://github.com/wagoodman/dive/blob/c7d121b3d72aeaded26d5731819afaf49b686df6/dive/image/docker/config.go#L18-L45

For now, I have resorted to parsing the image tar file directly to extract the correct empty_layer flags for my use case. It would be nice to see this fixed natively!
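For anyone needing the same workaround, a rough Python sketch of that tar parsing follows. It assumes an archive in the `docker save` layout, with a top-level manifest.json whose first entry names the config file under the "Config" key. The function names and the in-memory demo archive (including the file name config.json) are made up for illustration, and this is not necessarily how my actual workaround or dive does it.

```python
import io
import json
import tarfile


def read_config_history(archive):
    """Return the "history" list from the image config inside a docker save tar."""
    manifest = json.load(archive.extractfile("manifest.json"))
    config_name = manifest[0]["Config"]  # config of the first image in the archive
    config = json.load(archive.extractfile(config_name))
    return config.get("history", [])


def non_empty_entries(history):
    """Keep only entries that produced a layer, trusting the empty_layer flag."""
    return [h for h in history if not h.get("empty_layer", False)]


def build_demo_archive():
    """Create a tiny in-memory stand-in for a docker save archive (demo only)."""
    config = {"history": [
        {"created_by": "USER 8737", "empty_layer": True},
        {"created_by": "WORKDIR /home/argo"},
    ]}
    manifest = [{"Config": "config.json", "Layers": []}]
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, obj in (("manifest.json", manifest), ("config.json", config)):
            data = json.dumps(obj).encode("utf-8")
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    buf.seek(0)
    return buf


with tarfile.open(fileobj=build_demo_archive(), mode="r") as archive:
    history = read_config_history(archive)

print([h["created_by"] for h in non_empty_entries(history)])
```

For a real image, the archive would come from something like `docker save -o image.tar <image>` and be opened with `tarfile.open("image.tar")`. The key point is that the empty_layer flag is read from the config as written, not inferred from layer size.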

Thanks for looking into it