buildpacks / pack

CLI for building apps using Cloud Native Buildpacks
https://buildpacks.io
Apache License 2.0

create-builder and package-buildpack should "zero" timestamps within layers #447

Open ekcasey opened 4 years ago

ekcasey commented 4 years ago

Description

pack create-builder and pack package-buildpack should create layers where all of the timestamps (mod, access, change) are "zeroed out" (aka set to Jan. 1 1980).

Proposed solution

Normalize times for all tar entries when creating layers. The lifecycle exporter currently does this for app image layers.
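
As an illustration of the proposal, here is a minimal sketch in Python (not pack's actual Go implementation) that rewrites an uncompressed layer tarball so that every entry carries the same modification time. The function name `normalize_layer` and the constant `NORMALIZED_TIME` are hypothetical; the timestamp value is taken from the `stat` output quoted in this thread.

```python
import io
import tarfile
from datetime import datetime, timezone

# Assumption: the normalized time is 1980-01-01 00:00:01 UTC, the value that
# appears in the `stat` output quoted in this thread.
NORMALIZED_TIME = int(datetime(1980, 1, 1, 0, 0, 1, tzinfo=timezone.utc).timestamp())


def normalize_layer(layer: bytes) -> bytes:
    """Return a copy of an uncompressed tar layer with every entry's mtime normalized."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(layer), mode="r:") as src, \
            tarfile.open(fileobj=out, mode="w", format=tarfile.PAX_FORMAT) as dst:
        for member in src:
            member.mtime = NORMALIZED_TIME
            # Drop extended records (which may include atime/ctime/mtime) copied
            # from the source. This also drops any other PAX records, e.g. xattrs.
            member.pax_headers = {}
            payload = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, payload)
    return out.getvalue()
```

Clearing `pax_headers` is a simplification: a real implementation would probably want to keep records unrelated to time.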

Additional context

This decreases the amount of storage required for builders. For example, if a new version of a builder is created containing some of the same buildpacks as a previous builder image, those buildpack layers should be reusable. Currently, immaterial timestamp differences may prevent this from happening.
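
To make the storage argument concrete, the following Python sketch builds two in-memory layers with identical file contents and compares their SHA-256 digests (the digest of the uncompressed layer tar is what Docker reports under `RootFS.Layers`). The helpers `build_layer` and `diff_id` are hypothetical names used only for this illustration.

```python
import hashlib
import io
import tarfile
from typing import Dict


def build_layer(files: Dict[str, bytes], mtime: int) -> bytes:
    """Build an uncompressed tar layer whose entries all share one mtime."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):  # fixed entry order keeps the bytes stable
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mode = 0o755
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


def diff_id(layer: bytes) -> str:
    """SHA-256 digest of the uncompressed layer bytes."""
    return "sha256:" + hashlib.sha256(layer).hexdigest()
```

Calling `build_layer` twice with the same files but different `mtime` values yields different `diff_id` results, so the layer cannot be deduplicated. Passing the same normalized value both times yields identical digests.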

natalieparellano commented 4 years ago

@zmackie we removed the roadmap/buildpackages label because we didn't see the association, but feel free to add it back if we're wrong

cc @jromero

natalieparellano commented 4 years ago

We observed some inconsistencies in how directory mod time is being set today - for example, /cnb doesn't have zeroed out mod time, whereas /layers does.

On the Windows side, Windows tar layers expect all parent directories to be explicitly created prior to the child entry being written.

Not sure how relevant this actually is for this issue, but could be worth keeping in mind.
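
The point about explicitly created parent directories can be sketched as follows, in Python rather than pack's Go. Before writing a file entry, the helper emits a directory entry for each parent that has not been written yet, all with the normalized time. The name `add_file_with_parents` is hypothetical, and whether implicit parent creation explains the /cnb observation above is not confirmed anywhere in this thread.

```python
import io
import tarfile

# Assumption: 1980-01-01 00:00:01 UTC, as seen in the thread's stat output.
NORMALIZED_TIME = 315532801


def add_file_with_parents(tar, written_dirs, path, data, mtime=NORMALIZED_TIME):
    """Write directory entries for every parent of `path`, then the file itself."""
    parts = [p for p in path.split("/") if p]
    for depth in range(1, len(parts)):
        parent = "/".join(parts[:depth])
        if parent in written_dirs:
            continue  # this directory entry is already in the archive
        entry = tarfile.TarInfo(parent)
        entry.type = tarfile.DIRTYPE
        entry.mode = 0o755
        entry.mtime = mtime
        tar.addfile(entry)
        written_dirs.add(parent)
    entry = tarfile.TarInfo("/".join(parts))
    entry.size = len(data)
    entry.mode = 0o755
    entry.mtime = mtime
    tar.addfile(entry, io.BytesIO(data))
```

The caller keeps one `written_dirs` set per archive so that shared parents are emitted only once.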

simonjjones commented 4 years ago

Current timestamps for a freshly built alpine sample builder:

$ stat layers
  File: layers
  Size: 4096            Blocks: 8          IO Block: 4096   directory
Device: 70h/112d        Inode: 2098150     Links: 2
Access: (0755/drwxr-xr-x)  Uid: ( 1000/     cnb)   Gid: ( 1001/     cnb)
Access: 2020-03-05 17:10:22.000000000
Modify: 1980-01-01 00:00:01.000000000
Change: 2020-03-05 17:10:22.000000000

$ stat cnb
  File: cnb
  Size: 4096            Blocks: 8          IO Block: 4096   directory
Device: 70h/112d        Inode: 2098257     Links: 1
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-05 17:10:23.000000000
Modify: 2020-03-05 17:10:23.000000000
Change: 2020-03-05 17:10:23.000000000

$ stat cnb/buildpacks/
  File: cnb/buildpacks/
  Size: 4096            Blocks: 8          IO Block: 4096   directory
Device: 70h/112d        Inode: 2098235     Links: 1
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-05 17:10:22.000000000
Modify: 2020-03-05 17:10:22.000000000
Change: 2020-03-05 17:10:22.000000000

$ stat cnb/lifecycle/
  File: cnb/lifecycle/
  Size: 4096            Blocks: 8          IO Block: 4096   directory
Device: 70h/112d        Inode: 2098162     Links: 2
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-05 17:10:22.000000000
Modify: 1980-01-01 00:00:01.000000000
Change: 2020-03-05 17:10:22.000000000

$ stat cnb/lifecycle/analyzer
  File: cnb/lifecycle/analyzer
  Size: 10412032        Blocks: 20336      IO Block: 4096   regular file
Device: 70h/112d        Inode: 2098166     Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-01-30 16:28:08.000000000
Modify: 2020-01-30 16:28:08.000000000
Change: 2020-03-05 17:10:22.000000000

$ stat cnb/buildpacks/io.buildpacks.samples.hello-moon/
  File: cnb/buildpacks/io.buildpacks.samples.hello-moon/
  Size: 4096            Blocks: 8          IO Block: 4096   directory
Device: 70h/112d        Inode: 2098236     Links: 3
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2020-03-05 17:10:22.000000000
Modify: 1980-01-01 00:00:01.000000000
Change: 2020-03-05 17:10:22.000000000

simonjjones commented 4 years ago

In trying to replicate the behavioral issue of:

if a new version of a builder is created containing some of the same buildpacks as a previous builder image, those buildpack layers should be reusable

I have been unable to validate that timestamps are currently preventing this from happening in the create-builder case. @ekcasey perhaps you could shed some light on a reproducible failure scenario?

Here are the steps I took attempting to reproduce this using the samples repo:

# Clean docker images
$ docker rm -f $(docker ps -aq) &>/dev/null; docker system prune --all -f && docker volume prune -f && docker images && docker volume ls && docker ps

# Build stacks
samples $ stacks/build-stack.sh stacks/alpine
...
STACK BUILT!

Stack ID: io.buildpacks.samples.stacks.alpine
Images:
    cnbs/sample-stack-base:alpine
    cnbs/sample-stack-build:alpine
    cnbs/sample-stack-run:alpine

# Initial local timestamp for buildpack file
samples $ stat -x buildpacks/kotlin-gradle/bin/build
  File: "buildpacks/kotlin-gradle/bin/build"
  Size: 2646         FileType: Regular File
  Mode: (0755/-rwxr-xr-x)         Uid: (  501/   simon)  Gid: (   20/   staff)
Device: 1,4   Inode: 7993890    Links: 1
Access: Fri Mar  6 11:33:06 2020
Modify: Fri Mar  6 11:33:06 2020
Change: Fri Mar  6 11:33:06 2020

# Create builder for first time
samples $ pack create-builder cnbs/sample-builder:alpine --builder-config builders/alpine/builder.toml
alpine: Pulling from cnbs/sample-stack-build
4167d3e14976: Already exists
3314530d38b7: Pull complete
0731e2ac902d: Pull complete
2ccfb15e7e1b: Pull complete
6deb1b558de7: Pull complete
Digest: sha256:c1dbe1c23978320f7f0e31f12c636e08c7e77db1a997685da88495f457461691
Status: Downloaded newer image for cnbs/sample-stack-build:alpine
hello-universe: Pulling from cnbs/sample-package
a02b72d7691d: Pull complete
8ea996e71670: Pull complete
a76da803ecb4: Pull complete
Digest: sha256:0c06317729feffd0be9c70dba4b66e3f02cdbc81959ecbff674eae00520814bd
Status: Downloaded newer image for cnbs/sample-package:hello-universe
Successfully created builder image cnbs/sample-builder:alpine
Tip: Run pack build <image-name> --builder cnbs/sample-builder:alpine to use this builder

# Check timestamps for buildpack content in builder image: Access & Modify zeroed, Change time consistent with create-builder command call
$ docker run -it cnbs/sample-builder:alpine stat /cnb/buildpacks/io.buildpacks.samples.kotlin-gradle/0.0.1/bin/build
  File: /cnb/buildpacks/io.buildpacks.samples.kotlin-gradle/0.0.1/bin/build
  Size: 2646            Blocks: 8          IO Block: 4096   regular file
Device: 70h/112d        Inode: 264381      Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 1980-01-01 00:00:01.000000000
Modify: 1980-01-01 00:00:01.000000000
Change: 2020-03-06 18:52:32.000000000

# Store layer data for comparison
$ docker inspect cnbs/sample-builder:alpine | jq -r '.[0].RootFS.Layers' > /tmp/builder-layers.orig

# Update all timestamps for a buildpack file, without changing any content
samples $ touch buildpacks/kotlin-gradle/bin/build
samples $ stat -x buildpacks/kotlin-gradle/bin/build
  File: "buildpacks/kotlin-gradle/bin/build"
  Size: 2646         FileType: Regular File
  Mode: (0755/-rwxr-xr-x)         Uid: (  501/   simon)  Gid: (   20/   staff)
Device: 1,4   Inode: 7993890    Links: 1
Access: Fri Mar  6 13:56:13 2020
Modify: Fri Mar  6 13:56:13 2020
Change: Fri Mar  6 13:56:13 2020

# Recreate builder
samples $ pack create-builder cnbs/sample-builder:alpine --builder-config builders/alpine/builder.toml
alpine: Pulling from cnbs/sample-stack-build
Digest: sha256:c1dbe1c23978320f7f0e31f12c636e08c7e77db1a997685da88495f457461691
Status: Image is up to date for cnbs/sample-stack-build:alpine
hello-universe: Pulling from cnbs/sample-package
Digest: sha256:0c06317729feffd0be9c70dba4b66e3f02cdbc81959ecbff674eae00520814bd
Status: Image is up to date for cnbs/sample-package:hello-universe
Successfully created builder image cnbs/sample-builder:alpine
Tip: Run pack build <image-name> --builder cnbs/sample-builder:alpine to use this builder

# Check new timestamps for file with timestamps modified locally - note the Access & Modify times are already zeroed and the Change time is consistent with the original create-builder time
$ docker run -it cnbs/sample-builder:alpine stat /cnb/buildpacks/io.buildpacks.samples.kotlin-gradle/0.0.1/bin/build
  File: /cnb/buildpacks/io.buildpacks.samples.kotlin-gradle/0.0.1/bin/build
  Size: 2646            Blocks: 8          IO Block: 4096   regular file
Device: 70h/112d        Inode: 264381      Links: 1
Access: (0755/-rwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 1980-01-01 00:00:01.000000000
Modify: 1980-01-01 00:00:01.000000000
Change: 2020-03-06 18:52:32.000000000

# Check for changes in docker RootFS Layers
$ docker inspect cnbs/sample-builder:alpine | jq -r '.[0].RootFS.Layers' > /tmp/builder-layers.new
$ diff /tmp/builder-layers.orig /tmp/builder-layers.new
$ echo $?
0

natalieparellano commented 4 years ago

To investigate: does the behavior demonstrated above replicate on a linux host?

ekcasey commented 4 years ago

I believe @nebhale ran into this in the wild. Maybe this is out of date? Or perhaps Ben can describe the situation where he observed the issue?

I just tried it quickly and the layer blobs seemed to have reproducible digests. However, the order of the buildpack layers on the builder wasn't reproducible, which is a slightly different (but important) problem affecting full reproducibility.
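
The layer-order problem can be illustrated with a small Python sketch. An image config records layer diff IDs as an ordered list, so the same set of layers in a different order produces a different config digest. The config below is heavily simplified, the digests are placeholders, and sorting by buildpack ID in `ordered_diff_ids` is only one hypothetical way to get a stable order, not a description of what pack does.

```python
import hashlib
import json


def config_digest(diff_ids):
    """Digest of a heavily simplified image config that records layer order."""
    config = {"rootfs": {"type": "layers", "diff_ids": list(diff_ids)}}
    blob = json.dumps(config, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def ordered_diff_ids(layers_by_buildpack):
    """Return diff IDs in a stable order by sorting on the buildpack ID."""
    return [layers_by_buildpack[bp_id] for bp_id in sorted(layers_by_buildpack)]
```

Two runs that add the same buildpack layers in different orders give different `config_digest` values, while passing both through `ordered_diff_ids` first gives the same value.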

natalieparellano commented 4 years ago

It's worth noting that the "change" time for /cnb/buildpacks/io.buildpacks.samples.kotlin-gradle/0.0.1/bin/build was the same in both builders (so it's unsurprising that the layer IDs are the same). Does this imply that the second call to create-builder reused the existing buildpack layer? Looking in the pack code to see how this might occur...

ekcasey commented 4 years ago

My guess is that the change time is not encoded in the archive at all and instead gets set when the docker daemon extracts the layer (and that extracted layer overlay is reused between containers).
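
This guess is consistent with the basic tar header layout, which has a field for modification time only. The Python sketch below writes a single entry in ustar format and reads it back; the only timestamp that survives is `mtime`, and no extended records are present. The function name `roundtrip_times` is hypothetical. Extended header formats such as PAX can carry additional time records (for example atime), and this thread does not establish whether pack's layers include any.

```python
import io
import tarfile


def roundtrip_times(mtime: int):
    """Write one ustar entry, read it back, and report which time data survives."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        info = tarfile.TarInfo("bin/build")
        info.mtime = mtime
        tar.addfile(info)  # empty file; only the header matters here
    buf.seek(0)
    with tarfile.open(fileobj=buf, mode="r:") as tar:
        member = tar.getmembers()[0]
        # A ustar header has no access-time or change-time field, so the only
        # timestamp available is mtime; pax_headers holds any extended records.
        return member.mtime, dict(member.pax_headers)
```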

zmackie commented 4 years ago

WG consensus was: verify that these commands don't cause layers to be reshipped across the network, which is ultimately the aim. This issue may already be done, but we need to double-check.
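
One way to start that verification is to compare the layer lists of two builder images, as in the reproduction steps earlier in this thread. The Python sketch below assumes each argument is the full JSON output of `docker inspect <image>` and reads `RootFS.Layers` from it; the function name `compare_layers` is hypothetical. Note that `RootFS.Layers` holds digests of the uncompressed layers, whereas a registry deduplicates on the digests of the compressed blobs, so matching diff IDs are a necessary check rather than a complete one.

```python
import json


def compare_layers(orig_inspect: str, new_inspect: str):
    """Split the new image's layers into those shared with the old image and the rest."""
    orig = json.loads(orig_inspect)[0]["RootFS"]["Layers"]
    new = json.loads(new_inspect)[0]["RootFS"]["Layers"]
    known = set(orig)
    reused = [layer for layer in new if layer in known]
    rebuilt = [layer for layer in new if layer not in known]
    return reused, rebuilt
```

An empty `rebuilt` list means every layer of the new builder already existed in the old one.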

simonjjones commented 4 years ago

Waiting on https://github.com/buildpacks/pack/issues/606

schneems commented 2 weeks ago

#606 was marked as a blocker but is now closed. Do people still actively want this, and is anyone able to work on it? What are the current blockers or next steps?

natalieparellano commented 1 week ago

I think the consensus is that we probably already do this. What's needed is to verify that that's true.