Fix the Dockerfile format once and for all to allow control over layers

By default, Dockerfiles do something comparable to auto-commit: every command creates a layer.

I believe the purpose of this is caching, for cases where images have layers in common. I assume it uses content hashing to deduplicate layers, much as git does. This is great when it works, but it's opportunistic and naive to the point that it can have the opposite effect of what is intended.

This causes a lot of problems. Not only do you get layers that miss the cache and have no real reason to exist, a more complicated history, and so on; you also have to do complex things to stop ephemeral data from polluting the image, not only from a distribution and unrolling perspective but also in terms of storage on the device used to build the images.

This isn't really something where an automatic default is going to be useful. Ultimately the author of a given Dockerfile is going to best know how things should be grouped into layers. You see this with people merging multiple commands using &&.
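The && idiom mentioned above typically looks something like the following. This is only a minimal sketch: the `debian:bookworm-slim` base image and the `curl` package are assumptions chosen for illustration and are not taken from any particular project.

```dockerfile
# Assumed base image, for illustration only.
FROM debian:bookworm-slim

# A single RUN instruction produces a single layer, so the package index
# downloaded by "apt-get update" is removed before the layer is committed
# and never ends up stored in the image.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
```

Written as three separate RUN instructions, the same commands would leave the package index in the first layer even though the last one deletes it.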

There has been a squash option (the experimental `--squash` flag to `docker build`), but it seems to be absent from recent versions of Docker. It also has the opposite problem: the default is like having auto-commit, while squash is like making everything a single commit.

There is the claim that multi-stage builds solve the layers problem, but I don't see how they do that in any useful or convenient way. They solve some specific sets of problems but ignore others. They seem to be syntactic sugar for something I personally don't have a great problem with: making intermediate images, packaging, exporting, etc. It's a small convenience sometimes, but it also introduces new problems.

In the immediate sense, it's not very intuitive. When I first looked at the configuration it wasn't clear what I was seeing. At first it looked like multiple inheritance, perhaps applying the last stage of one build onto another. What it actually does is this: every FROM up to the last one creates a sub-image, and you can COPY things out of that sub-image's final layer. That the last FROM is the one that becomes the resulting image is implicit and unclear, a bit like vhost ordering in Apache httpd. Once you figure that out, you realise it's quite limited.

If I run:

FROM base:latest AS self
ADD ./1G /1G
RUN rm /1G
FROM self

Then I'm still going to see 1GB of data on my file system in the layers. Please imagine that there's a bit more to this example and that there's a good reason for having to do something like this (especially when using public images you're sometimes constrained by the way they have done things).
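For comparison, here is a sketch of what a multi-stage build can do in this situation. It assumes a hypothetical `process` command that consumes /1G and writes its result to a hypothetical /out directory; neither exists in the example above.

```dockerfile
FROM base:latest AS build
ADD ./1G /1G
# Hypothetical step: "process" and /out are placeholders for whatever
# actually needs the 1GB file.
RUN process /1G /out && rm /1G

# Start the final image from the base again instead of from the build stage.
FROM base:latest
# Only /out is carried over; the layer containing /1G belongs to the
# "build" stage and is not part of the final image.
COPY --from=build /out /out
```

This only helps when the result can be enumerated as a set of paths to copy, and the 1GB layer still exists in the build cache on the machine that builds the image.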

It would make a lot more sense to give people the flexibility they need with:

START LAYER
    ADD a
    RUN x
    DEL a
END LAYER

It's pretty arbitrary to decide that a single command should be the atomic unit, especially when you see people accomplishing the same thing with constructs such as 'RUN a && b && c', or when you ask why 'ADD a;ADD b' is two layers but 'ADD ab.tgz' is one layer. Note that with explicit layers you can still have a sublayer cache, but with more control granted to the user, such as the ability to easily purge the cache.

Automating things like this can always leave you with an edge case where the user cannot do anything because there is no workaround. One example of such automation is to remove from the layers everything that is not in the final layer; however, you might still have a lot of layers if they don't resolve to empty. Additionally, two images with identical initial layers might not have the same levels of file presence in subsequent differing layers.
