containers / buildah

A tool that facilitates building OCI images.
https://buildah.io
Apache License 2.0

podman build: rootless build takes up a lot of disk space #1040

Closed dustymabe closed 5 years ago

dustymabe commented 6 years ago

podman-0.9.4-1.dev.gitaf791f3.fc30.x86_64 in Fedora rawhide.

Description: I started a rawhide VM, and the first operation I ran on it was podman build. I am seeing podman build take up a lot of disk space.

[vagrant@vanilla-rawhide-atomic coreos-assembler]$ podman images
REPOSITORY                          TAG      IMAGE ID       CREATED         SIZE
localhost/ca                        latest   30eacefcd9ea   3 minutes ago   2.73GB
registry.fedoraproject.org/fedora   28       f6f9d2ff8a74   9 days ago      264MB
$ sudo du -sh /home/vagrant/.local/share/containers/storage/
20G     /home/vagrant/.local/share/containers/storage/

podman images --all shows me more:

[vagrant@vanilla-rawhide-atomic coreos-assembler]$ podman images --all
REPOSITORY                          TAG      IMAGE ID       CREATED          SIZE
localhost/ca                        latest   30eacefcd9ea   11 minutes ago   2.73GB
<none>                              <none>   d5a351eaf496   13 minutes ago   2.73GB
<none>                              <none>   27eaa111740d   15 minutes ago   2.73GB
<none>                              <none>   034aa92c8949   17 minutes ago   2.73GB
<none>                              <none>   361a1e216d53   19 minutes ago   2.73GB
<none>                              <none>   765c059be68a   21 minutes ago   2.73GB
<none>                              <none>   847b19169eb3   24 minutes ago   1.93GB
<none>                              <none>   905da9620bd1   27 minutes ago   1.88GB
<none>                              <none>   aec3012fa3ef   33 minutes ago   264MB
<none>                              <none>   ee1782e0b816   33 minutes ago   264MB
<none>                              <none>   b9270c07d6fd   33 minutes ago   264MB
registry.fedoraproject.org/fedora   28       f6f9d2ff8a74   9 days ago       264MB

While the build was running, the used space ballooned to about 37G before coming back down to 20G.

It's possible this is not a bug, but it seems like something is off here. Maybe the layers of the build aren't properly sharing disk space?
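
That suspicion can be illustrated outside of podman entirely. A minimal shell sketch (throwaway temp directory, made-up file sizes, no podman involved) compares what du reports for a directory tree whose "layers" are full copies of the parent versus hardlinks to the parent's files:

```shell
# Demo: copied "layers" vs shared "layers", as measured by du.
# Everything happens in a throwaway temp directory; podman is not involved.
set -e
work=$(mktemp -d)
mkdir -p "$work/base"
head -c 1048576 /dev/zero > "$work/base/blob"   # 1 MiB "base layer" file

# copy-style (what vfs does): the child layer is a full copy of the parent
cp -r "$work/base" "$work/layer-copy"

# shared-style: the child layer hardlinks the parent's files
mkdir "$work/layer-link"
ln "$work/base/blob" "$work/layer-link/blob"

# du deduplicates hardlinks within a single invocation
copied=$(du -sk "$work/base" "$work/layer-copy" | awk '{s += $1} END {print s}')
shared=$(du -sk "$work/base" "$work/layer-link" | awk '{s += $1} END {print s}')
echo "copied: ${copied}K  shared: ${shared}K"
```

The copied total comes out roughly double the shared one, and the copy case is what happens for every layer of a build when layers don't share storage.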

Steps to reproduce the issue:

  1. rootless podman build -t ca with Dockerfile/context from: https://github.com/dustymabe/coreos-assembler/tree/7cd95023aa0d7f6ccee2e57f6006e8e9978313f8

Describe the results you received:

A lot of disk space gets used.

Describe the results you expected:

Not as much disk space gets used.

Output of podman version if reporting a podman build issue:

$ podman version
Version:       0.9.4-dev
Go Version:    go1.11
OS/Arch:       linux/amd64

TomSweeneyRedHat commented 6 years ago

@dustymabe can you try the same command but use the --layers param too? Let us know how that compares.

TomSweeneyRedHat commented 6 years ago

podman build --layers -t ca

dustymabe commented 6 years ago

@dustymabe can you try the same command but use the --layers param too? Let us know how that compares.

It was my understanding that --layers is now the default since v0.9.1; see the release notes.

TomSweeneyRedHat commented 6 years ago

@dustymabe ack, I'd forgotten that we defaulted that to true in Podman; it's still false down in Buildah.

dustymabe commented 6 years ago

This comparison isn't quite apples to apples, but I went back to an older version of podman (podman-0.8.4-1.git9f9b8cf.fc29.x86_64), switched to running as root, and after the container build finished it used only 6.2G on the filesystem.

I guess what I'm trying to say is that there is at least proof that less space can be used for the same set of content.

TomSweeneyRedHat commented 6 years ago

@dustymabe thanks for the data points. I think Podman 8.4 had layers set as false in that release. Just curious, have you built the container as root and rootless? Are you seeing the large size only in rootless mode or are the container/layer sizes comparable between the two modes?

TomSweeneyRedHat commented 6 years ago

@umohnani8 PTAL

dustymabe commented 6 years ago

I think Podman 8.4 had layers set as false in that release.

ahh right.

Just curious, have you built the container as root and rootless?

I have not. In that particular VM I was close to running out of disk space, so I put it on ice while we investigate this.

bowlofeggs commented 5 years ago

I can confirm this issue with podman-0.10.1-1.gite4a1553.fc28.x86_64. I built a simple image like this:

$ podman build --pull -t localhost/bodhi-ci/f28 -f Dockerfile-f28 .

While that was building, I watched du on ~/.local/path/to/containers (sorry, can't remember the exact path) and saw it grow hugely. The line in my Dockerfile that says RUN ln -s /usr/bin/true /usr/bin/pungi-koji caused it to grow from 3.6 GB to 5.4 GB. Ultimately, the job used 37 GB at peak, to build a container that just had a few hundred MB of RPMs in it. When the job was done, the post build clean up got it down to 18 GB, which is still huge for a few hundred MB of RPMs and a base image. The build also took 17 minutes.

For comparison, I tried the same command with sudo on the same machine and saw a vast difference. This time, of course, I watched du on /var/lib/containers. I saw it peak at about 11 GB, which is still a lot, but when the job finished it used only 1.4 GB, which seems reasonable enough for a base layer and a few hundred MB of RPMs. This build took 5 minutes.

42wim commented 5 years ago

I've hit something similar using buildah as a user; it comes down to the overlay versus vfs storage driver. Overlay can't be used as a normal user, so I guess it falls back to the vfs driver by default, which copies every layer and takes a lot of space.

Afaik https://github.com/containers/fuse-overlayfs is a WIP to fix this issue

rhatdan commented 5 years ago

@giuseppe PTAL

giuseppe commented 5 years ago

I think this all comes down to vfs being used. Each layer contains all the files of its parent, so even a simple image based on Fedora will end up duplicating the Fedora base image for each layer.
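
As a back-of-the-envelope check (sizes in MB, loosely taken from the `podman images --all` listing earlier in this thread; the exact numbers are illustrative), summing the cumulative size of every intermediate layer under vfs lands in the same ballpark as the ~20G reported above:

```shell
# vfs stores each layer as a full copy of its parent, so total on-disk
# usage approximates the SUM of all the intermediate image sizes, not
# the size of the final image alone. Sizes (MB) loosely follow the
# `podman images --all` output earlier in this thread.
sizes="264 264 264 1880 1930 2730 2730 2730 2730 2730"
total=0
for s in $sizes; do
  total=$((total + s))
done
echo "final image: 2730 MB; approximate vfs usage: ${total} MB"
```

This prints an approximate usage of 18252 MB, i.e. about 18 GB for a 2.73 GB final image, consistent with the 20G observed.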

With a recent kernel (4.18) it is possible to use fuse-overlayfs, setting these lines in ~/.config/containers/storage.conf:

[storage]
driver = "overlay"

[storage.options]
mount_program = "/usr/bin/fuse-overlayfs"

It is honored by "podman build"; it is not yet honored when using buildah directly: https://github.com/containers/buildah/issues/1113

dustymabe commented 5 years ago

Since a 4.18 kernel is in f2{7,8,9}, can we make that the default in Fedora now? I don't think anyone wants to use up 40G of disk for a simple build.

I'd also consider just disabling rootless builds in epel7 and requiring someone to change a setting in order to get the vfs (lots of disk space) behavior, as I think you'll otherwise get more bug reports than you want.

rhatdan commented 5 years ago

@giuseppe https://github.com/containers/libpod/pull/1693 to get buildah to honor storage settings.

dustymabe commented 5 years ago

@giuseppe With a recent kernel (4.18) it is possible to use fuse-overlayfs, setting these lines in ~/.config/containers/storage.conf

I'm hitting an issue with your suggestion:

[dustymabe@dhcp137-98 weechat]$ rpm -q kernel podman
kernel-4.18.16-300.fc29.x86_64
podman-0.10.1-1.gite4a1553.fc29.x86_64
[dustymabe@dhcp137-98 weechat]$ cat ~/.config/containers/storage.conf 
[storage]
driver = "overlay"

[storage.options]
mount_program = "/usr/bin/fuse-overlayfs"
[dustymabe@dhcp137-98 weechat]$ ls /usr/bin/fuse-overlayfs 
/usr/bin/fuse-overlayfs
[dustymabe@dhcp137-98 weechat]$ 
[dustymabe@dhcp137-98 weechat]$ bash build.sh 
+ podman build -t weechat --layers ./
could not get runtime: database graph driver name vfs does not match our graph driver name overlay: database configuration mismatch
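
At the time, the usual way past this mismatch was to wipe the rootless storage so the new driver starts from scratch, matching the advice later in this thread (newer podman versions have `podman system reset` for this). A sketch, operating on a stand-in directory so it is safe to run as-is; the real rootless storage root is ~/.local/share/containers:

```shell
# Sketch: clear existing vfs-backed storage so podman can start over
# with the overlay driver. STORAGE_ROOT defaults to a stand-in
# directory here; substitute the real rootless storage root
# (~/.local/share/containers) when actually applying this.
storage_root="${STORAGE_ROOT:-demo-containers}"
mkdir -p "$storage_root/storage/vfs"   # stand-in for existing vfs layers
rm -rf "$storage_root/storage"         # removes images, layers, and the database
echo "cleared $storage_root/storage"
```

Note this throws away all existing rootless images and containers, which is exactly the trade-off discussed below.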
rhatdan commented 5 years ago

I think this is a bug; you might need to wipe out your vfs storage. I am not sure why containers/storage complains about this; I think it should just support both kinds of storage at the same time.

I think it is just noticing that you have vfs storage and no overlay storage, so it is complaining.

dustymabe commented 5 years ago

I think this is a bug

Should I open a separate bug for this or track it here? @giuseppe do you mind weighing in?

giuseppe commented 5 years ago

Should I open a separate bug for this or track it here? @giuseppe do you mind weighing in?

I think it is a separate issue, as it also happens with root containers. I agree with @rhatdan that we should not block using a different storage and require the user to first wipe out the current one.

giuseppe commented 5 years ago

Looking at the code: probably the reason it is done this way is that having containers configured from different storages would be difficult and confusing. @mheon, should we instead tell the user in this case to wipe out the existing storage before the configuration can be changed?

mheon commented 5 years ago

@giuseppe The thought here was that, if you changed something significant about c/storage that would render all your existing containers useless, you should probably just blow away the Podman DB at the same time and start from scratch. Is it reasonable to ask people to do that here, or can we salvage containers/images if it's just the storage driver that changes?

rhatdan commented 5 years ago

The question is: do we want to support two different drivers at the same time? Could I want to run some containers on overlay, but others on devicemapper or vfs?

In the future, once we have support for devicemapper, I might want to run a Kata container with devicemapper while another is running with overlay.

dustymabe commented 5 years ago

So we have https://github.com/containers/libpod/pull/1726 which makes overlay the default for rootless podman. Should we do the same thing for buildah and close this ticket out?

It might also be nice to print a warning when someone does end up using vfs, noting that vfs takes up a lot of disk space.

giuseppe commented 5 years ago

buildah will inherit this change once we re-vendor libpod. I've done some tests with fuse-overlayfs and the built images "look" correct.

That said, fuse-overlayfs is a lot of additional complexity, and we need to be careful when changing the default for buildah, especially if these images end up pushed to a remote registry. I'd say vfs is still going to be a safer (but more expensive) bet.

rhatdan commented 5 years ago

@giuseppe Is this in buildah yet?

giuseppe commented 5 years ago

Both buildah and podman now default to fuse-overlayfs when possible, so I'm closing the issue.