Open reavessm opened 3 years ago
As you can see, we stick with the official tarballs, which are meant to offer an environment from which you can build and install your Gentoo Linux. I guess that's why.
AFAIK other distros slim down their Docker images by removing cruft such as unnecessary packages, man pages, etc. We could definitely apply something similar here, but it would probably require guidance from a Gentoo Developer, i.e., someone with a good understanding of the stage3 tarball structure and what is required for a working container.
You can delete unnecessary packages, like cmake, that the software doesn't need to run. Force unmerge with:
emerge -W --rage-clean <foo>
Deleting directories like /var/db is okay as users aren't expected to enter there.
rm -rf /var/db # also /var/cache/distfiles /var/tmp/portage /usr/share/{doc,man} /var/cache/binpkgs
For more details, check https://wiki.gentoo.org/wiki/Knowledge_Base:Freeing_disk_space.
> You can delete unnecessary packages, like cmake, that the software doesn't need to run. Force unmerge with:
We shouldn't remove build dependencies, because it's just going to make it harder to install things downstream.
> Deleting directories like /var/db is okay as users aren't expected to enter there.
No, it isn't, because that breaks your Gentoo installation by wiping out the Portage state stored in /var/db/pkg. I think it might make sense to remove the documentation and manpages, since they're approaching a third of the total image size. As for the other directories, they're empty:
$ podman run -it gentoo/stage3 find /var/cache/distfiles /var/tmp/portage /var/cache/binpkgs
/var/cache/distfiles
find: '/var/tmp/portage': No such file or directory
/var/cache/binpkgs
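To check which directories are actually worth trimming, the sizes can be measured inside the container before deleting anything. The following is a minimal sketch; `report_sizes` is a hypothetical helper name, not something shipped with the image, and it skips paths that don't exist (like /var/tmp/portage in the output above).

```shell
#!/bin/sh
# report_sizes DIR...: print "<size in KiB><TAB><dir>" for each existing
# directory, largest first. Missing paths are noted on stderr and skipped.
# (Hypothetical helper, introduced only for illustration.)
report_sizes() {
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            du -sk "$dir"
        else
            echo "skip: $dir (not present)" >&2
        fi
    done | sort -rn
}

# Example invocation inside the container:
# report_sizes /usr/share/doc /usr/share/man /var/cache/distfiles /var/tmp/portage
```

Sorting by size puts the biggest candidates first, which makes it easier to decide whether removing a directory is worth the trouble.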
> We shouldn't remove build dependencies, because it's just going to make it harder to install things downstream.
In a regular Gentoo system, we shouldn't. In containers, it doesn't make much sense to keep build dependencies (excluding runtime dependencies like Ruby) once you've compiled the software. Obviously, this is to achieve a tiny Docker image.
RUN emerge foo && emerge -W --rage-clean foo-dependency
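The install and the unmerge need to share one RUN step: if the cleanup happens in a later instruction, the earlier layer still carries the removed files and the image doesn't shrink. A minimal sketch of a helper for that step follows. `install_then_prune` and the `EMERGE` override are hypothetical names introduced here so the logic can be dry-run without Portage.

```shell
#!/bin/sh
# Hypothetical helper: install a package, then unmerge build-only dependencies
# in the same shell invocation (and therefore the same image layer).
# EMERGE can be overridden (e.g. EMERGE=echo) for a dry run; that override is
# an assumption made for illustration, not a Portage feature.
EMERGE=${EMERGE:-emerge}

install_then_prune() {
    pkg=$1
    shift
    # Stop here if the install itself fails.
    "$EMERGE" "$pkg" || return 1
    # -W (--deselect) drops the atoms from the world file;
    # --rage-clean unmerges immediately, without dependency checks.
    if [ "$#" -gt 0 ]; then
        "$EMERGE" -W --rage-clean "$@"
    fi
}
```

With `EMERGE=echo`, calling `install_then_prune foo foo-dependency` only prints the two emerge argument lists, which is a cheap way to check the call order before using it in a real build.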
> No, it isn't, because that breaks your Gentoo installation by wiping out the Portage state stored in /var/db/pkg
Yeah, my bad, the fat thing is /var/db/repos/gentoo. In the rare case where you want to open a shell session in the container and restore the ebuild repository, you can simply do:
emaint sync -r gentoo
At least, that's what works for me. Regarding the "empty" directories, it depends on the thing you emerge to the system; a fresh container doesn't have many things to delete.
The images, gentoo/portage and gentoo/stage3, are effectively just docker versions of the tarballs you can download from normal Gentoo distribution channels: a way to distribute the equivalent thing via docker registries.
They serve as the same building blocks for making a gentoo distro in a docker container as the tarballs do for a virtual machine or physical host.
For the end image that you want to run, say a web server like nginx, ideally it would be a single nginx binary, distroless, no gcc, no emerge, no bash, etc.
For that, you can either use a multistage build similar to the one on Arzano's page, linked from https://github.com/gentoo/gentoo-docker-images/issues/107#issuecomment-1821217314.
Or use Kubler:
Kubler is a build tool that uses Gentoo to build packages and creates a docker image containing just those packages. The final image does not have portage, emerge, or the rest of the file system; it has only the packages and whatever you explicitly created.
If you want an official slimmed gentoo docker image that still has emerge etc., but doesn't have the manpages etc., that should be a separate image like gentoo/gentoo (or something like that).
I think the gentoo/portage and gentoo/stage3 images should map to upstream tarballs. If the tarballs have the manpages, the docker images should too.
So, if the manpages should go, the tarballs should drop them first, and then the docker images won't have them either.
I don't know if this is the right place because this isn't really an issue, but more of a question.
Why do the gentoo containers have such a large image size? Currently, latest on amd64 is 287.76 MB, while the Fedora image is 58.39 MB and Ubuntu is down to 22.95 MB. I'm seeing that /usr/libexec/gcc is taking 111 MB, and I understand that it wouldn't be Gentoo without GCC, but is there any other place to trim some fat?