superlinear-ai / poetry-cookiecutter

🍪 Poetry Cookiecutter is a modern Cookiecutter template for scaffolding Python packages and apps
GNU Affero General Public License v3.0

fix: avoid exploding Docker image size #185

Closed sinopeus closed 1 year ago

sinopeus commented 1 year ago

TL;DR: Passing --no-log-init to useradd prevents the Docker image size from growing to hundreds of gigabytes when useradd is given a "large" UID or GID. This is apparently a side effect of how useradd initializes the faillog and lastlog files.

The issue is explained in more detail at https://github.com/docker/docs/issues/4754. The root cause is apparently a combination of the following:

  1. useradd by default allocates space for the faillog and lastlog for "all" users: https://unix.stackexchange.com/q/529827. If you pass it a high UID, e.g. 414053617, it will reserve space for all those 414053617 user logs, which amounts to more than 260GB.
  2. The first point wouldn't be a problem if Docker recognized the sparse file and packed it efficiently. However, there is an unresolved issue in how the Go archive/tar package (which Docker uses to package image layers) handles sparse files:

    https://github.com/golang/go/issues/13548

    Eight years unresolved and counting!
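
To make the two points above concrete, here is a rough sketch in Python (not the code Docker or useradd actually runs). It writes one record at a UID-indexed offset, which leaves a sparse file behind, and then archives it with Python's tarfile module, which also writes holes out as real zero bytes. The record size of 292 bytes, the UID of 30000, and the file names are assumptions chosen only to keep the demo cheap. The st_blocks value is assumed to be in 512-byte units, as on Linux.

```python
import os
import tarfile
import tempfile

RECORD_SIZE = 292  # assumed size in bytes of one log record; illustrative only
UID = 30_000       # hypothetical UID, kept small so the demo stays cheap


def sparse_log_demo(uid: int = UID, record_size: int = RECORD_SIZE):
    """Return (apparent_size, allocated_size, tar_size) in bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "lastlog")

        # Seek to the UID-indexed offset and write a single record.
        # Everything before that offset becomes a "hole" on filesystems
        # that support sparse files.
        with open(log_path, "wb") as f:
            f.seek(uid * record_size)
            f.write(b"\0" * record_size)

        st = os.stat(log_path)
        apparent = st.st_size           # size the file claims to have
        allocated = st.st_blocks * 512  # space actually used on disk (Linux units)

        # Archive the file without any sparse-file handling: the holes are
        # written into the tar as literal zero bytes.
        tar_path = os.path.join(tmp, "layer.tar")
        with tarfile.open(tar_path, "w") as tar:
            tar.add(log_path, arcname="lastlog")
        tar_size = os.path.getsize(tar_path)

    return apparent, allocated, tar_size


if __name__ == "__main__":
    apparent, allocated, tar_size = sparse_log_demo()
    print(f"apparent size : {apparent} bytes")
    print(f"allocated size: {allocated} bytes")
    print(f"tar size      : {tar_size} bytes")
```

On a filesystem with sparse-file support, the allocated size should come out far smaller than the apparent size, while the tar archive is at least as large as the apparent size. Scale the UID up to something like 414053617 and the same effect would produce a multi-gigabyte layer.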

Passing --no-log-init to useradd avoids allocating space for the faillog and lastlog and fixes the issue.

Finding the root cause of this bug drove me loco. Reader, enjoy :-)

lsorber commented 1 year ago

Thanks for the PR @sinopeus! I haven't had the chance yet to dig into it but it certainly looks like an intricate issue. I'll need a bit to make sure I understand what's going on before I can review this PR properly.