Squash docker images to have less layers and size

echoix commented 1 month ago

Is it possible to avoid squashing all layers into one? Pull performance will be worse, because the client has to finish downloading the whole image before it can start verifying and then extracting it. Any change, however small, will also mean that the entire contents change.

It's possible to change some Dockerfiles so that several layer operations are combined, by using multi-stage builds and copying the result of several layers in as a single layer. I don't know of other ways to avoid ending up with one big squashed layer.
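To make the pull-cost concern concrete, here is a toy sketch in Python. It assumes layers are cached by content digest, and the layer contents and sizes are made up for illustration; it is not how the Docker client is implemented.

```python
import hashlib


def digest(content: bytes) -> str:
    """Content-addressed identifier, standing in for a layer digest."""
    return hashlib.sha256(content).hexdigest()


def bytes_to_pull(local_cache: set[str], image_layers: list[bytes]) -> int:
    """Total size of the layers whose digest is not already cached locally."""
    return sum(len(layer) for layer in image_layers if digest(layer) not in local_cache)


# Hypothetical layer contents for two releases of the same image.
base = b"B" * 1000
tools_v1 = b"1" * 200
tools_v2 = b"2" * 200

# Layered image: the base layer is already cached, only the changed layer is fetched.
layered_cache = {digest(base), digest(tools_v1)}
layered_update = bytes_to_pull(layered_cache, [base, tools_v2])

# Squashed image: the single layer has a new digest, so everything is fetched again.
squashed_cache = {digest(base + tools_v1)}
squashed_update = bytes_to_pull(squashed_cache, [base + tools_v2])

print(layered_update, squashed_update)
```

In this model the layered update only fetches the changed tools layer, while the squashed update fetches the base content again as well.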

echoix commented 1 month ago

Size-wise, squashing only helps when space is wasted by deleting a file in a later layer, or by changing the same file again later. Last time I checked, we were pretty good at that and weren't wasting much: pretty much just some tiny files touched by tools, which we couldn't remove.
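A rough way to picture when squashing saves space, as a toy model in Python. The paths and sizes are hypothetical, and the small size of the deletion markers themselves is ignored.

```python
from typing import Optional

# path -> size in bytes; None marks a deletion recorded in that layer
Layer = dict[str, Optional[int]]


def layered_size(layers: list[Layer]) -> int:
    """Bytes shipped when every layer is kept: each layer stores its own files."""
    return sum(size for layer in layers for size in layer.values() if size is not None)


def squashed_size(layers: list[Layer]) -> int:
    """Bytes in the merged filesystem: later layers override or delete earlier files."""
    merged: dict[str, int] = {}
    for layer in layers:
        for path, size in layer.items():
            if size is None:
                merged.pop(path, None)  # deletion hides the earlier file
            else:
                merged[path] = size
    return sum(merged.values())


# Wasteful: a big file is added in one layer and deleted in the next.
wasteful = [{"/tmp/cache.tar": 500, "/usr/bin/tool": 100}, {"/tmp/cache.tar": None}]

# Tidy: nothing is deleted or rewritten later, so squashing gains nothing.
tidy = [{"/usr/bin/tool": 100}, {"/etc/tool.conf": 5}]

print(layered_size(wasteful), squashed_size(wasteful))
print(layered_size(tidy), squashed_size(tidy))
```

If the Dockerfiles already clean up within the same layer, they look like the `tidy` case and squashing has little size left to recover.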

nvuillam commented 1 month ago

@echoix I don't have other ideas when I see simple docker pulls fail, like this one :/

https://github.com/oxsecurity/megalinter/actions/runs/11427413336/job/31791515470

github-actions[bot] commented 1 month ago

🦙 MegaLinter status: ⚠️ WARNING

Descriptor	Linter	Files	Fixed	Errors	Elapsed time
✅ API	spectral	1		0	1.15s
⚠️ BASH	bash-exec	6		1	0.02s
✅ BASH	shellcheck	6		0	0.21s
✅ BASH	shfmt	6	0	0	0.8s
✅ COPYPASTE	jscpd	yes		no	3.77s
✅ DOCKERFILE	hadolint	128		0	14.39s
✅ JSON	jsonlint	20		0	0.19s
✅ JSON	v8r	22		0	29.89s
⚠️ MARKDOWN	markdownlint	266	0	297	34.29s
✅ MARKDOWN	markdown-table-formatter	266	0	0	154.22s
⚠️ PYTHON	bandit	212		66	3.93s
✅ PYTHON	black	212	0	0	7.23s
✅ PYTHON	flake8	212		0	2.96s
✅ PYTHON	isort	212	0	0	1.61s
✅ PYTHON	mypy	212		0	23.22s
✅ PYTHON	pylint	212		0	33.37s
✅ PYTHON	ruff	212	0	0	0.86s
✅ REPOSITORY	checkov	yes		no	49.14s
✅ REPOSITORY	git_diff	yes		no	0.74s
⚠️ REPOSITORY	grype	yes		24	14.65s
✅ REPOSITORY	secretlint	yes		no	16.14s
✅ REPOSITORY	trivy	yes		no	24.18s
✅ REPOSITORY	trivy-sbom	yes		no	0.56s
⚠️ REPOSITORY	trufflehog	yes		1	13.5s
✅ SPELL	cspell	713		0	12.34s
⚠️ SPELL	lychee	348		11	58.43s
✅ XML	xmllint	3	0	0	0.85s
✅ YAML	prettier	160	0	0	6.05s
✅ YAML	v8r	102		0	195.5s
✅ YAML	yamllint	161		0	2.13s

See detailed report in MegaLinter reports


echoix commented 1 month ago

> @echoix i don't have other ideas when I see simple docker pulls fail, like this one :/
>
> https://github.com/oxsecurity/megalinter/actions/runs/11427413336/job/31791515470

I tried it locally (on a Gitpod instance I was using for something other than MegaLinter) and I get this too, so there's something else going on.

echoix commented 1 month ago

https://stackoverflow.com/questions/47272611/docker-max-depth-exceeded?

nvuillam commented 1 month ago

In a new action, I don't see anything I could call to avoid this error; that's why I'm trying the squash :/

echoix commented 1 month ago

Ah, I see that the overlay2 storage driver has a limit of 125 layers: https://github.com/docker/for-linux/issues/414#issuecomment-438861366

But how we end up at 125 is probably where the problem is. Something recursive probably happened somewhere in the chain and is adding layers upon layers.
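One way to check how close an image is to that limit is to count the entries under `RootFS.Layers` in the JSON printed by `docker image inspect`. This is a minimal sketch in Python; the sample input is a hypothetical, heavily trimmed inspect output, and the 125 figure is the one quoted above.

```python
import json

MAX_LAYER_DEPTH = 125  # limit quoted in this thread for overlay2


def count_layers(inspect_json: str) -> int:
    """Count filesystem layers in the output of `docker image inspect <image>`."""
    images = json.loads(inspect_json)  # the command prints a JSON array
    return len(images[0]["RootFS"]["Layers"])


def headroom(inspect_json: str, limit: int = MAX_LAYER_DEPTH) -> int:
    """Number of layers that can still be added before reaching the limit."""
    return limit - count_layers(inspect_json)


# Hypothetical, trimmed inspect output with three layers.
sample = json.dumps(
    [{"RootFS": {"Type": "layers", "Layers": ["sha256:aaa", "sha256:bbb", "sha256:ccc"]}}]
)

print(count_layers(sample), headroom(sample))
```

Running this against each image in the chain (base image, flavor, worker) might show at which step the layer count jumps.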

echoix commented 1 month ago

It seems that the containerd backing store can handle these, so one possibility is to use it to debug what is causing so many layers in a worker image.

oxsecurity / megalinter

Squash docker images to have less layers and size #4170
