open-telemetry / opentelemetry-collector-releases

OpenTelemetry Collector Official Releases
https://opentelemetry.io
Apache License 2.0

Pack binary with `upx` #474

Open JamieMagee opened 9 months ago

JamieMagee commented 9 months ago

Describe the issue you're reporting

The opentelemetry-collector-contrib container image is already well optimized by using FROM scratch. Compressing the otelcontribcol binary with upx before copying it into the final container image would let us save even more.

Building locally with make docker-otelcontribcol I get the following container image:

$ docker inspect -f "{{.Size}}" docker.io/library/otelcontribcol | numfmt --to=si
341M

Compressing the otelcontribcol binary with upx --best as part of the build I get:

$ docker inspect -f "{{.Size}}" docker.io/library/otelcontribcol | numfmt --to=si
127M

That's a decrease of 214MB, or 63%. The 0.93.0 tag alone has ~60k container image downloads, so that equates to ~13TB saved overall.
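For anyone who wants to rerun the estimate with other numbers, here is a minimal sketch of the arithmetic. It uses only the figures quoted above and, like the estimate itself, assumes each download transfers the full size reported by `docker inspect`; the function name `savings` is made up for illustration.

```python
# Figures reported above (SI units, as printed by numfmt --to=si).
UNPACKED_MB = 341    # image size without upx
PACKED_MB = 127      # image size with upx --best
DOWNLOADS = 60_000   # approximate downloads of the 0.93.0 tag


def savings(before_mb: int, after_mb: int, downloads: int) -> tuple[int, float, float]:
    """Return (MB saved per download, percent saved, total TB saved)."""
    saved_mb = before_mb - after_mb
    percent = 100 * saved_mb / before_mb
    total_tb = saved_mb * downloads / 1_000_000  # 1 TB = 10^6 MB in SI units
    return saved_mb, percent, total_tb


saved_mb, percent, total_tb = savings(UNPACKED_MB, PACKED_MB, DOWNLOADS)
print(f"{saved_mb} MB saved per download ({percent:.0f}%), ~{total_tb:.0f} TB overall")
# prints "214 MB saved per download (63%), ~13 TB overall"
```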

The main downside is that this increases the build time drastically, so this could only really be used for tagged version builds.

mx-psi commented 9 months ago

Hey, transferring this to our releases repository, where the Docker images are actually built :) How does upx interact with security scanners? I have heard stories of upx getting binaries flagged by antivirus software some years ago, and I wonder if that still holds today.

JamieMagee commented 9 months ago

I ran a scan with trivy for the currently published image, and it returned no results. So it puts the image in no worse position than it is now:

$ trivy image ghcr.io/open-telemetry/opentelemetry-collector-releases/opentelemetry-collector-contrib:0.93.0
2024-02-09T11:32:22.850-0800    INFO    Need to update DB
2024-02-09T11:32:22.850-0800    INFO    DB Repository: ghcr.io/aquasecurity/trivy-db
2024-02-09T11:32:22.850-0800    INFO    Downloading DB...
42.80 MiB / 42.80 MiB [---------------------------------------------------------------------] 100.00% 32.44 MiB p/s 1.5s
2024-02-09T11:32:25.262-0800    INFO    Vulnerability scanning is enabled
2024-02-09T11:32:25.262-0800    INFO    Secret scanning is enabled
2024-02-09T11:32:25.262-0800    INFO    If your scanning is slow, please try '--scanners vuln' to disable secret scanning
2024-02-09T11:32:25.262-0800    INFO    Please see also https://aquasecurity.github.io/trivy/dev/docs/scanner/secret/#recommendation for faster secret detection
2024-02-09T11:32:28.117-0800    INFO    Number of language-specific files: 0

As for antivirus false positives, there is a pinned issue about them in the upx repo: https://github.com/upx/upx/issues/437. It is a risk, but 9 false positives over 3+ years seems relatively low.

mx-psi commented 9 months ago

@JamieMagee Your test shows that it does not introduce any false positives (which sounded unlikely anyway). What I am wondering is whether it introduces false negatives, that is, whether it obfuscates the binary in a way that keeps trivy and friends from detecting real issues.

cc @open-telemetry/sig-security-maintainers
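One way to probe for false negatives would be to scan both the unpacked and the upx-packed image and diff the findings. The following is only a sketch under assumptions: it expects two reports produced with `trivy image --format json --output <file> <image>`, relies on the `Results[].Vulnerabilities[].VulnerabilityID` fields of trivy's JSON report, and uses hypothetical file names and helper names.

```python
import json
import sys


def vuln_ids(report: dict) -> set[str]:
    """Collect every vulnerability ID found in a trivy JSON report."""
    ids = set()
    for result in report.get("Results") or []:
        for vuln in result.get("Vulnerabilities") or []:
            ids.add(vuln["VulnerabilityID"])
    return ids


def missed_after_packing(unpacked: dict, packed: dict) -> set[str]:
    """IDs reported for the unpacked image but absent for the packed one."""
    return vuln_ids(unpacked) - vuln_ids(packed)


# Hypothetical usage: python compare.py unpacked.json packed.json
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f_unpacked, open(sys.argv[2]) as f_packed:
        missed = missed_after_packing(json.load(f_unpacked), json.load(f_packed))
    for vuln_id in sorted(missed):
        print(f"only detected without upx: {vuln_id}")
    sys.exit(1 if missed else 0)
```

An empty diff on an image with no findings says little, so this is more informative when run against a build that is known to contain a flagged dependency.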

codeboten commented 9 months ago

@JamieMagee @mx-psi thanks for the suggestion! I've added this item to the security SIG agenda for this week.

jpkrohling commented 9 months ago

We used to have this in the first versions we released using goreleaser, but it caused problems with the binaries for Darwin. I can't find the issue right now to have a reference, but if we do run upx on the binaries, we should make sure the final executables are tested before releasing them.
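A rough sketch of what such a pre-release check could look like, assuming `upx` is on the PATH and that the collector binary exits successfully when run with `--version`. `upx -t` is upx's own integrity test of a packed file; the `smoke_test` helper and the way it is invoked are hypothetical. Note that the second check only works for binaries built for the platform the script runs on, so Darwin binaries would need a Darwin runner.

```python
import subprocess
import sys


def smoke_test(binary: str, run=subprocess.run) -> bool:
    """Return True if the packed binary passes upx's self-test and starts up."""
    checks = [
        ["upx", "-t", binary],   # upx's built-in integrity test of the packed file
        [binary, "--version"],   # assumption: the binary accepts --version
    ]
    for cmd in checks:
        result = run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"FAILED: {' '.join(cmd)}\n{result.stderr}", file=sys.stderr)
            return False
    return True


# Hypothetical usage: python smoke_test.py ./otelcontribcol
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(0 if smoke_test(sys.argv[1]) else 1)
```

The `run` parameter exists only so the check can be exercised without upx installed; in a release pipeline the default `subprocess.run` would be used.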