ministryofjustice / cloud-platform

Documentation on the MoJ cloud platform

Remove all alpine linux images in favour of debian:bookworm-slim #5543

Open jaskaransarkaria opened 5 months ago

jaskaransarkaria commented 5 months ago

Background

Alpine Linux introduces issues that can break our workflows. We have already been stung by this: builds in our CLI have broken at random, and recently, while developing a jq command, we found it behaved one way on every non-Alpine OS and a different way on Alpine.

By using Alpine, you're getting "free" chaos engineering for your cluster.

Some of it stems from how musl (and therefore also Alpine) handles DNS (it's always DNS). More specifically, musl (by design) doesn't support DNS-over-TCP. Usually you would not notice this difference, because most of the time a single UDP packet (512 bytes) is enough to resolve hostnames... until it isn't enough, and your application (running on Kubernetes) that previously worked completely fine for months suddenly starts throwing "Unknown Host" exceptions for one particular (very critical) hostname. The worst part is that this can manifest randomly, any time some external network change causes the resolution of a particular domain to require more than the 512 bytes available in a single UDP packet.

ref
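
One way to see whether a given hostname is at risk is to query it over plain UDP without EDNS and look for the truncation (`tc`) flag in the reply. The sketch below is illustrative only: the function name `dig_reply_truncated` and the hostname `some.critical.hostname` are made up for this example, and it assumes `dig` from BIND is installed and prints its usual `;; flags: ...;` header line.

```shell
# Hypothetical helper (not from the issue): reads `dig` output on stdin and
# succeeds (exit 0) when the response header carries the "tc" flag, meaning
# the answer did not fit in the UDP reply and a TCP retry would be needed.
dig_reply_truncated() {
  grep -qE '^;; flags:[^;]* tc[ ;]'
}

# Usage sketch, assuming dig is available:
#   +noedns  keep the classic 512-byte UDP limit (no larger EDNS buffer)
#   +ignore  do not retry over TCP, so the truncated header stays visible
# if dig +noedns +ignore some.critical.hostname | dig_reply_truncated; then
#   echo "answer exceeds 512 bytes over UDP"
# fi
```

A hostname that reports truncation here is one that a resolver without TCP fallback could fail to resolve.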

👇🏽 Dockerfiles using Alpine Linux

jaskaran 14:35:42 repos →  find cloud-platform-* -type f -name \* | xargs -n 1 | xargs -I % grep -l alpine %
cloud-platform-environments/cmd/delete-oldsnapshots/Dockerfile
cloud-platform-environments/cmd/compare-namespace/Dockerfile
cloud-platform-environments/cmd/check-terraform-modules-are-latest/Dockerfile
cloud-platform-environments/cmd/push-terraform-module-version/Dockerfile
cloud-platform-go-get-module/Dockerfile
cloud-platform-hammer-bot/slackbot/Dockerfile
cloud-platform-hammer-bot/Dockerfile
cloud-platform-how-out-of-date-are-we/Dockerfile_go
cloud-platform-how-out-of-date-are-we/Dockerfile
cloud-platform-how-out-of-date-are-we/dashboard-reporter/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/documentation/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/helm-releases/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/namespace-usage/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/terraform-modules/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/namespace-costs/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/orphaned-aws-resources/Dockerfile
cloud-platform-how-out-of-date-are-we/reports/orphaned-terraform-statefiles/Dockerfile
cloud-platform-infrastructure/test/docker/curl-jq.Dockerfile
cloud-platform-kuberhealthy-checks/cmd/namespace-check/Dockerfile
cloud-platform-label-pods/Dockerfile
cloud-platform-tools-image/Dockerfile
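
The `grep -l alpine` search above matches any mention of the word, including comments and `RUN` lines. A narrower sketch that only reports Dockerfiles whose `FROM` line references an Alpine-based image might look like the following. The function name `list_alpine_dockerfiles` is hypothetical, and the `--include` option assumes GNU or BSD grep.

```shell
# Hypothetical helper (not from the issue): list files whose name contains
# "Dockerfile" and whose FROM line mentions alpine (e.g. "alpine:3.18" or
# "golang:1.20-alpine"), searching recursively under the given directory
# (defaults to the current directory). Output is sorted for stable diffs.
list_alpine_dockerfiles() {
  grep -rliE --include='*Dockerfile*' \
    '^[[:space:]]*FROM[[:space:]].*alpine' "${1:-.}" | sort
}

# Usage sketch, run from the directory holding the checked-out repos:
# for repo in cloud-platform-*; do list_alpine_dockerfiles "$repo"; done
```

Matching on the file name pattern `*Dockerfile*` also picks up variants such as `Dockerfile_go` and `curl-jq.Dockerfile` from the list above.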

Definition of done


Matt-Alinosn commented 4 months ago

Still to discuss the way forward. Marked as blocked for the moment.