green-code-initiative / ecoCode-challenge

Embark on the hackathon series for improving ecoCode

[Hackathon 2024][Gadolinium][Docker] Use lightweight base image #112

Open rducasse opened 1 month ago

Rule title

Use lightweight base image

Language and platform

Docker

Rule description

When creating Docker images, developers may start from base images that include software components or libraries the application does not need. This leads to bloated images that are larger and consume more resources. For instance, a base image such as Ubuntu or CentOS comes with many pre-installed packages, whether or not the application requires them, and results in heavy Docker images. These images take longer to build, consume more storage space, and slow down deployments. Developers should instead choose lightweight base images tailored to the needs of their application. For example, choosing a slim or alpine variant instead of a full-fledged Linux distribution significantly reduces image size and resource overhead. By using minimal base images, developers optimize resource utilization, improve build and start times, conserve storage space...

Non-compliant:

FROM python:3.8
# Image size: 356MB

Compliant:

FROM python:3.8-alpine
# Image size: 17MB

Non-compliant:

FROM debian:12.5
…
# Only the last FROM command impacts the final image size
FROM node:18
# Image size: 352MB

Compliant:

FROM debian:12.5
…
# Only the last FROM command impacts the final image size
FROM node:18-slim
# Image size: 65MB

Non-compliant:

FROM azul/zulu-openjdk:21
# Image size: 190MB

Compliant:

FROM azul/zulu-openjdk-distroless:21
# Image size: 153MB

Rule short description

Avoid using heavy Docker base images. Use minimal, application-specific base images, like alpine, or specialized, resource-optimized versions (e.g. python:3.8-alpine instead of python:3.8).

Rule justification

Why it matters:

Do these versions frequently exist ?:

Official documentation: https://docs.docker.com/develop/develop-images/guidelines/

Expert article: https://www.fullstack.com/labs/resources/blog/small-is-beautiful-how-container-size-impacts-deployment-and-resource-usage

Measurement (from scientific article https://assets-eu.researchsquare.com/files/rs-3276965/v1_covered_8dc408b5-6997-486a-89c8-a5c66fddf60e.pdf?c=1693976849):

Docker images with          | Original image size | Obtained image size | Image size reduced
Being lazy couch potatoes   | 1.47GB              | 1.47GB              | 0%
Optimizing the parent image | 1.47GB              | 642MB               | 56.32%
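
For reference, the "Image size reduced" column can be reproduced with a short calculation. This is only a sketch: the function name is hypothetical, and it assumes 1 GB = 1000 MB, which the article does not state.

```python
def size_reduction_percent(original_mb: float, obtained_mb: float) -> float:
    """Return the relative size reduction as a percentage of the original size."""
    return (original_mb - obtained_mb) / original_mb * 100


# Assumption: 1 GB = 1000 MB, so 1.47GB is treated as 1470 MB.
original_mb = 1.47 * 1000
print(round(size_reduction_percent(original_mb, 642), 1))  # prints 56.3
```

The result agrees with the 56.32% reported in the table to within rounding.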

Severity / Remediation Cost

Severity: Major (huge differences in image size, e.g. 100 MB instead of 1 GB, see impacts in previous section)

Cost: Easy (only the image tag needs to change, so there is no need to understand the application logic, but the change may have side effects)

Implementation principle

To detect that a Dockerfile does not use a lightweight image, find the last FROM instruction in the Dockerfile (earlier ones do not affect the final image size). If the image tag (version) contains alpine or slim, or the repository (image name) contains distroless, the image can be considered lightweight; otherwise the rule is considered broken.

Possible false positive: corporate custom images that do not follow naming conventions (alpine, slim, distroless).
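
The detection principle above could be sketched as follows. This is a minimal illustration in Python, not the ecoCode implementation: the function names are hypothetical, the parsing is a simple regular expression rather than a real Dockerfile parser, and special cases such as `FROM scratch`, build arguments in the image name, or a final `FROM` that refers to an earlier build stage are not handled.

```python
import re

# Markers taken from the rule: tag contains one of these, or repository contains "distroless".
LIGHTWEIGHT_TAG_MARKERS = ("alpine", "slim")
LIGHTWEIGHT_REPO_MARKERS = ("distroless",)

# Matches "FROM [--platform=...] <image>"; comment lines start with "#" and do not match.
FROM_RE = re.compile(r"^\s*FROM\s+(?:--platform=\S+\s+)?(\S+)", re.IGNORECASE)


def last_base_image(dockerfile_text):
    """Return the image reference of the last FROM instruction, or None."""
    image = None
    for line in dockerfile_text.splitlines():
        match = FROM_RE.match(line)
        if match:
            image = match.group(1)
    return image


def split_image(image):
    """Split an image reference into (repository, tag); the tag may be empty."""
    name = image.split("@", 1)[0]  # drop a digest such as @sha256:...
    last_slash = name.rfind("/")
    colon = name.rfind(":")
    # A colon before the last slash is a registry port, not a tag separator.
    if colon > last_slash:
        return name[:colon], name[colon + 1:]
    return name, ""


def uses_lightweight_base_image(dockerfile_text):
    """Apply the naming heuristic to the last FROM instruction."""
    image = last_base_image(dockerfile_text)
    if image is None:
        raise ValueError("no FROM instruction found")
    repository, tag = split_image(image.lower())
    return (any(marker in tag for marker in LIGHTWEIGHT_TAG_MARKERS)
            or any(marker in repository for marker in LIGHTWEIGHT_REPO_MARKERS))


print(uses_lightweight_base_image("FROM debian:12.5\nFROM node:18\n"))       # prints False
print(uses_lightweight_base_image("FROM debian:12.5\nFROM node:18-slim\n"))  # prints True
```

Because the check relies only on naming, it inherits the false positive described above for custom images that are lightweight but named differently.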