wagoodman / dive

A tool for exploring each layer in a docker image
MIT License

Dive flags directory copied in multi-stage build as wasted bytes #316

Closed: artis3n closed this issue 4 years ago

artis3n commented 4 years ago

When I build an image from a single-stage Dockerfile that compiles an app into /app, dive rightly considers /app important and doesn't list that content as inefficient or as wastedBytes.

However, when I move this Dockerfile into a multi-stage build and use COPY --from=0 /app /app, dive reports the /app directory as wasted bytes. Is there a way to explicitly tell dive to treat this directory as important?

Reproduce:

git clone https://github.com/artis3n/pgmodeler-container.git
cd pgmodeler-container
git submodule init && git submodule update --remote
# main branch with a single image in the Dockerfile
git checkout a088ea69ad3c806088418889faf6c8dac6bd107a
# Build the container, will take about 25 minutes
docker build . -t artis3n/pgmodeler:test
# Test the image
dive artis3n/pgmodeler:test
# Observe /app is not considered wasted bytes

# Now, try with the multi-stage build

# commit with the multi-stage build
git checkout 8dc5f5005692adf287ea624d4eae7805ea60ab72
# Build the container, will take about 25 minutes (quick if the first layer is cached from the previous build)
docker build . -t artis3n/pgmodeler:test
# Test the image
dive artis3n/pgmodeler:test
# Observe /app is wasted bytes

Single image: (screenshot)

Multi-stage: (screenshot)

wagoodman commented 4 years ago

Thanks for the instructions! There is currently no way to ignore paths, and previous stages in a multi-stage build should not be included in the analysis.

In this specific case, the duplicate path is not caused by the COPY instruction. Instead, the chown command changes the ownership of every file under /app, which causes the /app directory contents to be copied up into the current layer, so they are stored a second time.

If you modify:

COPY --from=compiler /app /app

# Set up non-root user
RUN groupadd -g 1000 modeler \
    && useradd -m -u 1000 -g modeler modeler \
    && chown -R modeler:modeler /app

To:

# Set up non-root user
RUN groupadd -g 1000 modeler \
    && useradd -m -u 1000 -g modeler modeler

COPY --chown=modeler:modeler --from=compiler /app /app

... you should no longer see /app as a duplicate path, and you should get 15-ish MB back :) .
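
To make the copy-up effect concrete, here is a minimal sketch that models each image layer as a plain directory and counts the bytes stored under a path that already exists in the layer below. The directory layout, the file contents, and the duplicated_bytes helper are all hypothetical and for illustration only; this is not how dive itself computes wasted bytes, and real layers are tar archives rather than directories.

```shell
#!/bin/sh
# Toy model: each image layer is a plain directory. This is an assumption for
# illustration only; real layers are tar archives, and dive's own accounting
# may differ in detail.
set -eu

work=$(mktemp -d)

# Hypothetical helper: sum the bytes of files in the upper layer whose path
# already exists in the lower layer, i.e. content stored twice in the image.
duplicated_bytes() {
    lower=$1
    upper=$2
    total=0
    # The toy file names contain no whitespace, so word splitting is safe here.
    for f in $(cd "$upper" && find . -type f); do
        if [ -f "$lower/$f" ]; then
            size=$(wc -c < "$upper/$f")
            total=$((total + size))
        fi
    done
    echo "$total"
}

# Case 1: COPY /app, then RUN ... chown -R in a later layer.
mkdir -p "$work/case1/copy/app" "$work/case1/run/app"
printf 'binary-contents' > "$work/case1/copy/app/pgmodeler"
# chown changes only metadata, but the whole file is copied up into the RUN layer.
cp "$work/case1/copy/app/pgmodeler" "$work/case1/run/app/pgmodeler"

# Case 2: RUN useradd first, then COPY --chown, so /app is written exactly once.
mkdir -p "$work/case2/run/etc" "$work/case2/copy/app"
printf 'modeler:x:1000:1000::/home/modeler:/bin/sh\n' > "$work/case2/run/etc/passwd"
printf 'binary-contents' > "$work/case2/copy/app/pgmodeler"

with_run_chown=$(duplicated_bytes "$work/case1/copy" "$work/case1/run")
with_copy_chown=$(duplicated_bytes "$work/case2/run" "$work/case2/copy")

echo "COPY then RUN chown -R: $with_run_chown duplicated bytes"
echo "COPY --chown:           $with_copy_chown duplicated bytes"

rm -rf "$work"
```

In this toy model the first layout stores the 15-byte file twice, while the second stores it once, which mirrors why moving the ownership change into COPY --chown removes the duplicate /app path.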

artis3n commented 4 years ago

:O

TIL about --chown. That's great, thanks.