DiUS / pact_broker-docker

'Dockerised' pact broker
http://pact.io
MIT License

Check if it's possible to reduce the docker image size of pact broker? #52

Closed gammabowl closed 5 years ago

gammabowl commented 6 years ago

The postgres docker image used with pact broker is just 270MB. Relative to that, the pact broker image at 819MB seems to be on the heavier side. Was wondering if it's possible to reduce the docker image size for pact broker?

bethesque commented 6 years ago

That does seem ridiculous! There's not that much going on in it. I wonder if it's all the extra rubies that come installed with phusion-passenger.

bethesque commented 6 years ago

I'm assuming the 819MB you quote above is the uncompressed size.

The compressed size of the ruby passenger base image is 244MB. The compressed size of the pact-broker image is 294MB. So the bulk of it comes from the base image. I'm not a docker expert, so I'm happy to take a PR if you could do some research into what might be removed from the passenger base image.

bethesque commented 6 years ago

This might contain some useful info: https://semaphoreci.com/blog/2016/12/13/lightweight-docker-images-in-5-steps.html

gammabowl commented 6 years ago

@bethesque - I am not familiar with the ruby world, but I have fairly good experience with docker, so I could give it a shot. If you have any links which I could use, please do let me know. I can take this up if it's fine with you guys?

bethesque commented 6 years ago

That would be awesome, thanks! I

MeadowValley commented 6 years ago

I'm sure you guys, especially shashidesai, already know this, but in most cases you should prefer the COPY command over ADD. https://www.ctl.io/developers/blog/post/dockerfile-add-vs-copy/

bethesque commented 6 years ago

I'm a total docker rookie. Very happy to accept PRs from people who know more than I do.

k-ong commented 6 years ago

the pact-broker is built from the phusion passenger container: phusion/passenger-ruby24:0.9.26 is at 659MB uncompressed.

It definitely seems to be on the heavier side, but comparing it to the postgres image is slightly misleading.

gammabowl commented 6 years ago

Yep, I get that comparing it to postgres is slightly misleading but for a small app like pact broker I thought 819MB was too large an image. I was trying to experiment by not using the phusion/passenger image (which is already heavier), let me know what you think about that ? If it doesn't make sense, then we could skip and close this issue ?

MeadowValley commented 6 years ago

This indeed makes sense. I would like to have a lightweight docker image, too. But I think it does not make sense to compare the phusion image to a postgres image. They're not comparable.

The phusion image is based on Ubuntu, which isn't that heavy by itself, but something seems to go wrong on the way to our image. I would suggest something which is somewhat time-consuming but a one-time job: creating a minimalistic image based on Alpine or Debian with only the dependencies that are needed for pact broker. (Alpine is about 5MB or so...) I'm not a docker pro, though I have read a lot about this way of creating an image for one's own purposes, and I think this is a good way to get rid of the baggage we get here from the base image. This is also suggested in the link bethesque already posted here:

This might contain some useful info: https://semaphoreci.com/blog/2016/12/13/lightweight-docker-images-in-5-steps.html

Alpine may be the better choice. The description in the docker-store gives an example:

FROM alpine:3.5
RUN apk add --no-cache mysql-client
ENTRYPOINT ["mysql"]

This example has a virtual image size of only 36.5MB. Compare that to our good friend Ubuntu:

FROM ubuntu:16.04
RUN apt-get update \
    && apt-get install -y --no-install-recommends mysql-client \
    && rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["mysql"]

This yields a virtual image size of about 184MB.

Also the postgres image with alpine and the standard one (based on Debian):

9.3-alpine: 14 MB
latest (Linux Default): 117 MB

(Sorry for editing this comment so much...)

gammabowl commented 6 years ago

@DerKnecht - thanks for the insightful info.

bethesque commented 6 years ago

I'm all for making the image as small as possible, but I'm curious as to what the impact is of the larger one, given that it only gets updated every few weeks or so at the moment. Can someone explain the use case? Just to let you know, this is not something I have time to work on at the moment (every bit of spare time I have goes to PB features), but I'm very happy to accept PRs.

gammabowl commented 6 years ago

It's not much of an issue, it was just an observation. It's not as if we download the image every now and then. Once it's set up and deployed, we don't need to pull the image again unless there is an update. I understand about your time, and as this is not a priority right now, that's the reason I mentioned I can pitch in for it. It's not urgent and can be done at leisure.

mefellows commented 6 years ago

FWIW you could take a look at https://github.com/pact-foundation/pact-mock-service-docker/blob/master/Dockerfile and https://github.com/DiUS/pact-provider-verifier-docker/blob/master/Dockerfile for inspiration (we converted them to Alpine and dealt with the Ruby issues).

The broker image is a little more complicated with Nginx etc., so if time permits we can look into it but as Beth said, it's not a big issue/priority for us.

PR's absolutely welcome though :)

k-ong commented 6 years ago

I agree it does have some complexities around Nginx, as we probably need to compile it from source. With that said, I might be doing something wrong, but my image with alpine is growing to ~600MB :(

I will be grateful to see if others have done a better job :)

k-ong commented 6 years ago

wip broker docker image with alpine: https://github.com/DiUS/pact_broker-docker/tree/broker-alpine

the image, however, is sitting at 742MB. will now attempt to slim it down but also looking for feedback/comments

mefellows commented 6 years ago

Having spent a lot of time working with Docker I've seen the community go back and forwards between small (read: Alpine) and mature (eg Debian).

I like small, but I also like well tested, high performance, and easy to understand (and therefore contribute).

This shouldn't stop any efforts to reduce the size, but we should approach with open eyes. A lot of people use this image and phusion passenger is well tested and optimised for ruby workloads. It would be a shame to lose out on this goodness for saving a minuscule amount of bandwidth when the image is updated.

So might I suggest before we make any official changes that we do some due diligence by looking also at performance and any functional losses (eg runtime management tools that we get from the Passenger images).

bethesque commented 6 years ago

We could always keep the smaller image on a fork and release it with a separate tag to give people the option.

k-ong commented 6 years ago

after continued work on the alpine version of the pact broker, I do not believe that the effort is justifiable for an image in alpine. The alpine image is not much smaller than its ubuntu counterpart, which could be due to the fact that we have to compile nginx and passenger from source.

the complexity and effort of maintaining the nginx and phusion passenger compilation and installation far outweighs the benefit of a few MBs of the uncompressed image.

MeadowValley commented 6 years ago

@k-ong how much space does the ruby:2.4-alpine take? I couldn't find it in dockerstore. "2.4.2-alpine3.6" has 33 MB. (Edit: I already found it... it has only 28MB)

I really wonder what exactly makes our image so heavy. Is this just from nginx and passenger?

Though it's true that this seems to take much more effort than expected for a saving of some MB. mefellows is also right that we should look at potential losses in functionality and performance when creating a slim image. Even a very slim solution may not be better than carrying a few hundred MB more in our image if it needs constant maintenance.

YOU54F commented 5 years ago

Hey up,

I was looking into this using a couple of tools, and noted that the expanded pact-broker image does have a lot of wasted space, so there is room for improvement. I don't think going down the route of a pure slimmed alpine image is the right way to go, for various reasons as outlined by some above.

Will dig into this when I get more time

| Image Name | Compressed Size | Uncompressed Size | Potential Wasted Space | Image Efficiency |
|---|---|---|---|---|
| dius/pact-broker | 362.2 MB | 1.05 GB | 452 MB | 76% |
| phusion/passenger-ruby24 | 239.8 MB | 708 MB | 94 MB | 91% |
| phusion/passenger-customizable | 199.8 MB | 584 MB | 45 MB | 94% |
| above + ruby 2.4.5 install script | | 850 MB | 214 MB | 83% |
| alpine slimmed | 255 MB | 91 MB | 8 MB | 98% |

https://microbadger.com for checking layers of compressed public images

https://github.com/wagoodman/dive for uncompressed images, step through each layer and see the impact of each operation.

YOU54F commented 5 years ago

Also, passenger images now ship with ubuntu 18.04, so the forced OS upgrade is probably unnecessary. Will get some more time soon to have a play at getting rid of some of the wasted space in the main pact-broker image as a starting point.

bethesque commented 5 years ago

That would be awesome, thanks.

YOU54F commented 5 years ago

If we clean up on the same line as the upgrade, we get a pretty darn efficient image 👌

RUN apt-get update & \
    apt-get upgrade -y -o Dpkg::Options::="--force-confold" & \
    apt-get -qy autoremove & \
    apt-get clean & \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

Total Image size: 795 MB
Potential wasted space: 31 MB
Image efficiency score: 96 %

YOU54F commented 5 years ago

baseimage-docker, which the main image is built on, does the same here:

https://github.com/phusion/baseimage-docker/blob/master/image/bin/install_clean
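
For anyone wondering why cleaning up on the same line matters: each RUN instruction commits its own layer, so files removed by a later RUN are hidden but still stored in the earlier layer. A minimal sketch of the two variants follows. The base image and the curl package are illustrative assumptions, not taken from the pact-broker Dockerfile.

```dockerfile
# Sketch only: base image and package are illustrative assumptions.
FROM ubuntu:18.04

# Less efficient (shown commented out): the apt lists are committed in the
# first layer, so deleting them in a second RUN does not shrink the image.
# RUN apt-get update && apt-get install -y --no-install-recommends curl
# RUN rm -rf /var/lib/apt/lists/*

# More efficient: install and clean up inside one RUN, so the temporary
# files never end up in a committed layer.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
```

A tool such as dive should report the second variant with less wasted space, since nothing is written in one layer and deleted in another.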

bethesque commented 5 years ago

Awesome. Am doing a new release with the merged PR.

bethesque commented 5 years ago

@YOU54F, @k-ong picked up that you're actually running those commands in the background (&), rather than sequentially (&&). Here's the corrected syntax.

RUN apt-get update && \
    apt-get upgrade -y -o Dpkg::Options::="--force-confold" && \
    apt-get -qy autoremove && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

Did you end up submitting that PR to the original source? You might want to update it.

YOU54F commented 5 years ago

Good spot @k-ong, my apologies all

I did submit a PR and will update.

& will indeed start each command in the background without waiting for it to finish, so the commands run in parallel, and && will only run the subsequent command if the former passes
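
To see the difference outside of a Dockerfile, here is a small shell sketch. It uses `false` as a stand-in for a failing step (for example an `apt-get update` that cannot reach the mirror); the variable names are made up for illustration.

```shell
#!/bin/sh
# 'false' stands in for a step that fails.

# With '&', the failing step is sent to the background and its exit status
# is ignored, so the next command runs regardless.
background_result=$(false & echo "next step ran")

# With '&&', the next command runs only if the previous one succeeded.
chained_status=0
chained_result=$(false && echo "next step ran") || chained_status=$?

echo "with &:  '${background_result}'"
echo "with &&: '${chained_result}' (exit status ${chained_status})"
```

In a RUN instruction this means the `&` version can reach the final `rm -rf` while the earlier apt-get commands are still running or have already failed, whereas the `&&` version stops at the first failure and fails the build.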

bethesque commented 5 years ago

There is a new release of the Docker Pact Broker, now at https://hub.docker.com/r/pactfoundation/pact-broker/tags. 98MB compressed! Try out the edge tag. When it's been battle tested for a while, I'll put it out on latest. It uses puma and alpine linux instead of phusion passenger.