Closed gammabowl closed 5 years ago
That does seem ridiculous! There's not that much going on in it. I wonder if it's all the extra rubies that come installed with phusion-passenger.
I'm assuming the 819MB you quote above is the uncompressed size.
The compressed size of the ruby passenger base image is 244MB. The compressed size of the pact-broker image is 294MB. So the bulk of it comes from the base image. I'm not a docker expert, so I'm happy to take a PR if you could do some research into what might be removed from the passenger base image.
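To put numbers on "the bulk of it comes from the base image", here is a rough sketch of the arithmetic using only the two compressed sizes quoted above. The variable names are made up for illustration.

```shell
# Compressed sizes in MB, as quoted in the comment above.
base_mb=244    # ruby passenger base image
total_mb=294   # pact-broker image

# What the pact-broker layers add on top of the base image.
added_mb=$((total_mb - base_mb))

# Share of the total that comes from the base image,
# rounded to the nearest whole percent using integer arithmetic.
base_pct=$(( (base_mb * 100 + total_mb / 2) / total_mb ))

echo "added by pact-broker layers: ${added_mb}MB"
echo "base image share: ${base_pct}%"
```

So roughly 50MB of the compressed image is pact-broker itself, and the rest is inherited from the base.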
This might contain some useful info: https://semaphoreci.com/blog/2016/12/13/lightweight-docker-images-in-5-steps.html
@bethesque - I'm not familiar with the Ruby world, but I have fairly good experience with Docker, so I could give it a shot. If you have any links I could use, please let me know. I can take this up if that's fine with you guys?
That would be awesome, thanks!
I'm sure you guys, especially shashidesai, already know this, but in most cases you should prefer the COPY command over ADD. https://www.ctl.io/developers/blog/post/dockerfile-add-vs-copy/
I'm a total docker rookie. Very happy to accept PRs from people who know more than I do.
The pact-broker image is built from the Phusion Passenger container: phusion/passenger-ruby24:0.9.26 is 659MB uncompressed.
It definitely seems to be on the heavier side, but comparing it to the postgres image is slightly misleading.
Yep, I get that comparing it to postgres is slightly misleading, but for a small app like pact broker I thought 819MB was too large an image. I was trying to experiment with not using the phusion/passenger image (which is already heavy). Let me know what you think about that. If it doesn't make sense, we could skip it and close this issue.
This does make sense. I would like to have a lightweight docker image, too. But I don't think it makes sense to compare the phusion image to a postgres image. They're not comparable.
The phusion image is based on Ubuntu, which isn't that heavy, but something seems to go wrong on the way to our image. I would suggest something that is somewhat time-consuming but a one-time job: creating a minimal image based on Alpine or Debian with only the dependencies that pact broker needs. (Alpine is about 5MB or so...) I'm not a docker pro, though I have read a lot about creating images this way for your own purposes, and I think it is a good way to get rid of the baggage we inherit from the base image. This is also suggested in the link bethesque already posted here:
This might contain some useful info: https://semaphoreci.com/blog/2016/12/13/lightweight-docker-images-in-5-steps.html
Alpine may be the better choice. The description in the docker-store gives an example:
FROM alpine:3.5
RUN apk add --no-cache mysql-client
ENTRYPOINT ["mysql"]
This example has a virtual image size of only 36.5MB. Compare that to our good friend Ubuntu:
FROM ubuntu:16.04
RUN apt-get update \
 && apt-get install -y --no-install-recommends mysql-client \
 && rm -rf /var/lib/apt/lists/*
ENTRYPOINT ["mysql"]
This yields a virtual image size of about 184MB.
Also the postgres image with alpine and the standard one (based on Debian):
9.3-alpine 14 MB
latest (Linux Default) 117 MB
(Sorry for editing this comment so much...)
@DerKnecht - thanks for the insightful info.
I'm all for making the image as small as possible, but I'm curious as to what the impact is of the larger one, given that it only gets updated every few weeks or so at the moment. Can someone explain the use case? Just to let you know, this is not something I have time to work on at the moment (every bit of spare time I have goes to PB features), but I'm very happy to accept PRs.
It's not much of an issue, it was just an observation. Anyway it's not that we download the image every now and then. Once it's setup and deployed, we don't need to update the image unless there is an update. I understand about your time, as this is not a priority right now, that's the reason I mentioned I can pitch in for it as it's not something urgent and can be done at leisure time.
FWIW you could take a look at https://github.com/pact-foundation/pact-mock-service-docker/blob/master/Dockerfile and https://github.com/DiUS/pact-provider-verifier-docker/blob/master/Dockerfile for inspiration (we converted them to Alpine and dealt with the Ruby issues).
The broker image is a little more complicated with Nginx etc., so if time permits we can look into it but as Beth said, it's not a big issue/priority for us.
PRs absolutely welcome though :)
I agree it does have some complexities around Nginx, as we probably need to compile it from source. That said, I might be doing something wrong, but my image with Alpine is growing to ~600MB :(
I will be grateful to see if others have done a better job :)
WIP broker docker image with Alpine: https://github.com/DiUS/pact_broker-docker/tree/broker-alpine
The image, however, is sitting at 742MB. I will now attempt to slim it down, but I'm also looking for feedback/comments.
Having spent a lot of time working with Docker I've seen the community go back and forwards between small (read: Alpine) and mature (eg Debian).
I like small, but I also like well tested, high performance, and easy to understand (and therefore contribute).
This shouldn't stop any efforts to reduce the size, but we should approach with open eyes. A lot of people use this image, and Phusion Passenger is well tested and optimised for Ruby workloads. It would be a shame to lose out on this goodness to save a minuscule amount of bandwidth when the image is updated.
So might I suggest before we make any official changes that we do some due diligence by looking also at performance and any functional losses (eg runtime management tools that we get from the Passenger images).
We could always keep the smaller image on a fork and release it with a separate tag to give people the option.
After continued work on the Alpine version of the pact broker, I do not believe the effort is justifiable. The Alpine image is not much smaller than its Ubuntu counterpart, which could be due to the fact that we have to compile nginx and passenger from source.
The complexity and effort of maintaining the nginx and Phusion Passenger compilation and installation far outweigh the benefit of saving a few MB of uncompressed image.
@k-ong how much space does the ruby:2.4-alpine take? I couldn't find it in dockerstore. "2.4.2-alpine3.6" has 33 MB. (Edit: I already found it... it has only 28MB)
I really wonder what exactly makes our image so heavy. Is this just from nginx and passenger?
Though it's true that saving some MB seems to take much more effort than expected. mefellows is also right that we should look at potential losses in functionality and performance when creating a slim image. Even if we find a solution that is very slim, it may not be better than carrying a few hundred extra MB if it needs constant maintenance.
Hey up,
I was looking into this using a couple of tools, and noted that the expanded pact-broker image does have a lot of wasted space, so there is room for improvement. I don't think going down the route of a pure slimmed Alpine image is the right way to go, for the various reasons outlined by some above.
Will dig into this when I get more time
Image Name | Compressed Size | Uncompressed Size | Potential Wasted Space | Image Efficiency |
---|---|---|---|---|
dius/pact-broker | 362.2 MB | 1.05 GB | 452 MB | 76% |
phusion/passenger-ruby24 | 239.8 MB | 708 MB | 94 MB | 91% |
phusion/passenger-customizable | 199.8 MB | 584 MB | 45 MB | 94% |
above + ruby 2.4.5 install script | | 850 MB | 214 MB | 83% |
alpine slimmed | 255 MB | 91 MB | 8 MB | 98% |
https://microbadger.com for checking layers of compressed public images
https://github.com/wagoodman/dive for uncompressed images, step through each layer and see the impact of each operation.
Also, passenger images now ship with Ubuntu 18.04, so the forced OS upgrade is probably unnecessary. I will get some more time soon to have a play at getting rid of some of the wasted space in the main pact-broker image as a starting point.
That would be awesome, thanks.
If we cleanup on the same line as the upgrade, we get a pretty darn efficient image 👌
RUN apt-get update & \
apt-get upgrade -y -o Dpkg::Options::="--force-confold" & \
apt-get -qy autoremove & \
apt-get clean & \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
Total Image size: 795 MB
Potential wasted space: 31 MB
Image efficiency score: 96 %
baseimage-docker, which the main image is built on, does the same here:
https://github.com/phusion/baseimage-docker/blob/master/image/bin/install_clean
Awesome. Am doing a new release with the merged PR.
@YOU54F, @k-ong picked up that you're actually running those commands in the background (&), rather than sequentially (&&). Here's the corrected syntax.
RUN apt-get update && \
apt-get upgrade -y -o Dpkg::Options::="--force-confold" && \
apt-get -qy autoremove && \
apt-get clean && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*
Did you end up submitting that PR to the original source? You might want to update it.
Good spot @k-ong, my apologies all
I did submit a PR and will update.
& will indeed run in parallel and && will only run the subsequent command if the former passes
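A minimal shell sketch of that difference, using `false` as a stand-in for a failing step and an echo as a stand-in for the cleanup (both are placeholders, not the real apt-get commands):

```shell
# With `&&`, a failing step stops the chain, so the "cleanup" echo never runs.
# The trailing `|| true` only keeps this demo from aborting under `set -e`.
and_output=$( { false && echo "cleanup ran"; } || true )

# With a single `&`, the failing step is sent to the background and its exit
# status is never checked, so the next command runs regardless.
bg_output=$( false & echo "cleanup ran"; wait )

echo "&& chain printed: '${and_output}'"
echo "&  chain printed: '${bg_output}'"
```

In a Dockerfile RUN line the same applies: with `&` the steps are started without waiting for each other or checking for failure, so `&&` is the form that runs them in order.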
There is a new release of the Docker Pact Broker, now at https://hub.docker.com/r/pactfoundation/pact-broker/tags and 98MB compressed! Try out the edge tag. When it's been battle tested for a while, I'll put it out on latest. It uses puma and Alpine Linux instead of Phusion Passenger.
The postgres docker image used with pact broker is just 270MB, so relative to that, pact broker seems to be on the heavier side. I was wondering if it's possible to reduce the docker image size for pact broker?