sqitchers / docker-sqitch

Docker Image packaging for Sqitch
MIT License

mysql, postgres, and sqlite (for review not ready for merge) #29

Open pyramation opened 4 years ago

pyramation commented 4 years ago

@theory so far I was able to successfully build

sqitch/sqitch       1.1.0-alpine-sqlite     eb694035610f   74.6MB
sqitch/sqitch       1.1.0-alpine-mysql      59117ef926ec   75.4MB
sqitch/sqitch       1.1.0-alpine-postgres   da65547ced72   68.8MB

I copied the build file from the base and put a Makefile in ./alpine so as not to interfere with your build process until we agree on what the ideal build setup looks like.

Once we sort out the build process and review these, we can begin adding the other databases, whose packages may be harder to find on Alpine.

theory commented 4 years ago

Looks good, though we may want to split out the common bits to a separate layer to reduce duplication and runtime. Be sure also to include a text editor for editing change messages. I used nano in the Debian build, as it's tiny and straightforward to use.

pyramation commented 4 years ago

To reduce image size, I tried removing a few RUN commands to cut down on layers. As far as de-duplication via a base image goes, I'm not sure how to do it properly, as there are database-specific packages and builds throughout. I would imagine if we could achieve de-duplication, we'd be able to import a base image, however, the very first command installs packages specific to each database. Please let me know if there's a workaround.

Just FYI, I added nano, which adds a healthy 6MB to every image. It's not a lot, but worth noting:

w/ nano

1.1.0-alpine-mysql      80.8MB
1.1.0-alpine-sqlite     79.9MB
1.1.0-alpine-postgres   74.2MB

w/o nano

1.1.0-alpine-mysql      74.8MB
1.1.0-alpine-sqlite     73.9MB
1.1.0-alpine-postgres   68.1MB

theory commented 4 years ago

> I would imagine if we could achieve de-duplication, we'd be able to import a base image, however, the very first command installs packages specific to each database.

Yes, that's exactly what I was thinking. You'd make a base image (that would never be pushed anywhere) that has Sqitch but no database-specific drivers or clients. Then you build each subsequent image FROM that image. That base image would need its own Dockerfile and would have to be built and named locally before you build any others. I imagine it'd look something like this:

FROM alpine:3.11 AS sqitch-build

ARG VERSION
ENV PERL5LIB=/work/local/lib/perl5
ENV TZ=UTC

# Install system dependencies.
WORKDIR /work
RUN mkdir -p /usr/share/man/man1 /usr/share/man/man7 \
    && apk add --no-cache --virtual .build-deps \
        alpine-sdk \
        perl-dev \
        curl \
        tzdata \
        gnupg \
    && apk add --no-cache perl \
    && curl -LO https://www.cpan.org/authors/id/D/DW/DWHEELER/App-Sqitch-v$VERSION.tar.gz \
    && mkdir src \
    && tar -zxf App-Sqitch-v$VERSION.tar.gz --strip-components 1 -C src \
    # Install cpm and build dependencies.
    && curl -sL --compressed https://git.io/cpm > cpm && chmod +x cpm \
    && ./cpm install -L local --verbose --no-test ExtUtils::MakeMaker \
    && ./cpm install -L local --verbose --no-test --with-recommends \
        --with-configure --cpanfile src/dist/cpanfile \
    && cp /usr/share/zoneinfo/UTC /etc/localtime \
    && echo UTC > /etc/timezone \
    # Build, test, bundle, prune.
    && cd /work/src \
    && perl Build.PL --quiet --install_base /app --etcdir /etc/sqitch \
        --config installman1dir= --config installsiteman1dir= --config installman3dir= --config installsiteman3dir= \
    && ./Build test && ./Build bundle \
    && rm -rf /app/man \
    && find /app -name '*.pod' | grep -v sqitch | xargs rm -rf \
    && apk del .build-deps

This passes no --with option, so no additional engine modules should be installed. You can then install them manually in each engine's Dockerfile by reading the modules to install from dist.ini. For example, here's the list of additional modules for Postgres support:

https://github.com/sqitchers/sqitch/blob/f196d27f5c828d34d3e84ac07a939c43774ce20f/dist.ini#L83-L87

So then, to build the Postgres-supporting image, you have the Postgres Dockerfile do something like:

FROM sqitch-base AS sqitch-build
RUN cpanm DBD::Pg

FROM alpine:3.11 AS sqitch
# As before...
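A minimal sketch of that build order as a shell script: the base image gets built and tagged locally first, then each engine image is built on top of it. The alpine/base, alpine/postgres, etc. directory layout and the `build_all` helper are assumptions for illustration, not what the repository's build scripts actually do.

```shell
#!/bin/sh
# Sketch: build the local-only base image, then each engine image on top.
# Directory names and the build_all helper are hypothetical.
set -eu

VERSION="${VERSION:-1.1.0}"
# Set DOCKER="echo docker" to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"

build_all() {
    # The base image is only tagged locally (sqitch-base) and never pushed.
    $DOCKER build --build-arg "VERSION=$VERSION" -t sqitch-base alpine/base
    # Each engine Dockerfile starts with "FROM sqitch-base".
    for engine in postgres mysql sqlite; do
        $DOCKER build --build-arg "VERSION=$VERSION" \
            -t "sqitch/sqitch:$VERSION-alpine-$engine" "alpine/$engine"
    done
}
```

Calling `build_all` with `DOCKER="echo docker"` prints the four build commands in order, which is a cheap way to check the ordering before running it for real.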

I'm surprised nano is 6MB. ISTR when I added it to the Debian build it was around 1MB. I wonder if it requires a C library or something on Alpine. Makes sense, since it's [only 128K itself](https://pkgs.alpinelinux.org/package/v3.3/main/x86/nano). I wonder if there's a smaller editor on Alpine, or a way to get it without a big library dependency.
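One way to check what nano pulls in would be to ask apk inside a throwaway container. This is only a sketch: the `nano_footprint` helper name is made up, and the output is not shown here because it depends on the Alpine release.

```shell
#!/bin/sh
# Sketch: list nano's dependencies and installed size on Alpine.
# nano_footprint is a hypothetical helper name.
set -eu

# Set DOCKER="echo docker" to print the command instead of running it.
DOCKER="${DOCKER:-docker}"

nano_footprint() {
    # apk info --depends lists the packages nano requires;
    # apk info --size reports its installed size.
    $DOCKER run --rm alpine:3.11 sh -c \
        'apk add --no-cache nano >/dev/null && apk info --depends --size nano'
}
```

Running `nano_footprint` needs a working Docker daemon and network access to the Alpine package mirrors.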

theory commented 3 years ago

Hey @pyramation, where do things stand on this PR? Still in progress?

pyramation commented 3 years ago

Other than splitting out into separate parts for de-duplication, were there other issues with it? I wasn't sure how to manage your CI portion, happy to adjust things to work with your system.

Regarding the common/docker issue: I wasn't completely sure how to cleanly pull out the common parts, because we have a lot of non-common things scattered through the Dockerfile, like --with mysql, perl-dbd-mysql, etc. If we did try to, my fear would be bloating the image, since they're all purposely grouped into as few commands as possible to reduce the number of layers.

theory commented 3 years ago

The idea is to put all the common stuff into a common image layer, then have additional Dockerfiles that build on that base image, where they add the stuff unique to each engine.

pyramation commented 3 years ago

@theory I totally understand. What I'm suggesting is that there really isn't much in common, because engine-specific flags are scattered from top to bottom and can't be moved, given how Sqitch and its dependencies are installed.

For example... looking at postgres. The very first thing you have to run involves a special flag: https://github.com/sqitchers/docker-sqitch/blob/46d7f8c225c57fae0ee173b2818d74cd86f84ca1/alpine/postgres/Dockerfile#L14-L15

and then the build: https://github.com/sqitchers/docker-sqitch/blob/46d7f8c225c57fae0ee173b2818d74cd86f84ca1/alpine/postgres/Dockerfile#L33

So if we wanted to share anything from postgres and mysql, for example... look at the first thing we have to do for mysql: https://github.com/sqitchers/docker-sqitch/blob/46d7f8c225c57fae0ee173b2818d74cd86f84ca1/alpine/mysql/Dockerfile#L14-L15

also the 2nd flag required for mysql: https://github.com/sqitchers/docker-sqitch/blob/46d7f8c225c57fae0ee173b2818d74cd86f84ca1/alpine/mysql/Dockerfile#L33

To my knowledge, the only way to make a shared base layer would be to actually install all of the modules and flags. However, I don't believe that's a good solution because most folks only use one database. That would also inflate the images and I would advise keeping images super lean.

Maybe you know of a way to install sqitch without these flags? I remember spending a lot of time wrestling with Perl, and I believe all of those flags are required in that particular order.

theory commented 3 years ago

Yeah, doing something like this would probably require some sort of templating system to generate all the appropriate Dockerfiles. Which would sure be a PITA.
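As a rough sketch of what such templating could look like, a small shell function can fill engine-specific placeholders into a shared template. The `@ENGINE@` and `@PACKAGES@` placeholders, the abbreviated template body, and the per-engine package lists are all assumptions for illustration, not the contents of the actual Dockerfiles.

```shell
#!/bin/sh
# Sketch: render one Dockerfile per engine from a shared template.
# Placeholders, template body, and package lists are hypothetical.
set -eu

engine_packages() {
    # Map an engine name to the Alpine packages it would need (assumed lists).
    case "$1" in
        postgres) echo "postgresql-client perl-dbd-pg" ;;
        mysql)    echo "mysql-client perl-dbd-mysql" ;;
        sqlite)   echo "sqlite perl-dbd-sqlite" ;;
        *)        echo "unknown engine: $1" >&2; return 1 ;;
    esac
}

render_dockerfile() {
    engine="$1"
    packages="$(engine_packages "$engine")" || return 1
    # Substitute the placeholders in the template read from the here-document.
    sed -e "s/@ENGINE@/$engine/g" -e "s/@PACKAGES@/$packages/g" <<'EOF'
FROM alpine:3.11 AS sqitch-build
ARG VERSION
RUN apk add --no-cache perl @PACKAGES@
# ... shared download and dependency steps go here ...
RUN perl Build.PL --quiet --install_base /app --etcdir /etc/sqitch --with @ENGINE@
EOF
}
```

For example, `render_dockerfile mysql > alpine/mysql/Dockerfile` would write the MySQL variant. The engine-specific pieces stay in one place while each generated Dockerfile keeps everything grouped into as few commands as it likes.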

How much work do you think is needed to merge your changes?

pyramation commented 3 years ago

Well, they are working as is, but I notice that the CI/CD is not passing. I imagine steps left are

theory commented 3 years ago

Is that something you can work on, @pyramation?