elastic / logstash

Logstash - transport and process your logs, events, or other data
https://www.elastic.co/products/logstash

Dockerize ElasticSearch for ITs #7097

Closed: original-brownbear closed this issue 7 years ago

original-brownbear commented 7 years ago

It's in the title, move the setup and teardown of the integration test ES service to Docker like it was done in #7093 for Kafka.
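Whatever drives the container, the fixture's setup step has to wait until Elasticsearch is actually accepting connections before the specs run. A minimal readiness-check sketch in Ruby (the helper name and timeout are illustrative, not part of the existing harness):

```ruby
require "socket"

# Hypothetical helper: poll a TCP port until the dockerized service
# accepts connections, or give up after timeout_seconds.
def wait_for_port(host, port, timeout_seconds)
  deadline = Time.now + timeout_seconds
  until Time.now > deadline
    begin
      TCPSocket.new(host, port).close
      return true
    rescue Errno::ECONNREFUSED, Errno::ETIMEDOUT, SocketError
      sleep 0.5 # service not up yet, retry shortly
    end
  end
  false
end
```

The Kafka fixture from #7093 does something similar with netcat inside the container; doing it from the Ruby side keeps the Dockerfile smaller.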

wainersm commented 7 years ago

Hi @suyograo and @original-brownbear , if there is any other item about this dockerization work, let me know. I can help on this.

@original-brownbear : I am curious why you are building the Docker images with the services installed (e.g. installing Kafka) instead of pulling already-built images.

suyograo commented 7 years ago

@wainersm all yours now :)

wainersm commented 7 years ago

Thanks for the gift @suyograo :)

original-brownbear commented 7 years ago

@wainersm thanks for the help!

As far as your question goes:

I am curious why you are building the Docker images with the services installed (e.g. installing Kafka) instead of pulling already-built images.

Two reasons:

More importantly:

BTW, regarding the second point: feel free to "dry up" your Dockerfile against the one I added for Kafka by reordering lines in the Kafka one so the two share a common prefix of layers. If you do something like

turning:

FROM debian:stretch

ENV KAFKA_HOME /kafka
ENV KAFKA_LOGS_DIR="/kafka-logs"
ENV KAFKA_VERSION 0.10.2.1
ENV _JAVA_OPTIONS "-Djava.net.preferIPv4Stack=true"
ENV TERM=linux

RUN apt-get update && apt-get install -y curl openjdk-8-jre-headless netcat

RUN mkdir -p ${KAFKA_LOGS_DIR} && mkdir -p ${KAFKA_HOME} && curl -s -o /tmp/kafka.tgz \
    "http://ftp.wayne.edu/apache/kafka/${KAFKA_VERSION}/kafka_2.11-${KAFKA_VERSION}.tgz" && \
    tar xzf /tmp/kafka.tgz -C ${KAFKA_HOME} --strip-components 1

ADD run.sh /run.sh

EXPOSE 9092
EXPOSE 2181

into

FROM debian:stretch

ENV _JAVA_OPTIONS "-Djava.net.preferIPv4Stack=true"
ENV TERM=linux

RUN apt-get update && apt-get install -y curl openjdk-8-jre-headless netcat

ENV KAFKA_HOME /kafka
ENV KAFKA_LOGS_DIR="/kafka-logs"
ENV KAFKA_VERSION 0.10.2.1
RUN mkdir -p ${KAFKA_LOGS_DIR} && mkdir -p ${KAFKA_HOME} && curl -s -o /tmp/kafka.tgz \
    "http://ftp.wayne.edu/apache/kafka/${KAFKA_VERSION}/kafka_2.11-${KAFKA_VERSION}.tgz" && \
    tar xzf /tmp/kafka.tgz -C ${KAFKA_HOME} --strip-components 1

ADD run.sh /run.sh

EXPOSE 9092
EXPOSE 2181

You should be able to reuse most lines :)

wainersm commented 7 years ago

@original-brownbear Talking about reuse, have you guys thought about using docker compose?

I experimented with Docker Compose so that the Elasticsearch image is built from a base image (called "sandbox"). Once you run docker-compose up, both the base and the Elasticsearch images are built and a container is started afterwards. Such a strategy can improve reuse and keep the service Dockerfiles minimal. But I have investigated neither the implications for image caching in Travis nor the availability of Ruby libraries for Docker Compose.

Here are the diffs: https://gist.github.com/wainersm/c2c578a4487a6da7af5017c442ebbc3d
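For readers without the gist handy, the shape of such a compose file might look roughly like this (a sketch only; service names, paths, and the tag are hypothetical, and build ordering via dependencies should be verified against the Compose version in use):

```yaml
# Hypothetical layout: generic "sandbox" base image plus one service image
# built FROM it, so the service Dockerfile stays minimal.
version: "2"
services:
  sandbox:
    build: ./sandbox                      # JRE, curl, netcat, common ENV
    image: logstash/ci_sandbox            # tag the base so children can FROM it
  elasticsearch:
    build: ./elasticsearch_dockerized     # Dockerfile starts with FROM logstash/ci_sandbox
    depends_on:
      - sandbox
    ports:
      - "9200:9200"
```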

In the end I just wanted to share my experiment and hear opinions. I will dockerize Elasticsearch following @original-brownbear 's blueprint for Kafka.

original-brownbear commented 7 years ago

@wainersm I'm not opposed to using docker-compose in any way :) (especially since nowadays you can even run docker-compose itself dockerized: https://hub.docker.com/r/docker/compose/ :)). I will look into your patch soon, BTW; give me a few hours, I just finished travelling Europe -> West Coast and need to wake up real quick :)

original-brownbear commented 7 years ago

@wainersm alright looked over the patch. Generally it's exactly what I was looking for :)

"But":

Other than that, I love it and think we should merge this as soon as it runs from Ruby like the Kafka one does, plus open an issue to make the Kafka fixture also work from this base-image approach with ONBUILD :)
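To make the ONBUILD idea concrete, a sketch of what the base image could look like (the tag and file names are illustrative; an ONBUILD instruction is recorded in the base image and executed only when a child image is built FROM it):

```dockerfile
# Hypothetical base image, tagged e.g. logstash/ci_sandbox
FROM debian:stretch

ENV _JAVA_OPTIONS "-Djava.net.preferIPv4Stack=true"
ENV TERM=linux

RUN apt-get update && apt-get install -y curl openjdk-8-jre-headless netcat

# Deferred: runs during the build of any image that uses FROM logstash/ci_sandbox,
# picking up run.sh from *that* image's build context.
ONBUILD ADD run.sh /run.sh
```

A service Dockerfile would then start with `FROM logstash/ci_sandbox` and only add its own install steps, with run.sh injected automatically by the trigger.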

wainersm commented 7 years ago

Great! I'm working on an initial version; I'll push it for review soon.

wainersm commented 7 years ago

hi @original-brownbear ! PR #7234 is the first draft. I believe it's almost done, so I am sending it for review.

There are two features I still need to include in:

  1. Share the test fixtures folder with the container. While it is not used by Elasticsearch itself, it is by Kafka and Filebeat.
  2. Allow exporting environment variables at image/container build time. For example, I might want to install a different version of Elasticsearch by exporting ES_VERSION.
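For point 2, one possible shape (a sketch, not the PR's implementation: the ES_VERSION variable name and helper are assumptions, and the matching Dockerfile would need a corresponding `ARG ES_VERSION` for the build arg to take effect):

```ruby
require "json"

# Hypothetical helper: pick the service version from the environment at
# image build time, falling back to a default.
def build_options(default_version)
  version = ENV.fetch("ES_VERSION", default_version)
  # The docker-api gem accepts build args as a JSON-encoded string
  # under the "buildargs" key (in versions that support build args).
  { "buildargs" => JSON.generate("ES_VERSION" => version) }
end

# Usage sketch:
#   Docker::Image.build_from_dir(image_dir, build_options("5.4.0"))
```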

Here is the output of the Elasticsearch CI test:

wainersm@wainersm-laptop:~/projects/oss/logstash/qa/integration$ rspec ./specs/es_output_how_spec.rb 

Test Elasticsearch output
Using /home/wainersm/projects/oss/logstash/build/logstash-6.0.0-alpha2-SNAPSHOT as LS_HOME
Setting up services
Setting up logstash service
Setup script not found for logstash
logstash service setup complete
Setting up elasticsearch service.
Building the base container image.
Finished building the base image.
Building the container image.
Finished building the image.
Starting the container.
Finished starting the container.
Finished setting up elasticsearch service.
ERROR StatusLogger No log4j2 configuration file found. Using default configuration: logging only errors to the console.
  can ingest 37K log lines of sample apache logs
Tearing down services
Tearing down elasticsearch service.
Stop the container.
Finished stopping the container.
Finished tearing down of elasticsearch service.

Finished in 4 minutes 6.7 seconds (files took 1.03 seconds to load)
1 example, 0 failures

wainersm@wainersm-laptop:~/projects/oss/logstash/qa/integration$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
wainersm@wainersm-laptop:~/projects/oss/logstash/qa/integration$ docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
<none>              <none>              cd0869bad1fe        9 minutes ago       349 MB
logstash            ci_sandbox          20208c377f3f        10 minutes ago      279 MB
<none>              <none>              70ddd3011c73        12 minutes ago      358 MB
debian              stretch             4594f2fd77bf        2 weeks ago         100 MB
original-brownbear commented 7 years ago

@wainersm thanks will take a look very soon!

original-brownbear commented 7 years ago

@wainersm took a quick look and generally this is very nice work, exactly what I had in mind, thanks! :) Added 2 trivial comments to the code in the PR.

About your two points:

Share the test fixtures folder with the container. While it is not used by Elasticsearch itself, it is by Kafka and Filebeat.

I don't think I understand this point; maybe I don't have to, though, and it'll all make sense in time :D But on the off chance it helps, here are two tips:

  1. You can trivially add any path to an image during the build by using the insert_local method on it, like I did for Kafka:

     Docker::Image.build_from_dir(File.expand_path("../kafka_dockerized", __FILE__))
                  .insert_local(
                    'localPath' => File.join(TestSettings::FIXTURES_DIR, "how_sample.input"),
                    'outputPath' => '/')

  2. (just on the off chance :D ): Please insert things into the images at build time rather than mounting Docker volumes (volumes are a pain with Travis).

Allow exporting environment variables at image/container build time. For example, I might want to install a different version of Elasticsearch by exporting ES_VERSION.

Don't worry too much about this yet; we have #7100 open as well. If you're interested I can assign that one to you and you can look into it there. If you're not, someone else can, no pressure :D But I think that issue is a nice entry point to adding dynamic versioning :)