rycus86 / podlike

Co-located containers as Docker Swarm services (like Kubernetes pods)
MIT License

Podlike

An attempt at managing co-located containers (like in a Pod in Kubernetes) mainly for services on top of Docker Swarm mode.

The general idea is the same as in Kubernetes: this container acts as a parent for one or more child containers started as part of the emulated pod. Containers within this pod can use localhost (the loopback interface) to communicate with each other. They can also share the same volumes and see each other's PIDs, so sending UNIX signals between containers is possible.

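These are always shared: the network namespace (so the containers can reach each other on localhost) and the controller's cgroup.

By default, these are also shared, but optional: the PID namespace (see the -pids flag) and the IPC namespace (see the -ipc flag).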

Check out the blog post for a much more detailed introduction!

Use-cases

So, why would we want to do this on Docker Swarm?

  1. Sidecars

You may want to always deploy an application together with a supporting application, a sidecar. For example, a web application that should only be accessed through a caching reverse proxy, or with authentication enabled, without implementing these in the application itself.

See also the sidecar example

There's another variation of this in the sidecar-init example folder that uses either depends_on or an init container to set up components in the pod.

  2. Signals

By putting containers in the same PID namespace, you can send UNIX signals from one to another. For example, a small internal-only webapp could send a SIGHUP to Nginx when it receives a reload request.

See also the signal example

  3. Log collectors

With two containers sharing a local volume, one could collect and forward logs from files that the other container is writing. Maybe you have a legacy application with fixed file logging, but you'd still want to use modern log forwarders, like Fluentd.

See also the logging example

  4. Shared volume and signals

By sharing a local volume between multiple containers, one could generate configuration for another to use, for example. Combined with signal sending, you could also ask the other app to reload the configuration once it is written and ready.

See also the volume example

  5. Health-checks

The example on the link below modernizes an application by providing a composite HTTP health-check endpoint for a Java application that only exposes liveness on JMX.

See also the health-check example

  6. Service meshes

Applications should implement business logic. With service meshes, we can externalize service discovery, routing, tracing concerns, and much more.

See also the service mesh and the modernized stack examples

See a more detailed explanation of the examples at https://blog.viktoradam.net/2018/05/24/podlike-example-use-cases/

Configuration

The controller needs to run inside a Docker container, and it needs access to the Docker engine through the API (either UNIX socket, TCP, etc.). The list of components comes from container labels (not service labels). These labels need to start with pod.component.

For example:

version: '3.5'
services:

  pod:
    image: rycus86/podlike
    command: -logs
    labels:
      # sample app with HTML responses
      pod.component.app: |
        image: rycus86/demo-site
        environment:
          - HTTP_HOST=127.0.0.1
          - HTTP_PORT=12000
      # caching reverse proxy
      pod.component.proxy: |
        image: nginx:1.13.10
      # copy the config file for the proxy
      pod.copy.proxy: >
        /var/conf/nginx.conf:/etc/nginx/conf.d/default.conf
    configs:
      - source: nginx-conf
        target: /var/conf/nginx.conf
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    ports:
      - 8080:80

configs:
  nginx-conf:
    file: ./nginx.conf
    # the actual configuration proxies requests from port 80 to 12000 on localhost

Or as a simple container for testing:

$ docker run --rm -it --name podtest                      \
    -v /var/run/docker.sock:/var/run/docker.sock:ro       \
    -v $PWD/nginx.conf:/etc/nginx/conf.d/default.conf:ro  \
    --label pod.component.app='
image: rycus86/demo-site
environment:
  - HTTP_HOST=127.0.0.1
  - HTTP_PORT=12000'                                      \
    --label pod.component.proxy='
image: nginx:1.13.10'                                     \
    -p 8080:80                                            \
    rycus86/podlike -logs

See the examples folder for more small example stacks.

The properties of each component are the same ones a Compose project would accept, minus the unsupported ones (see below). This should make it easy to convert a Compose file into the configuration this app needs as a pod.component. label.

To make this more convenient, you can specify a Compose file to configure the components from, using the pod.compose.file label, which needs to point to a file inside the controller container. This will ignore any properties the app doesn't support, like ports, networking configuration, etc. (see below). This means, if you have a working Compose project, you're likely to be able to use it to feed the app, even without dropping the unsupported properties. You may still want to change things to work better as a group though.
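As a rough sketch of the pod.compose.file approach described above, the stack below hands an existing Compose file to the controller as a Swarm config and points the label at it. Only the label name comes from this document; the target path, the config name and the local file name are illustrative assumptions, and delivering the file through a Swarm config is just one possible way to get it into the controller container.

```yaml
version: '3.5'
services:

  pod:
    image: rycus86/podlike
    command: -logs
    labels:
      # path of the Compose file inside the controller container
      # (this path and the config name below are assumptions for illustration)
      pod.compose.file: /var/conf/components.yml
    configs:
      - source: components
        target: /var/conf/components.yml
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro

configs:
  components:
    # an existing Compose project file; unsupported properties should be ignored
    file: ./docker-compose.yml
```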

Templates

To help reduce duplication in the stack YAML files, and to be able to share "pod" configuration between stacks, you can use templates. These rely on extension fields in the stack's Compose file to set up the controller and the components in a more convenient way.

version: '3.5'

x-podlike-templates:
  - &component-template
    inline:
      main:
        labels:
          place: component
          svc.name: 'svc_{{ .Service.Name }}'

  - &sidecar-template
    inline:
      sidecar:
        image: sample/sidecar
        command: --port 8080

services:

  inline:
    image: sample/inline
    command: -exec
    x-podlike:
      pod:
        inline: |
          controller:
            image: rycus86/podlike:test
            command: -logs -pids
            labels:
              place: controller
      transformer:
        <<: *component-template
      templates:
        - <<: *sidecar-template

Have a look at the documentation to see what you can do with templates, then check out the podtemplate wrapper script that helps you automate the generation of templated stacks - or even deploy them in a single step.

HTTPS templates

A note on HTTPS: if you're using it to fetch the templates from a server, or maybe from GitHub, chances are you'd need the remote call to verify the certificates in the response. The Docker image is based on scratch and only includes the application binary, so the verification would likely fail. To fix this, you can extend the image to include the necessary certificates, or disable the verification, though that is insecure. You can add the certificates like this, for example:

FROM debian as builder

RUN apt-get update && apt-get install -y ca-certificates

# best to use a specific version though
FROM rycus86/podlike

COPY --from=builder /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt

If you have private certificates, you can potentially add them to the builder image, run the certificate update, then copy the necessary files into the target image in a similar way.

Volumes

To share the controller's volumes with the components, the ones Swarm attached to the task, you have two options:

  1. Define the volume on the component as well with the same name
  2. Share all of the controller's volumes with all the components (less secure)

Note: Option 2 will likely include the Docker engine socket as well, so the components will be able to use it any way they want!

On versions 0.0.x and 0.1.x, volume sharing was on by default; starting from 0.2.0, it is off. If you use latest and need the controller's volumes attached to the components, either define the volumes for the components that need them, or enable volume sharing with the -volumes=true command line flag.
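A minimal sketch of option 2, using the -volumes=true flag mentioned above. The component name, the volume name and the paths are made up for illustration, and the sketch assumes that shared volumes show up in the components at the same mount path as in the controller, which this document does not confirm.

```yaml
version: '3.5'
services:

  pod:
    image: rycus86/podlike
    # share all of the controller's volumes with every component
    # (this likely includes the Docker engine socket as well)
    command: -logs -volumes=true
    labels:
      pod.component.reader: |
        image: alpine
        # assumes the shared volume appears at the controller's mount path
        command: tail -F /var/shared/log.out
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - shared-data:/var/shared

volumes:
  shared-data:
```

Option 1 avoids handing the engine socket to the components, so it is probably the better default when only some components need a volume.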

The controller should be able to attach the requested volumes (for option 1 above) automatically, as long as you use the same name. There is a corner case for both Swarm and Compose though: if the volume has been given an explicit name, the container information will only show that name. You'll probably still want to use its reference, as shown in the example below:

version: '3.5'
services:

  pod:
    image: rycus86/podlike
    command: -logs # -volumes=false is optional and the default
    labels:
      pod.component.logger: |
        image: alpine
        command: tail -F /var/shared/log.out
        volumes:
          - swarm-volume:/var/shared
      pod.component.writer: |
        image: alpine
        command: >
          sh -c 'while [ true ]; do
          sleep 1 && echo "tick" >> /data/logs/log.out ;
          done'
        volumes:
          - swarm-volume:/data/logs
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - swarm-volume:/unused

volumes:
  swarm-volume:
    name: example-vol
    labels:
      com.github.rycus86.podlike.volume-ref: swarm-volume

The reference to use here is swarm-volume, and it is used consistently for the service definition and for the component definitions as well. Docker will actually store this volume as example-vol, so we need to tell the controller how to find it. This is what the com.github.rycus86.podlike.volume-ref volume label is for. As a side-note, you can choose to use a different volume-ref for the components and use the original reference for the service, if you're not into consistency much.

Also note that you don't necessarily have to share the volume with the controller. If you just use the same volume name on the components, Docker will create a single volume, and each of them will be able to use it. If you want it managed by Swarm though, maybe to be able to use templates, like name: 'volume-{{.Task.ID}}', then you also need to attach it to the controller, and set up the reference label for it.
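Putting the last point together, a Swarm-managed volume with a templated name might look roughly like the sketch below. It only combines the name template and the volume-ref label described above; the task-volume reference and the writer component are illustrative names, and the setup has not been verified against a running Swarm.

```yaml
version: '3.5'
services:

  pod:
    image: rycus86/podlike
    command: -logs
    labels:
      pod.component.writer: |
        image: alpine
        command: >
          sh -c 'while true; do
          sleep 1 && echo "tick" >> /data/logs/log.out ;
          done'
        volumes:
          # the component uses the reference, not the templated name
          - task-volume:/data/logs
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      # attached to the controller so that Swarm creates and manages it
      - task-volume:/unused

volumes:
  task-volume:
    name: 'volume-{{.Task.ID}}'
    labels:
      com.github.rycus86.podlike.volume-ref: task-volume
```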

Dragons!

This project is very much work in progress (see below). Even with all the tasks done, this will never enable full first-class support for pods on Docker Swarm the way Kubernetes does. Still, it might be useful for small projects or specific deployments.

I'm not yet sure how the components' containers will interfere with Swarm scheduling, resource allocation, etc. Memory limits are honored, but the components are limited to the controller's limits at most. Memory reservation is allowed on the components if you really want to, but comes with a warning. If you set the reservation on the controller, the cgroup should take note of this for you for all the containers.

I also haven't done extensive testing on other resource constraints, in terms of how they behave when running as part of a shared cgroup. For example, CPU and I/O (blkio) limits, ulimits, etc. Not sure yet how these settings would affect things overall, and the app doesn't necessarily try to validate them for you, so at this point, you'll have to try and see for yourself. But do let me know how it goes, please!

Some Swarm features are also hacked around. For example, configs and secrets can be available to the controller container, but I haven't found an easy way to share those with the component containers. These files can be copied at component startup, by adding a pod.copy.<name>=/source/file/in/controller:/dest/file/in/component label on the controller (see the examples on how to define this in YAML). Be aware that this means they will be copied again on every startup or restart. Swarm service labels are also not available on containers, and the controller doesn't assume it's running on a Swarm manager node, so we need to use container labels here, which is a bit of a shame.

Component reaping is done on a best-effort basis; killing the controller could leave you with zombie containers. With the components placed within the controller's cgroup, plus with PID sharing enabled, this is probably somewhat mitigated, but you could still potentially end up with containers using memory and CPU after the controller dies. The components are also started with auto-remove, so getting information about them post-mortem might prove difficult.

Work in progress

Some of the open tasks are:

Unsupported properties

Any other properties from the v2 Compose file should be supported and work as expected.

Command line usage

The application supports these command line flags, which you can pass to the container or the service, using the command property if you're deploying from a stack YAML.

Usage of /podlike:
  -ipc
        Enable (default) or disable IPC sharing (default true)
  -logs
        Stream logs from the components
  -pids
        Enable (default) or disable PID sharing (default true)
  -pull
        Always pull the images for the components when starting
  -volumes
        Enable volume sharing from the controller

Alternatively, the healthcheck argument starts a one-off run that returns the current health status of the app running in the same container. Check the Dockerfile and the healthcheck/client.go source code to see how this works.

version is also a supported argument; it prints the version and build information of the Docker image built on Travis.
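As a small illustration of passing the flags listed above through the command property of a stack YAML, the sketch below streams logs, always pulls images, and turns off PID and IPC sharing. It assumes the boolean flags accept the -name=false form, in the same way the -volumes=true form is used earlier in this document; the single component is only a placeholder.

```yaml
version: '3.5'
services:

  pod:
    image: rycus86/podlike
    # -name=false for boolean flags is an assumption based on -volumes=true
    command: -logs -pull -pids=false -ipc=false
    labels:
      pod.component.app: |
        image: rycus86/demo-site
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
```

Keep in mind that with PID sharing disabled, sending UNIX signals between the components is no longer possible.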

License

MIT