Exa-Networks / exabgp

The BGP swiss army knife of networking

Docker container persistent config/announcements #1095

Closed SanderDelden closed 2 years ago

SanderDelden commented 2 years ago

I'm using Quagga as a BGP instance for use with Wanguard. Wanguard announces routes to ExaBGP using the pipe, which all works fine. However, when restarting the ExaBGP container the previously inserted announcements are no longer presented, which can lead to some issues for us.

The configuration looks as follows:

Dockerfile:

FROM python:3.9.12-slim-buster

ARG EXABGP_VERSION

# update and install packages
RUN apt update && \
    apt install -y \
        dumb-init && \
    rm -rf /var/lib/apt/lists/* /var/cache/apt/*

# add exabgp
ADD https://github.com/Exa-Networks/exabgp/archive/refs/tags/${EXABGP_VERSION}.tar.gz /tmp
RUN mkdir /etc/exabgp && \
    tar -xzf /tmp/${EXABGP_VERSION}.tar.gz --strip-components=1 -C /etc/exabgp && \
    rm /tmp/${EXABGP_VERSION}.tar.gz

# copy healthcheck & entrypoint script
COPY $PWD/docker/exabgp/docker-healthcheck.sh $PWD/docker/exabgp/docker-entrypoint.sh /

ENTRYPOINT [ "dumb-init", "--" ]
CMD [ "./docker-entrypoint.sh" ]

Entrypoint:

#!/bin/bash

set -euo pipefail

PIPE_PATH=/run/exabgp/
IN_PIPE=exabgp.in
OUT_PIPE=exabgp.out

# check if exabgp pipe exists, if not add it
if [ ! -d "$PIPE_PATH" ]; then
  mkdir -p "$PIPE_PATH"
  chown nobody "$PIPE_PATH"
fi
if [ ! -p "${PIPE_PATH}${IN_PIPE}" ]; then
  mkfifo "${PIPE_PATH}${IN_PIPE}"
  chown nobody "${PIPE_PATH}${IN_PIPE}"
  chmod 600 "${PIPE_PATH}${IN_PIPE}"
fi
if [ ! -p "${PIPE_PATH}${OUT_PIPE}" ]; then
  mkfifo "${PIPE_PATH}${OUT_PIPE}"
  chown nobody "${PIPE_PATH}${OUT_PIPE}"
  chmod 600 "${PIPE_PATH}${OUT_PIPE}"
fi

# start exabgp
/etc/exabgp/sbin/exabgp /var/lib/exabgp/exabgp.conf

ExaBGP config file:

# default template

template {
    neighbor neighbor_group {
        router-id 172.20.0.20;
        local-address 172.20.0.20;
        local-as 12345;
        peer-as 12345;
        hold-time 180;
    }
}

# mock core routers

neighbor 172.20.0.21 {
    inherit neighbor_group;
    description "quagga-router1-mock";
}
neighbor 172.20.0.22 {
    inherit neighbor_group;
    description "quagga-router2-mock";
}

Compose file:

  exabgp:
    image: ${COMPOSE_PROJECT_NAME}/${EXABGP_DOCKER_IMAGE_NAME}:${EXABGP_VERSION}
    restart: unless-stopped
    networks:
      wanguard_network:
        ipv4_address: ${EXABGP_IP_ADDRESS}
    environment:
      EXABGP_VERSION: ${EXABGP_VERSION}
      TZ: Europe/Amsterdam
    healthcheck:
      test: [CMD, bash, /docker-healthcheck.sh]
      interval: 5s
      timeout: 10s
      retries: 10
      start_period: 10s
    volumes:
    - $PWD/exabgp/exabgp.conf:/var/lib/exabgp/exabgp.conf:ro
    - exabgp_data:/etc/exabgp
    - exabgp_shared:/run/exabgp

volumes:
  exabgp_data:
  exabgp_shared:

Any help on this would be greatly appreciated!

thomas-mangin commented 2 years ago

Good timing: I have been working on the RIB code recently (the last two weeks). I wanted to change it, so I added some tests beforehand and found issues. I believe your issue may already have been resolved on master.

The fix will probably not be backported, as I try not to change behaviour in backports (even buggy behaviour).

A few things have changed with the API between 4.2 and master, but I do not believe it should affect flowspec and/or RTBH use of ExaBGP, therefore wanguard should be fine.

And as I have your attention: the pipe (which will be changed to a UNIX socket at some point) is an internal feature of ExaBGP, used by the CLI, and not part of the supported API. Users should really write their own helper programs (like the ones in the etc folder of the repo) and not use the pipe, unless they create their own, possibly using ExaBGP's own internal code.
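As a rough illustration of the helper-program approach: ExaBGP starts programs declared in its configuration and reads text API commands from their standard output, one command per line. The sketch below is a minimal example under that assumption. The function names `format_announce` and `send_commands` and the example prefixes are hypothetical, not taken from the ExaBGP repository.

```python
import sys
from typing import Iterable, TextIO


def format_announce(prefix: str, next_hop: str = "self") -> str:
    """Build a text API command announcing one prefix."""
    return f"announce route {prefix} next-hop {next_hop}"


def send_commands(commands: Iterable[str], out: TextIO = sys.stdout) -> int:
    """Write one command per line, flushing so the reader sees it at once.

    Returns the number of commands written.
    """
    count = 0
    for command in commands:
        out.write(command + "\n")
        out.flush()  # stdout to a pipe is block-buffered unless flushed
        count += 1
    return count


def main() -> None:
    # Hypothetical prefixes (documentation ranges), for illustration only.
    prefixes = ["192.0.2.1/32", "198.51.100.0/24"]
    send_commands(format_announce(prefix) for prefix in prefixes)
```

A real helper would call `main()` and then keep running (for example, waiting for requests from Wanguard) rather than exiting after the first batch of commands.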

There are only a few things I want to do before releasing 5.x, as some of the features/changes would make the code incompatible and I want a new stable branch before doing more intrusive changes.

SanderDelden commented 2 years ago

Hi Thomas,

Thanks for the quick reply, much appreciated!

I've gone ahead and tested this with the code from master, but unfortunately the issue is still present. If any additional information such as debug logging is required, please let me know. Regarding your comment about the pipe: thanks for the info, I will look into a better way of handling this.

Kind regards, Sander

thomas-mangin commented 2 years ago

Hi Sander,

Having a bit more time today and re-reading your wording "when restarting the ExaBGP container the previously inserted announcements are no longer presented", I realise that this is expected behaviour.

ExaBGP, like all other routing daemons, does not store received routes (from peers, but in our case the "API" too) in anything but memory. So on reboot, all previous announcements will be lost. This is the same for the API.

We do not have a CLI configuration, so commands added via the API do not affect the boot configuration, and when you restart the container they are lost. Until then, they are in the ADJ-RIB out for the peers and are therefore re-sent if a peer disconnects and reconnects.

We also do not withdraw routes when the connection with a program is lost, as we are really an SDN engine, not a generic BGP router. In your case the "API" is the program handling incoming commands. Technically this "process" does not "die", so even if we did remove routes on failure (which we do not), we would not do so in that case anyway.

The solution would be to keep track of what was sent in your code (for example, saving a file per route) and to replay those routes when the container starts. It also means that you will need to remove the saved route whenever you perform a route withdrawal.
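A minimal sketch of that file-per-route idea, assuming routes are handled as text API strings such as `route 192.0.2.1/32 next-hop self`. The class name `RouteStore`, its methods, and the `.route` file suffix are hypothetical. The storage directory would need to live on a Docker volume so that it outlives the container.

```python
import hashlib
from pathlib import Path
from typing import List


class RouteStore:
    """Keeps one file per announced route so announcements survive a restart."""

    def __init__(self, directory: str) -> None:
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, route: str) -> Path:
        # Hash the route text because prefixes contain "/", which is not
        # safe inside a file name.
        digest = hashlib.sha256(route.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.route"

    def announce(self, route: str) -> str:
        """Record the route on disk and return the command to send."""
        self._path(route).write_text(route + "\n", encoding="utf-8")
        return f"announce {route}"

    def withdraw(self, route: str) -> str:
        """Forget the route and return the matching withdraw command."""
        self._path(route).unlink(missing_ok=True)  # requires Python 3.8+
        return f"withdraw {route}"

    def replay(self) -> List[str]:
        """Return announce commands for every route still on disk."""
        commands = []
        for path in sorted(self.directory.glob("*.route")):
            route = path.read_text(encoding="utf-8").strip()
            if route:
                commands.append(f"announce {route}")
        return commands
```

On container start, the code that talks to ExaBGP would send each string returned by `replay()` before accepting new requests. Files are replayed in file-name order, not in the order the routes were originally announced.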

There is no mechanism in ExaBGP at the moment to dump and load the ADJ-RIB, sorry.

SanderDelden commented 2 years ago

Hi Thomas,

That clears things up, thank you very much for the explanation. I will indeed look into keeping track of the announcements so they can be added again upon a (re)start of the container.

Kind regards, Sander