Simplify deployments/upgrades

alxndrsn commented 1 year ago

Ideas etc. for simplifying the deployment/upgrade process:

- cut git out of the install?
- pre-building all images?
  - this is what we already do for the SMTP server image, although build process is currently manual
  - make builds reproducible
  - make installs/upgrades faster(?)
  - less disk space required on host machines(?)
  - use github container registry to serve images?
  - may have to allow more configuration to be done via env var, e.g.
    - changing database
    - disabling sentry
    - different SSL termination options(?)
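
To make the env var idea in the list above a bit more concrete, here is a minimal sketch of a startup script falling back to defaults when a variable is unset. The names `DB_HOST`, `DB_NAME` and `SENTRY_ENABLED`, and their defaults, are hypothetical and not existing Central settings.

```shell
#!/bin/bash
# Sketch only: DB_HOST, DB_NAME and SENTRY_ENABLED are hypothetical names,
# not settings that Central currently reads.

# Print a small JSON config fragment built from env vars, with defaults.
render_config() {
  db_host="${DB_HOST:-central-db}"
  db_name="${DB_NAME:-odk}"
  sentry="${SENTRY_ENABLED:-true}"
  printf '{"database":{"host":"%s","database":"%s"},"sentry":{"enabled":%s}}\n' \
    "$db_host" "$db_name" "$sentry"
}

render_config
```

Running it with, say, `DB_HOST=db.example.org SENTRY_ENABLED=false` set would change only those two values and leave the database name at its default.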

matthew-white commented 1 year ago

> pre-building all images?

See also #165 and #249.

> may have to allow more configuration to be done via env var, e.g.

We don't document this, but it's currently possible to configure Frontend by modifying src/config.js (you can see that on the QA server, for example).

matthew-white commented 1 year ago

> may have to allow more configuration to be done via env var, e.g.
>
> - changing database

Related: #389

spwoodcock commented 11 months ago

Currently I build a custom image for the central-backend. I have to duplicate the Dockerfile in my repo, as using a setup like this won't work with submodules:

services:
  central:
    build: https://raw.githubusercontent.com/getodk/central/master/service.dockerfile
    ...

The COPY directives will all fail due to the submodule files not being present.

I would propose a solution like this:

ARG node_version=18

FROM docker.io/bitnami/git:2 AS repo
ARG ODK_CENTRAL_TAG
RUN git clone --depth 1 --branch ${ODK_CENTRAL_TAG} \
    "https://github.com/getodk/central.git" \
    && cd central && git submodule update --init

FROM docker.io/node:${node_version}-slim
WORKDIR /usr/odk
COPY --from=repo central/files/service/crontab /etc/cron.d/odk
COPY --from=repo central/files/service/scripts/ ./
...

This would address the first bullet point above, 'cut git out of the install': git would only be used inside the first stage of a multi-stage build, to clone the repo and init the submodules.
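
For illustration, the `ODK_CENTRAL_TAG` build argument in the Dockerfile above would be supplied at build time roughly like this. It is a sketch that only prints the command (a dry run); the image name `central-backend-custom` is a hypothetical placeholder.

```shell
#!/bin/bash
# Sketch: compose the docker build command for a given Central git tag or branch.
# Dry run only: the command is printed, not executed.
# "central-backend-custom" is a hypothetical image name.
build_cmd() {
  tag="${1:?usage: build_cmd <central-git-tag> [image-name]}"
  image="${2:-central-backend-custom}"
  printf 'docker build --build-arg ODK_CENTRAL_TAG=%s -t %s:%s .\n' \
    "$tag" "$image" "$tag"
}

# Falls back to the master branch when ODK_CENTRAL_TAG is not set.
build_cmd "${ODK_CENTRAL_TAG:-master}"
```

Piping the printed line to `sh` from the directory containing the Dockerfile would run the actual build.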

What are your thoughts?

alxndrsn commented 11 months ago

> Currently I build a custom image for the central-backend. I have to duplicate the Dockerfile in my repo, as using a setup like this won't work with submodules:

Hi @spwoodcock, it would be great to understand better what you're changing in your custom docker image. Can you share a bit more context?

spwoodcock commented 11 months ago

Sure thing, the Dockerfile is:

ARG node_version=18

FROM docker.io/bitnami/git:2 AS repo
ARG ODK_CENTRAL_TAG
RUN git clone --depth 1 --branch ${ODK_CENTRAL_TAG} \
    "https://github.com/getodk/central.git" \
    && cd central && git submodule update --init

FROM docker.io/node:${node_version}-slim

WORKDIR /usr/odk

COPY --from=repo central/files/service/crontab /etc/cron.d/odk
COPY --from=repo central/files/service/scripts/ ./
COPY --from=repo central/files/service/config.json.template /usr/share/odk/
COPY --from=repo central/files/service/odk-cmd /usr/bin/
# Add entrypoint script to init user
COPY init-user-and-start.sh /
# package.json must be added and installed prior to final COPY
COPY --from=repo central/server/package*.json ./

# Install system deps
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        curl \
        gpg \
        cron \
        wait-for-it \
        gettext \
        procps \
        postgresql-client \
        netcat-traditional \
    && rm -rf /var/lib/apt/lists/* \
    # Install node_modules
    && npm clean-install --omit=dev --legacy-peer-deps --no-audit \
        --fund=false --update-notifier=false \
    # Required to start via entrypoint
    && mkdir /etc/secrets sentry-versions \
    && echo 'jhs9udhy987gyds98gfyds98f' > /etc/secrets/enketo-api-key \
    && echo '1' > sentry-versions/server \
    && echo '1' > sentry-versions/central \
    && echo '1' > sentry-versions/client \
    # Set entrypoint executable
    && chmod +x /init-user-and-start.sh

# Add remaining files after deps installed
COPY --from=repo central/server/ ./

ENTRYPOINT ["/init-user-and-start.sh"]
EXPOSE 8383

# Add Healthcheck
HEALTHCHECK --start-period=10s --interval=5s --retries=10 \
    CMD nc -z localhost 8383 || exit 1

The main things I do are:

- Init the submodules in the first stage of the Dockerfile.
- Combine RUN directives for a smaller, more efficient build.
- I don't use Enketo or Sentry, so I just add dummy secrets so the server can start up.
- Write my own entrypoint that runs the original start script (minus its final pm2 lines), creates an admin user, and then starts the server.
- Add a HEALTHCHECK directive.

The entrypoint I use is:

#!/bin/bash

set -eo pipefail

# Wait for database to be available
wait-for-it "${CENTRAL_DB_HOST:-central-db}:5432"

### Init, generate config, migrate db ###
echo "Stripping pm2 exec command from start-odk.sh script (last 2 lines)"
head -n -2 ./start-odk.sh > ./init-odk-db.sh
chmod +x ./init-odk-db.sh

echo "Running ODKCentral start script to init environment and migrate DB"
echo "The server will not start on this run"
./init-odk-db.sh

### Create admin user ###
echo "Creating test user ${SYSADMIN_EMAIL} with password ***${SYSADMIN_PASSWD: -3}"
echo "${SYSADMIN_PASSWD}" | odk-cmd --email "${SYSADMIN_EMAIL}" user-create || true

echo "Elevating user to admin"
odk-cmd --email "${SYSADMIN_EMAIL}" user-promote || true

### Run server (hardcode WORKER_COUNT=1 for dev) ###
export WORKER_COUNT=1
echo "Starting server."
exec npx pm2-runtime ./pm2.config.js
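
The `head -n -2` step in the entrypoint above relies on GNU head, where a negative count means "everything except the last N lines". A small sketch of that behaviour, using a made-up four-line stand-in file rather than the real start-odk.sh:

```shell
#!/bin/bash
# Sketch: strip the last two lines of a script with GNU head.
# The file contents are a made-up stand-in, not the real start-odk.sh.
strip_last_two() {
  # GNU coreutils only: BSD/macOS head rejects negative counts.
  head -n -2 "$1"
}

tmp="$(mktemp)"
printf '%s\n' \
  'echo "generate config"' \
  'echo "migrate db"' \
  'echo "starting server"' \
  'exec pm2-runtime ./pm2.config.js' > "$tmp"

stripped="$(strip_last_two "$tmp")"
echo "$stripped"
rm -f "$tmp"
```

This depends on the server start staying in exactly the last two lines of the upstream script, so it would need rechecking on each upgrade.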

lognaturel commented 9 months ago

We'd also like to remove pm2 and let the orchestration layer take care of managing service processes. I think the main reason to use pm2 originally was that Docker Compose didn't provide any functionality for horizontal scaling at the time. Now we could use replicas with a dynamic replica count, probably set from an env variable.

spwoodcock commented 9 months ago

Excellent idea!

I thought deploy.replicas was a Docker Swarm param, but it looks like Compose V2 added support 😄

Example config:

services:
  central:
    deploy:
      replicas: ${BACKEND_REPLICAS:-2}
      resources:
        limits:
          cpus: "0.9"
          memory: 1500M
        reservations:
          cpus: "0.1"
          memory: 100M
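
The `${BACKEND_REPLICAS:-2}` default in the config above follows the same rule as the shell's own `${VAR:-default}` expansion: the default applies when the variable is unset or empty. A quick sketch of that behaviour in plain shell:

```shell
#!/bin/bash
# Sketch: the shell's ${VAR:-default} rule, which Compose interpolation mirrors.
replicas() {
  echo "${BACKEND_REPLICAS:-2}"
}

unset BACKEND_REPLICAS
replicas   # unset: falls back to 2

BACKEND_REPLICAS=""
replicas   # empty: falls back to 2

BACKEND_REPLICAS=4
replicas   # set: prints 4
```

So something like `BACKEND_REPLICAS=4 docker compose up -d` should start four backend containers, and leaving the variable out should give two.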

getodk / central

Simplify deployments/upgrades #347