mohsenasm / swarm-dashboard

A Simple Monitoring Dashboard for Docker Swarm Cluster
MIT License

Need a proper ARM build #30

Closed trajano closed 1 year ago

trajano commented 1 year ago

I tried to deploy the build on a Raspberry Pi

WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
exec /sbin/tini: exec format error

I suspect it's because you're supporting LetsEncrypt and the binaries you are using to run the cron are not platform-neutral.

trajano commented 1 year ago

Here's a snippet from my old project that builds multiarch

https://github.com/trajano/spring-cloud-demo/blob/78ad12e7db4198fbedd94d2b2a4e87f5ae5ff187/.github/workflows/publish.yml#L63-L104

You'd likely need that in the end but there's still the first problem of ensuring you have a multi-arch base for tini. This one shows a way of doing it https://github.com/instructure/dockerfiles/blob/master/tini/v0.19.0/Dockerfile

mohsenasm commented 1 year ago

We are using apk to install tini, and it apparently supports ARM:
https://pkgs.alpinelinux.org/packages?name=tini&branch=edge&repo=&arch=&maintainer=
https://pkgs.alpinelinux.org/packages?name=lego&branch=edge&repo=&arch=&maintainer=

We should probably just change the docker push workflow and Lego installation.

trajano commented 1 year ago

If we follow the same pattern as that tini Dockerfile, I guess we can simply do an if condition.
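
For example, a rough sketch of that idea using buildx's TARGETARCH build arg (the download URLs are only illustrative, and in the end tini comes from apk anyway):

FROM alpine:3.18 AS tini
ARG TARGETARCH
# pick the matching static tini binary for the target architecture
RUN if [ "$TARGETARCH" = "arm64" ]; then \
      wget -O /tini https://github.com/krallin/tini/releases/download/v0.19.0/tini-static-arm64; \
    else \
      wget -O /tini https://github.com/krallin/tini/releases/download/v0.19.0/tini-static-amd64; \
    fi && chmod +x /tini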

mohsenasm commented 1 year ago

I have added support for linux/arm64/v8, but I don't have a Raspberry Pi to test it. Docker image: mohsenasm/swarm-dashboard:dev_multiarch


I also tried adding support for linux/arm/v7 and linux/arm/v6, but yarn install didn't finish even after two hours. It also didn't produce any logs, even with the verbose flag. The problem seems to be because of QEMU Emulation (https://github.com/nodejs/docker-node/issues/1335) or attempting to build the modules from the source (https://stackoverflow.com/questions/24961623).

trajano commented 1 year ago

Tried it; it's on a restart loop. Error 143.

mohsenasm commented 1 year ago

This can be a healthcheck issue. I temporarily disabled the healthcheck to see if that is the case.

Is there any related log in the docker service logs or docker container inspect <id_of_the_exited_container>?
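
For example (service and container names are placeholders; the state fields come from docker inspect):

docker service ps <service_name> --no-trunc
docker inspect <id_of_the_exited_container> --format '{{.State.ExitCode}} {{.State.OOMKilled}} {{.State.Error}}'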

trajano commented 1 year ago

I did --no-healthcheck; it just stops after a while.



trajano commented 1 year ago

trajano@nas:~/trajano.net$ docker service logs -f trajano_vis
trajano_vis.1.v2qwk5ml6ouq@nas    | crond: crond (busybox 1.36.1) started, log level 8
trajano_vis.1.v2qwk5ml6ouq@nas    | HTTP server listening on 8080
trajano_vis.1.v2qwk5ml6ouq@nas    | crond: USER root pid  32 cmd run-parts /etc/periodic/15min
trajano_vis.1.v2qwk5ml6ouq@nas    | crond: USER root pid  33 cmd run-parts /etc/periodic/hourly
trajano_vis.1.v2qwk5ml6ouq@nas    | crond: USER root pid  34 cmd run-parts /etc/periodic/15min
trajano_vis.1.wtvt439q5urr@nas    | crond: crond (busybox 1.36.1) started, log level 8
trajano_vis.1.wtvt439q5urr@nas    | HTTP server listening on 8080

trajano commented 1 year ago

Building on the pi

 => [elm-build 2/8] RUN npm install --unsafe-perm -g elm@latest-0.18.0 --silent                                                                                                                  39.8s
 => => # /usr/local/bin/elm -> /usr/local/lib/node_modules/elm/bin/elm
 => => # /usr/local/bin/elm-make -> /usr/local/lib/node_modules/elm/bin/elm-make
 => => # /usr/local/bin/elm-package -> /usr/local/lib/node_modules/elm/bin/elm-package
 => => # /usr/local/bin/elm-reactor -> /usr/local/lib/node_modules/elm/bin/elm-reactor
 => => # /usr/local/bin/elm-repl -> /usr/local/lib/node_modules/elm/bin/elm-repl
 => => # No binaries are available for your platform: linux-arm64

mohsenasm commented 1 year ago

The result of this stage is only a client-side JS file, which is platform-independent. So we don't need to build it on ARM, and I forced the Elm build stage to use only amd64 in the Dockerfile.
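
Roughly, the stage is pinned like this (the stage name is from the build output above; the base image here is only illustrative):

# build the platform-independent client bundle on amd64 only
FROM --platform=linux/amd64 node:20-alpine AS elm-build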

If you want to build on the Raspberry Pi, you can get past this issue by copying the file /home/node/app/client/index.js from one of the Docker images (like mohsenasm/swarm-dashboard:dev_multiarch), or by downloading the file and copying it in locally.
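
For example, roughly (the temporary container name is arbitrary, and the destination path matches the layout used later in this thread):

docker create --name tmp-dashboard mohsenasm/swarm-dashboard:dev_multiarch
docker cp tmp-dashboard:/home/node/app/client/index.js ./elm-client/client/index.js
docker rm tmp-dashboard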

Another question: does the dashboard work normally before it stops?

I think we should investigate what causes error code 143 (128 + 15, i.e. the process received SIGTERM), maybe from the stopped container's inspect. According to this link, the error can also be memory-related. Maybe we can try using it without node-exporter and cadvisor and see if the issue remains.

version: "3"

services:
  swarm-dashboard:
    image: mohsenasm/swarm-dashboard:dev_multiarch
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    ports:
      - 8080:8080
    environment:
      PORT: 8080
      ENABLE_AUTHENTICATION: "false"
      ENABLE_HTTPS: "false"
      DOCKER_UPDATE_INTERVAL: 15000
    deploy:
      placement:
        constraints:
          - node.role == manager

trajano commented 1 year ago

It won't open up the port for me, so nothing is listening.

I guess Elm won't have an ARM-specific build anytime soon, so this block may be needed to build a platform-specific Elm:

# syntax=docker/dockerfile:1-labs
FROM debian:bullseye AS elm
RUN --mount=type=cache,target=/var/cache/apt \
  apt-get update && \
  apt-get install -y build-essential \
    automake \
    autotools-dev \
    make \
    g++ \
    ca-certificates \
    software-properties-common \
    apt-transport-https \
    lsb-base \
    lsb-release \
    zlib1g-dev \
    libpcre3-dev \
    libcurl4-openssl-dev \
    libc-dev \
    libxml2-dev \
    libsnmp-dev \
    libssh2-1-dev \
    libevent-dev \
    libopenipmi-dev \
    libpng-dev \
    pkg-config \
    libfontconfig1 \
    git \
    bzip2 \
    zip \
    unzip \
    musl-dev \
    ghc \
    cabal-install \
    libmpfr-dev
ADD --keep-git-dir=true https://github.com/elm/compiler.git#0.19.1 /w/compiler
WORKDIR /w/compiler
RUN rm worker/elm.cabal
RUN cabal new-update
RUN cabal new-configure
RUN cabal new-build

Unfortunately, the binary this provides is 0.19.1, which apparently does not support the commands used in 0.18.0, so that won't help.

mohsenasm commented 1 year ago

Elm is not an issue here. It only produces the index.js file, which runs in the browser, not on the server side. So we can build index.js on any platform (even locally) and copy it into the image.

Here is the file: index.js.zip


If you want to build the image on the Pi, download and unpack index.js.zip into ./elm-client/client/index.js and use this Dockerfile:

FROM node:20-alpine AS base
RUN apk add --update tini lego curl && rm -r /var/cache
ENTRYPOINT ["/sbin/tini", "--"]
WORKDIR /home/node/app

FROM base AS dependencies
ENV NODE_ENV production
COPY package.json yarn.lock ./
RUN yarn install --production

FROM base AS release
WORKDIR /home/node/app
ENV LEGO_PATH=/lego-files

COPY --from=dependencies /home/node/app/node_modules node_modules
COPY ./elm-client/client client
COPY server server
COPY server.sh server.sh
COPY healthcheck.sh healthcheck.sh
COPY crontab /var/spool/cron/crontabs/root

ENV PORT=8080
# HEALTHCHECK --interval=60s --timeout=30s \
#   CMD sh healthcheck.sh

# Run under Tini
CMD ["sh", "server.sh"]
trajano commented 1 year ago

Thanks, I got it to build and deploy.

curl -v https://trajano.net/swarm-visualizer/

~~Missing metrics though #31~~ with metrics!

mohsenasm commented 1 year ago

Is it working now? What was the problem/solution?

trajano commented 1 year ago

> Is it working now? What was the problem/solution?

From the documentation I presumed these were defaulted:

  NODE_EXPORTER_SERVICE_NAME_REGEX: "node-exporter"
  CADVISOR_SERVICE_NAME_REGEX: "cadvisor"

But it seems I had to explicitly set them in the environment.

trajano commented 1 year ago

Note this is a personal build so it's not ideal.

mohsenasm commented 1 year ago

> Note this is a personal build so it's not ideal.

So you only rebuilt the image with the Dockerfile that I posted, and it worked?

No change in the compose.yml file or anything else?


> From the documentation I presumed these were defaulted:
>
>   NODE_EXPORTER_SERVICE_NAME_REGEX: "node-exporter"
>   CADVISOR_SERVICE_NAME_REGEX: "cadvisor"

The default is to not use cadvisor and node-exporter, for backward compatibility.

trajano commented 1 year ago

Slight modification so I can test both locally and on the server, but same result:

FROM node:20-alpine AS base
RUN apk add --update tini lego curl && rm -r /var/cache
ENTRYPOINT ["/sbin/tini", "--"]
WORKDIR /home/node/app

FROM base AS dependencies
ENV NODE_ENV production
COPY package.json yarn.lock ./
RUN yarn install --production

FROM node:20-alpine AS base2
WORKDIR /w/
ADD https://github.com/mohsenasm/swarm-dashboard/files/13181678/index.js.zip /w
RUN unzip index.js.zip

FROM base AS release
WORKDIR /home/node/app
ENV LEGO_PATH=/lego-files

COPY --from=dependencies /home/node/app/node_modules node_modules
COPY --from=base2 /w/index.js client/index.js
COPY ./elm-client/client client
COPY server server
COPY server.sh server.sh
COPY healthcheck.sh healthcheck.sh
COPY crontab /var/spool/cron/crontabs/root

ENV PORT=8080
# HEALTHCHECK --interval=60s --timeout=30s \
#   CMD sh healthcheck.sh

# Run under Tini
CMD ["sh", "server.sh"]

My relevant compose file blocks (I added them to my uber compose file):

  vis:
    # image: mohsenasm/swarm-dashboard:dev_multiarch
    image: trajano/sasha
    environment:
      PATH_PREFIX: /swarm-visualizer
      NODE_EXPORTER_SERVICE_NAME_REGEX: "node-exporter"
      CADVISOR_SERVICE_NAME_REGEX: "cadvisor"
    networks:
      - traefik
      - default
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
    # logging:
    #   driver: "none"
    deploy:
      resources:
        reservations:
          memory: 128M
        limits:
          memory: 128M
      placement:
        constraints:
          - "node.role==manager"
      labels:
        - traefik.enable=true
        - traefik.http.routers.swarm-visualizer.rule=PathPrefix(`/swarm-visualizer`)
        - traefik.http.routers.swarm-visualizer.entryPoints=https
        - traefik.http.services.swarm-visualizer.loadbalancer.server.port=8080
  node-exporter:
    image: quay.io/prometheus/node-exporter:v1.6.1
    volumes:
      - '/:/host:ro'
    command:
      - '--path.rootfs=/host'
    deploy:
      mode: global
      resources:
        reservations:
          memory: 64M
        limits:
          memory: 64M

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.47.2
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker:/var/lib/docker:ro
      - /dev/disk:/dev/disk:ro
      - /dev/kmsg:/dev/kmsg:ro
    cap_add:
      - SYSLOG
    privileged: true
    deploy:
      mode: global
      resources:
        reservations:
          memory: 128M
        limits:
          memory: 128M

trajano commented 1 year ago

I get some weird phantom CPU

disk: 51% | cpu: 111% | mem: 23%

Not sure how CPU is computed so it goes past 100% :)

mohsenasm commented 1 year ago

For example, if you have two CPU cores, one at 50% and another at 61%, you will get 111% CPU usage.

trajano commented 1 year ago

So the CPUs are summed up rather than averaged to a single number?

mohsenasm commented 1 year ago

Yes, like how docker stats works.

trajano commented 1 year ago

Well, back to the topic: I'm still not sure how to get a proper ARM build of the image. I don't think creating our own image and deploying it makes sense :D

mohsenasm commented 1 year ago

Exactly! I used the platforms feature of the docker/build-push-action GitHub workflow to build and push the image. It's a pretty common way to do it!
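
For reference, the relevant workflow steps look roughly like this (action versions and tags here are illustrative, not copied from the repo's actual workflow):

      - uses: docker/setup-qemu-action@v3
      - uses: docker/setup-buildx-action@v3
      - uses: docker/build-push-action@v5
        with:
          platforms: linux/amd64,linux/arm64/v8
          push: true
          tags: mohsenasm/swarm-dashboard:dev_multiarch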

trajano commented 1 year ago

Just redownloaded it; it works now. I think I was still using the one with the healthcheck when I last deployed.

    "Id": "sha256:a0fd0c08b1f8da3bbeb9887a5760596067a3983a302c44821f5f47c2169e1cfa",