FORTH-ICS-INSPIRE / artemis

ARTEMIS: Real-Time Detection and Automatic Mitigation for BGP Prefix Hijacking. This is the main ARTEMIS repository that composes artemis-frontend, artemis-backend, artemis-monitor and other needed containers.
BSD 3-Clause "New" or "Revised" License

Running on k8s with tight security results in configuration container crash #670

Closed QynKpGJdh closed 1 year ago

QynKpGJdh commented 1 year ago

Describe the bug I have been able to deploy artemis on a k8s cluster that does not allow root containers at all. This was done by going through each container's Dockerfile and setting a non-root user to run as. Some of the templates also had to be modified so that the applications start correctly. However, I have stumbled across a problem: when I paste an existing configuration into Admin -> System -> Current Configuration and press Save, the configuration container crashes with the following error.

./entrypoint: line 19:   155 Killed                  /usr/local/bin/python -c "import configuration; configuration.main()"

Affected Component(s)

To Reproduce Steps to reproduce the behavior:

  1. Modify all Dockerfiles to run as a specific non-root user.

    # create an unprivileged group and user (BusyBox addgroup/adduser syntax)
    RUN addgroup -g 15221 appuser
    RUN adduser -S -u 15221 -G appuser appuser
    # run the container as the new non-root user
    USER appuser:appuser
  2. Make the necessary security changes in the template YAML files

    ...
    spec:
      template:
        spec:
          securityContext:
            runAsUser: 15221
            runAsGroup: 15221
            fsGroup: 15221
          containers:
          - name: graphql
            image: {{ .image }}
            securityContext:
              allowPrivilegeEscalation: false
              capabilities:
                drop:
                  - ALL
              runAsNonRoot: true
              seccompProfile:
                type: RuntimeDefault
  3. Install the helm chart and wait for the pods to be in running state

    helm install artemis -f values.yaml -f example-values.yaml .
  4. Log in to the Artemis frontend with the default credentials. Go to Admin -> System -> Current Configuration, paste any configuration (for instance the default one), press Save, and watch nothing happen. Also, pressing any of the buttons (BGPStream Historical Monitor, BGPStream Kafka Monitor, BGPStream Live Monitor, ExaBGP Monitor, Mitigation or RIPE RIS Monitor) should turn them green, but nothing happens.

  5. View logs from container configuration-xxx-yyy

    ./entrypoint: line 19:   155 Killed                  /usr/local/bin/python -c "import configuration; configuration.main()"
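The entrypoint log only shows that the process was killed; the actual termination reason can be read from the pod status, for example (pod name is the placeholder from above, the namespace is an assumption):

    kubectl describe pod configuration-xxx-yyy -n artemis | grep -A 5 "Last State"
    # or query the reason directly
    kubectl get pod configuration-xxx-yyy -n artemis \
      -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'

In my case this reports OOMKilled (see the follow-up comment below).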

Expected behavior A confirmation message with a green background saying "Configuration file updated" should appear when a new configuration is saved, and button presses should respond by turning the buttons green.

System (please complete the following information):

    Client Version: version.Info{Major:"1", Minor:"25", GitVersion:"v1.25.4", GitCommit:"872a965c6c6526caa949f0c6ac028ef7aff3fb78", GitTreeState:"clean", BuildDate:"2022-11-09T13:36:36Z", GoVersion:"go1.19.3", Compiler:"gc", Platform:"darwin/amd64"}
    Kustomize Version: v4.5.7
    Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.0", GitCommit:"b46a3f887ca979b1a5d14fd39cb1af43e7e5d12d", GitTreeState:"clean", BuildDate:"2022-12-08T19:51:45Z", GoVersion:"go1.19.4", Compiler:"gc", Platform:"linux/amd64"}

QynKpGJdh commented 1 year ago

Upon closer inspection, I noticed that the configuration container was OOMKilled when the configuration was applied. I increased the resources for the pod and it stopped crashing. However, the postgres container kept crashing with FATAL: sorry, too many clients already. I mounted a modified postgresql.conf at /var/lib/postgresql/data/pgdata/ with max_connections set to 50, and now the deployment seems to function correctly.
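For reference, a rough sketch of the two changes (the memory values and the container name below are assumptions, not the exact values I settled on; tune them to your cluster):

    # hypothetical resources block for the configuration container in the helm template
    containers:
    - name: configuration
      resources:
        requests:
          memory: "256Mi"
        limits:
          memory: "1Gi"

and the override mounted over the default postgres configuration:

    # postgresql.conf mounted at /var/lib/postgresql/data/pgdata/
    max_connections = 50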

Will do some further resilience testing with the deployment.

QynKpGJdh commented 1 year ago

The deployment is stable with the above modifications, so I am closing this.