snipe / snipe-it

A free open source IT asset/license management system
https://snipeitapp.com
GNU Affero General Public License v3.0

Problem with "supervisor: couldn't setuid to 1000: Could not set groups of effective user" when run on Openshift #14634

Open Veinar opened 2 months ago

Veinar commented 2 months ago

Debug mode

Describe the bug

Hello,

I'm trying to deploy Snipe-IT on OKD. I don't really know what's wrong, but the container looks like it wants to run as root, so that is how I'm running it; when it does not run as root, it fails with a permissions error while creating directories. The startup script also performs some operations I don't understand, in which supervisor sets groups, and that leads to an error when spawning child processes. Output logs from the container:

Module ssl disabled.
To activate the new configuration, you need to run:
  service apache2 restart
Changing upload limit to 100
Nothing to migrate.
Configuration cache cleared!
Configuration cache cleared!
Configuration cached successfully!
2024-04-19 08:42:37,416 CRIT Supervisor is running as root.  Privileges were not dropped because no user is specified in the config file.  If you intend to run as root, you can set user=root in the config file to avoid this message.
2024-04-19 08:42:37,421 INFO supervisord started with pid 1
2024-04-19 08:42:38,425 INFO spawned: 'exit_on_any_fatal' with pid 69
2024-04-19 08:42:38,429 INFO spawned: 'apache' with pid 70
2024-04-19 08:42:38,432 INFO spawned: 'run_schedule' with pid 71
supervisor: couldn't setuid to 1000: Could not set groups of effective user
supervisor: child process was not spawned
2024-04-19 08:42:38,436 INFO exited: run_schedule (exit status 127; not expected)
AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 10.129.3.235. Set the 'ServerName' directive globally to suppress this message
2024-04-19 08:42:39,602 INFO success: exit_on_any_fatal entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-04-19 08:42:39,602 INFO success: apache entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-04-19 08:42:39,606 INFO spawned: 'run_schedule' with pid 79
supervisor: couldn't setuid to 1000: Could not set groups of effective user
supervisor: child process was not spawned
2024-04-19 08:42:39,613 INFO exited: run_schedule (exit status 127; not expected)
2024-04-19 08:42:41,620 INFO spawned: 'run_schedule' with pid 80
supervisor: couldn't setuid to 1000: Could not set groups of effective user
supervisor: child process was not spawned
2024-04-19 08:42:41,629 INFO exited: run_schedule (exit status 127; not expected)
2024-04-19 08:42:44,635 INFO spawned: 'run_schedule' with pid 81
supervisor: couldn't setuid to 1000: Could not set groups of effective user
supervisor: child process was not spawned
2024-04-19 08:42:44,641 INFO exited: run_schedule (exit status 127; not expected)
2024-04-19 08:42:45,642 INFO gave up: run_schedule entered FATAL state, too many start retries too quickly
2024-04-19 08:42:46,649 WARN received SIGTERM indicating exit request
2024-04-19 08:42:46,649 INFO waiting for exit_on_any_fatal, apache to die
2024-04-19 08:42:47,653 INFO stopped: apache (terminated by SIGTERM)
2024-04-19 08:42:47,661 INFO stopped: exit_on_any_fatal (terminated by SIGTERM)

What uid and gid does the container require? Are changes to the uid and gid necessary, or should I provide a specific configuration that allows supervisor to run as root?

I am willing to help solve this problem, since I have the environment available and some knowledge of deploying on k8s-like systems.

Thanks in advance, Regards, Konrad

Reproduction steps

  1. Ensure that the DB is available (connectivity, authentication)
  2. Deploy on a k8s-like system using this YAML:
    ---
    # Snipe it Deployment
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: snipeit
      namespace: it-asset-management
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: snipeit
      template:
        metadata:
          labels:
            app: snipeit
        spec:
          serviceAccountName: itasset-admin
          securityContext:
            runAsUser: 0
            runAsGroup: 0
          containers:
            - name: snipeit
              image: snipe/snipe-it:v6-latest
              env:
                - name: MYSQL_PORT_3306_TCP_ADDR
                  value: <<your_db_ip_or_dns>>
              envFrom:
                - secretRef:
                    name: snipeit-secrets
                - configMapRef:
                    name: snipeit-config
              ports:
                - containerPort: 9000
              resources:
                limits:
                  cpu: 1
                  memory: 4Gi
                  ephemeral-storage: "10Gi"
                requests:
                  cpu: 500m
                  memory: 2Gi
          automountServiceAccountToken: false
    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: snipeit-secrets
      namespace: it-asset-management
    type: Opaque
    data:
      MYSQL_PORT_3306_TCP_PORT: MzMwNg==          # Base64 encoded value of 3306
      MYSQL_DATABASE: c25pcGVpdA==                # Base64 encoded value of snipeit
      MYSQL_USER: c25pcGVpdA==                    # Base64 encoded value of snipeit
      MYSQL_PASSWORD: c3VwZXItc2VjcmV0LXBhc3N3b3Jk  # Base64 encoded value of super-secret-password
      MAIL_ENV_PASSWORD: mV0LXBhc3N3b3Jk          # Base64 encoded randomized value
    ---
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: snipeit-config
      namespace: it-asset-management
    data:
      MAIL_PORT_587_TCP_ADDR: smtp.office365.com
      MAIL_PORT_587_TCP_PORT: "587"
      MAIL_ENV_FROM_ADDR: email@domain.com
      MAIL_ENV_FROM_NAME: Admins
      MAIL_ENV_ENCRYPTION: tls
      MAIL_ENV_USERNAME: email@domain.com
      APP_ENV: production
      APP_DEBUG: "false"
      APP_KEY: "<<CONCEALED>>"
      APP_URL: "it-assets.domain.com"
      APP_TIMEZONE: Europe/Warsaw
      APP_LOCALE: en
      PHP_UPLOAD_LIMIT: "100"
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: snipeit
      namespace: it-asset-management
    spec:
      selector:
        app: snipeit
      ports:
        - protocol: TCP
          port: 9000
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: snipeit-ingress
      namespace: it-asset-management
    spec:
      rules:
        - host: it-assets.domain.com
          http:
            paths:
              - pathType: Prefix
                path: "/"
                backend:
                  service:
                    name: snipeit
                    port:
                      number: 9000

Expected behavior

Start and stabilize snipe-it on OKD.

Screenshots

No response

Snipe-IT Version

latest

Operating System

Openshift

Web Server

n/a

PHP Version

n/a

Operating System

No response

Browser

No response

Version

No response

Device

No response

Operating System

No response

Browser

No response

Version

No response

Error messages

No response

Additional context

This is a fresh deployment of the application. Populating the database (a MySQL database in the same namespace) succeeded, but we could not get the application stable because of the startup problem.

welcome[bot] commented 2 months ago

👋 Thanks for opening your first issue here! If you're reporting a 🐞 bug, please make sure you include steps to reproduce it. We get a lot of issues on this repo, so please be patient and we will get back to you as soon as we can.

snipe commented 2 months ago

We don't use OKD here, so I honestly don't know. This looks like more of a server config issue than a Snipe-IT issue, though, and Google isn't offering up much.

Veinar commented 2 months ago

Hi @snipe thanks for your quick reply, I will try to solve this problem on my own. And yes, I agree that this is a problem related to the specific OKD/Openshift environment.

My focus is on this particular log output:

supervisor: couldn't setuid to 1000: Could not set groups of effective user
supervisor: child process was not spawned

OKD does not let you easily change the uid or gid when the container is spawned. Looking through the Dockerfile and the startup.sh script in the latest image, I found this directive at the very end of startup.sh:

exec supervisord -c /supervisord.conf

When executed on an "idle" container (with CMD overridden to sleep), it generates the same error as on OKD/Openshift.

So basically I confirm that this is not a problem with the Snipe-IT software, but with the supervisord dependency, which does not work properly on systems similar to OKD/Openshift.
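
For anyone debugging the same message: as far as I can tell, supervisord prints "Could not set groups of effective user" when its setgroups call fails, and on Linux setgroups needs the CAP_SETGID capability even for uid 0. The sketch below checks whether the current process holds CAP_SETGID and CAP_SETUID. That the platform dropped these capabilities here is an assumption, not something confirmed in this thread, and has_cap is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: report whether this process holds CAP_SETGID / CAP_SETUID.
# Bit numbers are from linux/capability.h: CAP_SETGID = 6, CAP_SETUID = 7.

# has_cap HEX_MASK BIT: succeeds if bit BIT is set in the hex capability mask.
has_cap() {
    local mask=$1 bit=$2
    (( (0x$mask >> bit) & 1 ))
}

# Read the effective capability mask of the current process (Linux only).
if [ -r /proc/self/status ]; then
    cap_eff=$(awk '/^CapEff:/ { print $2 }' /proc/self/status)
    if [ -n "$cap_eff" ]; then
        for pair in SETGID:6 SETUID:7; do
            name=${pair%%:*}
            bit=${pair##*:}
            if has_cap "$cap_eff" "$bit"; then
                echo "CAP_$name present"
            else
                echo "CAP_$name missing"
            fi
        done
    fi
fi
```

If either capability is reported missing inside the pod, supervisord cannot switch a child process to another user, no matter which uid the container itself runs as.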

I would like to keep this thread for future discoveries by other people, but I don't see an option to convert it to a Discussion. So yes, you can close it, as this is not a problem with this particular software. I will keep working on getting this up and running and will contact the supervisord team. I would like to come back later and comment on how my research on this topic went.

Regards, Konrad 😃

Veinar commented 2 months ago

Okay, one more step forward: I was able to get a response from a Snipe-IT server hosted on OKD/Openshift. With the APP_DEBUG option enabled I got some errors, and maybe someone can guide me on where to look for potential problems next. So far I have had to change the user declarations in supervisord.conf in two places:

  1. In the [supervisord] section, I added an additional line, user=root.
  2. In the [program:run_schedule] section, I changed user=docker to user=root.
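
The two edits above could be scripted roughly as follows. This is only a sketch: patch_conf is a hypothetical helper, and it assumes the section headers and the user=docker line appear exactly as written in the list, since the real file's layout was not shown in this thread.

```shell
#!/usr/bin/env bash
# patch_conf FILE: print FILE with the two user changes applied.
#   1. add "user=root" right after the [supervisord] header
#   2. turn "user=docker" into "user=root" inside [program:run_schedule] only
patch_conf() {
    awk '
        /^\[/ { section = $0 }    # remember the current section header
        section == "[program:run_schedule]" && /^user=docker$/ {
            print "user=root"
            next
        }
        { print }
        $0 == "[supervisord]" { print "user=root" }
    ' "$1"
}

# Possible usage inside the container (paths are assumptions):
#   patch_conf /supervisord.conf > /tmp/supervisord.conf
#   exec supervisord -c /tmp/supervisord.conf
```

Writing the result to a separate file leaves the original config untouched, which makes it easy to compare the two.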

All this just to get an encryption error 🥲

RuntimeException: Unsupported cipher or incorrect key length. Supported ciphers are: aes-128-cbc, aes-256-cbc, aes-128-gcm, aes-256-gcm. in /var/www/html/vendor/laravel/framework/src/Illuminate/Encryption/Encrypter.php:55

The whole log with the stack trace was uploaded to pastebin.
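
For readers hitting the same exception: Laravel's Encrypter raises it when the length of the decoded APP_KEY does not match the configured cipher (16 bytes for the aes-128 ciphers, 32 bytes for the aes-256 ciphers). The sketch below prints the decoded length of a key. app_key_bytes is a hypothetical helper, not part of Snipe-IT, and it assumes a base64 utility that accepts -d.

```shell
#!/usr/bin/env bash
# app_key_bytes KEY: print the byte length Laravel would see for KEY.
# Keys prefixed with "base64:" are decoded first; other keys are used as-is.
app_key_bytes() {
    local key=$1
    if [[ $key == base64:* ]]; then
        printf '%s' "${key#base64:}" | base64 -d | wc -c | tr -d ' '
    else
        printf '%s' "$key" | wc -c | tr -d ' '
    fi
}

# Example: a freshly generated 32-byte key, the size the aes-256 ciphers expect.
sample_key="base64:$(head -c 32 /dev/urandom | base64)"
echo "decoded length: $(app_key_bytes "$sample_key") bytes"
```

Running `app_key_bytes "$APP_KEY"` inside the pod would show whether the value coming from the ConfigMap has a supported length. `php artisan key:generate --show` prints a new key without writing it anywhere.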

If anyone is willing to help, I would be glad for any hints. And yes, I know that Openshift is not a supported platform.

Regards, Konrad 👋

snipe commented 2 months ago

What happens when you try to run php artisan key:generate?

Veinar commented 2 months ago

Hi @snipe,

I ran echo yes | php artisan key:generate at the very beginning, but got the same result as before. Log output (truncated):

**************************************
* Application In Production! *
**************************************
Do you really wish to run this command? (yes/no) [no]:
>
Application key set successfully.
Module ssl disabled.
To activate the new configuration, you need to run:
service apache2 restart
...

Could setting APP_ENV to production cause this issue?

Regards, Konrad 👋