osixia / docker-openldap

OpenLDAP container image 🐳🌴

cannot deploy on OpenShift 4.x - no logs to tell why #346

Open jmazzitelli opened 5 years ago

jmazzitelli commented 5 years ago

I have a CRC VM running OpenShift 4.1.6. I create a namespace "ldap" and then use a modified example Kubernetes YAML to create an LDAP deployment. I say modified because I can't use hostPath (not allowed when deploying in this CRC OpenShift cluster), so I just changed the volumes to mount an emptyDir:

       volumes:
        - name: ldap-data
          emptyDir: {}
        - name: ldap-config
          emptyDir: {}
        - name: ldap-certs
          emptyDir: {}

I assume empty directories are ok, and that this should just start with all defaults and no initial data in the LDAP directory.
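
For reference, a minimal sketch of how these volumes are mounted in the container spec; the mount paths are the ones documented for the osixia/openldap image (assumed here, not copied from my file):

       containers:
        - name: ldap
          image: osixia/openldap:1.2.4
          volumeMounts:
          - name: ldap-data
            mountPath: /var/lib/ldap                          # LDAP database
          - name: ldap-config
            mountPath: /etc/ldap/slapd.d                      # slapd configuration
          - name: ldap-certs
            mountPath: /container/service/slapd/assets/certs  # TLS certificates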

Everything else is the same as the current example yaml.

When I run oc create -f ldap-deployment.yaml -n ldap, the pod tries to start but fails. But the problem is I have no way of knowing why. I see the pod status of "CrashLoopBackOff". When I look at the logs, all I see are two lines:

*** CONTAINER_LOG_LEVEL = 3 (info)
*** Killing all processes...

If I edit the Deployment so that the env var LDAP_LOG_LEVEL has a value of "-1" (which should enable all debugging, according to Section 5.2.1.2 here: https://www.openldap.org/doc/admin24/slapdconf2.html), I still only see those two lines.
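
For reference, a minimal sketch of that edit (standard Deployment fields; note the value must be a quoted string in YAML):

       env:
        - name: LDAP_LOG_LEVEL
          value: "-1"   # slapd debug level; -1 should enable all debugging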

So, in short, trying to install on OpenShift is failing and I've no idea why. 2 questions:

  1. How can I enable debug messages?
  2. How can I deploy on OpenShift 4?
obourdon commented 5 years ago

@jmazzitelli my take is that you need to add a line like args: ["--loglevel", "debug"] to your YAML deployment file, as explained here or here
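
Something like this in the Deployment's container spec (a sketch; the container name and image tag are assumed from this thread):

       containers:
        - name: ldap
          image: osixia/openldap:1.2.4
          # args are appended to the image's entrypoint
          args: ["--loglevel", "debug"]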

jmazzitelli commented 5 years ago

you need to add a line like args: ["--loglevel", "debug"]

Well, that added one more line in the output :)

*** CONTAINER_LOG_LEVEL = 4 (debug)
*** Run commands before finish...
*** Killing all processes...

UPDATE 1: Ahh... if I give a bad command line argument, I am given the usage syntax - I see "trace" is another level. I will try that:

usage: run [-h] [-e] [-s] [-p] [-f] [-o {startup,process,finish}]
           [-c COMMAND [WHEN={startup,process,finish} ...]] [-k]
           [--wait-state FILENAME] [--wait-first-startup] [--keep-startup-env]
           [--copy-service] [--dont-touch-etc-hosts] [--keepalive]
           [--keepalive-force] [-l {none,error,warning,info,debug,trace}]
           [MAIN_COMMAND [MAIN_COMMAND ...]]
run: error: argument -l/--loglevel: invalid choice: '-1' (choose from 'none', 'error', 'warning', 'info', 'debug', 'trace')

UPDATE 2: That doesn't help track down the problem either:

*** CONTAINER_LOG_LEVEL = 5 (trace)
*** Run commands before finish...
*** Killing all processes...

I'm beginning to wonder if this example deployment.yaml is even correct. It doesn't have any arguments - so what is this supposed to run?

obourdon commented 5 years ago

@jmazzitelli would you mind sharing your YAML file here, please? What is strange is that, given the lines *** CONTAINER_LOG_LEVEL = 5 (trace) and *** CONTAINER_LOG_LEVEL = 4 (debug), we can see the arguments are being taken into account; however, judging by the error, the YAML might be garbled (where does this -1 come from?). By default the entrypoint of the container is run:

$ docker inspect osixia/openldap | jq -r '.[0].Config.Entrypoint[0]'
/container/tool/run
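
So with args: ["--loglevel", "trace"] in the pod spec, the container should effectively execute the following (a sketch of the mechanics: Kubernetes appends args to the image entrypoint):

    /container/tool/run --loglevel trace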
obourdon commented 5 years ago

@jmazzitelli did you also look at the pod logs? You are only referring to the container output here. I guess this should be something like:

oc logs <pod-name/ID>
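
For a crash-looping container, the previous instance's logs and the pod events can also help (a suggestion using standard oc commands):

    # logs from the previous (crashed) container instance
    oc logs <pod-name> -n ldap --previous
    # pod events often show why a container failed to start
    oc describe pod <pod-name> -n ldap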
jmazzitelli commented 5 years ago

@obourdon

(where does this -1 come from?)

I just set my arg to args: ["--loglevel", "-1"] to see what would happen. That's when the container failed with the usage output. To be expected, but that usage message told me I could do this: args: ["--loglevel", "trace"]

Anyway, as the original issue description explains, the YAML is basically the same as the current example YAML in this repo, except that I mount all the volumes to emptyDir: {} rather than a host path. But I will attach the full YAML here for completeness (ignore the .log extension - GitHub won't let me attach a .yaml file, so I just appended ".log" to the end of it).

ldap-deployment.yaml.log

I install this very simply:

oc create ns ldap; oc create -f ldap-deployment.yaml -n ldap

Here are some details:

$ oc get pods -n ldap
NAME                    READY   STATUS             RESTARTS   AGE
ldap-794b957cb7-xwknn   0/1     CrashLoopBackOff   5          3m44s
$ oc logs ldap-794b957cb7-xwknn -n ldap
*** CONTAINER_LOG_LEVEL = 3 (info)
*** Killing all processes...

You can see the logs are very small - just two lines. If I add that arg of "--loglevel=debug" I just get that third line I showed earlier. It just seems like nothing is running.

jmazzitelli commented 5 years ago

@obourdon here's my pod yaml after being deployed:

oc get pod ldap-794b957cb7-xwknn -n ldap -o yaml > /tmp/pod.yaml

results in this: pod.yaml.log

Notice the container status - docker.io/osixia/openldap:1.2.4 finished with an exit code of 0:

status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2019-08-06T14:03:18Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2019-08-06T14:09:13Z"
    message: 'containers with unready status: [ldap]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2019-08-06T14:09:13Z"
    message: 'containers with unready status: [ldap]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2019-08-06T14:03:18Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: cri-o://7369868d17db7b058ac34c1ade2898000364515a9f4eb13f12d9d023a432b9a5
    image: docker.io/osixia/openldap:1.2.4
    imageID: docker.io/osixia/openldap@sha256:d212a12aa728ccb4baf06fcc83dc77392d90018d13c9b40717cf455e09aeeef3
    lastState:
      terminated:
        containerID: cri-o://7369868d17db7b058ac34c1ade2898000364515a9f4eb13f12d9d023a432b9a5
        exitCode: 0
        finishedAt: "2019-08-06T14:14:19Z"
        reason: Completed
        startedAt: "2019-08-06T14:14:19Z"
    name: ldap
    ready: false
    restartCount: 7
    state:
      waiting:
        message: Back-off 5m0s restarting failed container=ldap pod=ldap-794b957cb7-xwknn_ldap(f124f4ff-b852-11e9-835c-52fdfc072182)
        reason: CrashLoopBackOff
  hostIP: 192.168.130.11
  phase: Running
  podIP: 10.128.0.200
  qosClass: BestEffort
  startTime: "2019-08-06T14:03:18Z"
obourdon commented 5 years ago

@jmazzitelli as you can see in your previous post, *** CONTAINER_LOG_LEVEL = 3 (info) is the default behaviour, without much tracing.

There is no = between --loglevel and debug; these are 2 separate arguments, hence the list args: ["--loglevel", "trace"] that should be added to your deployment YAML (or replace trace with debug for a first, less verbose trial).

Sorry, I do not have access to an OpenShift platform, but this works perfectly well on my local K8s.

jmazzitelli commented 5 years ago

Yes, I did turn on trace before. That was my "UPDATE 2" of comment https://github.com/osixia/docker-openldap/issues/346#issuecomment-518676500.

Here it is again when I enable trace via: args: ["--loglevel", "trace"]

$ oc logs ldap-d8b746657-jtks8 
*** CONTAINER_LOG_LEVEL = 5 (trace)
*** Run commands before finish...
*** Killing all processes...
$ oc get pods
NAME                   READY   STATUS             RESTARTS   AGE
ldap-d8b746657-jtks8   0/1     CrashLoopBackOff   4          2m39s
jmazzitelli commented 5 years ago

@BertrandGouny any ideas what could be the problem here?

mq2195 commented 5 years ago

I am having exactly the same error with podman. The container gets created with status Exited.

sudo podman run -d -u 1000 \
  -p 192.168.122.1:389:389 -p 192.168.122.1:636:636 \
  --name openldap --restart=always -h example.net \
  -e LDAP_ORGANISATION="EXAMPLE" -e LDAP_DOMAIN="example.net" \
  -e LDAP_ADMIN_PASSWORD="MZGYJgLOBYyVl3bi6CML8wUJtXicxUuQ" \
  -e LDAP_CONFIG_PASSWORD="Rz1crAC3c63qomK8XJUCoW1zSlYVpmIq" \
  -e LDAP_TLS_CRT_FILENAME=ldap.example.net.crt \
  -e LDAP_TLS_KEY_FILENAME=ldap.example.net.key \
  -e LDAP_TLS_CA_CRT_FILENAME=ca.crt \
  -v /home/zzz/openldap/db:/var/lib/ldap \
  -v /home/zzz/openldap/config:/etc/ldap/slapd.d \
  -v /home/zzz/openldap:/container/service/slapd/assets/certs \
  osixia/openldap:1.2.4 --loglevel info

This is the error even when uid 1000 does have access to the volumes:

sudo podman logs openldap
*** CONTAINER_LOG_LEVEL = 3 (info)
*** Killing all processes...

when permissions were not set correctly (container did not have access to volumes):

sudo podman logs openldap
*** CONTAINER_LOG_LEVEL = 3 (info)
*** Search service in CONTAINER_SERVICE_DIR = /container/service :
*** link /container/service/:ssl-tools/startup.sh to /container/run/startup/:ssl-tools
*** link /container/service/slapd/startup.sh to /container/run/startup/slapd
*** link /container/service/slapd/process.sh to /container/run/process/slapd/run
*** Set environment for startup files
*** Environment files will be proccessed in this order :
Caution: previously defined variables will not be overriden.
/container/environment/99-default/default.startup.yaml
/container/environment/99-default/default.yaml

To see how this files are processed and environment variables values,
run this container with '--loglevel debug'
*** Running /container/run/startup/:ssl-tools...
*** Running /container/run/startup/slapd...
chown: cannot read directory '/var/lib/ldap': Permission denied
*** /container/run/startup/slapd failed with status 1

*** Killing all processes...

I just installed podman to evaluate how easy it is going to be to migrate out of docker... it looks like I will have to read some friendly podman manuals...

jmazzitelli commented 5 years ago

I gave up - I ended up using https://github.com/openshift/openldap instead.

mq2195 commented 5 years ago

So, in my case it was SELinux after all... so, permissions... and running it as non-root (a big selling point of podman).

Volumes need to end with :Z, like -v /home/zzz/openldap/db:/var/lib/ldap:Z - this is for SELinux.

working command line with podman that I used:

sudo podman run -d \
  -p 192.168.122.1:389:389 -p 192.168.122.1:636:636 \
  --name openldap -h ldap.example.net \
  -e LDAP_ORGANISATION="EXAMPLE" -e LDAP_DOMAIN="example.net" \
  -e LDAP_ADMIN_PASSWORD="MZGYJgLOBYyVl3bi6CML8wUJtXicxUuQ" \
  -e LDAP_CONFIG_PASSWORD="Rz1crAC3c63qomK8XJUCoW1zSlYVpmIq" \
  -e LDAP_TLS_CRT_FILENAME=ldap.example.net.crt \
  -e LDAP_TLS_KEY_FILENAME=ldap.example.net.key \
  -e LDAP_TLS_CA_CRT_FILENAME=ca.crt \
  -v /home/zzz/openldap/db:/var/lib/ldap:Z \
  -v /home/zzz/openldap/config:/etc/ldap/slapd.d:Z \
  -v /home/zzz/openldap:/container/service/slapd/assets/certs:Z \
  osixia/openldap:1.2.4 --loglevel debug

Directory permissions I used for volumes:

sudo ls -altr /home/zzz/openldap/
total 32
drwxr-xr-x. 3 root root 4096 Aug 14 18:58 ..
drwx------. 2 root root 4096 Aug 14 19:00 db
drwx------. 2 root root 4096 Aug 14 19:00 config
-rw-------. 1 root root 3272 Aug 14 19:01 ldap.example.net.key
-rw-r--r--. 1 root root 7675 Aug 14 19:01 ldap.example.net.crt
-rw-r-----. 1 root root 4059 Aug 14 19:01 ca.crt
drwx------. 4 root root 4096 Aug 14 21:45 .
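
To verify the relabeling (a sketch; with :Z podman applies a private container_file_t SELinux label to the host directories, and the category pair below is illustrative):

    ls -lZ /home/zzz/openldap/
    # after a run with :Z, the volumes should carry a label like
    # system_u:object_r:container_file_t:s0:c123,c456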

However, I was unable to make it work as non-root (-u 1000); maybe it has something to do with User not being defined in the container...

sudo podman inspect docker.io/osixia/openldap:1.2.4|grep User
        "User": "",

or SELinux policies:

podman run -d -u 1000 --name openldap2 osixia/openldap:1.2.4
podman logs openldap2
*** CONTAINER_LOG_LEVEL = 3 (info)
*** Killing all processes...
id
uid=1000(zzz) gid=1000(zzz) groups=1000(zzz),10(wheel),974(libvirt) context=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023

Or maybe I just need to read the friendly manual... Bottom line for me: it is working with podman.

jmazzitelli commented 5 years ago

My use case is different - this issue is about running LDAP inside an OpenShift cluster, not just within docker or podman. In that case, permissions on the volumes should not be an issue. There is something else going on that I could not figure out while running in OpenShift.

brightzheng100 commented 4 years ago

Please refer to https://github.com/helm/charts/issues/16098

rcfja commented 2 years ago

I've just hit this issue also. Guess I'll use the openshift ldap since this is not solved.

joshua-bj commented 1 year ago

I hit exactly this issue on OCP; it can be solved by granting the openldap pod's service account the anyuid SCC. I guess the arbitrary UID that OCP assigns breaks this container.

The command is:

oc adm policy add-scc-to-user anyuid -z <service-account-name-for-openldap>
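
A sketch of the full sequence, assuming a dedicated service account named openldap-sa and the ldap namespace from this thread (both names are illustrative):

    # create a dedicated service account for the deployment
    oc create sa openldap-sa -n ldap
    # allow it to run with any UID (requires cluster-admin)
    oc adm policy add-scc-to-user anyuid -z openldap-sa -n ldap
    # point the Deployment at that service account
    oc set serviceaccount deployment/ldap openldap-sa -n ldap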
Eliav2 commented 1 year ago

But we don't have admin permissions on our cluster (and it is not at all obvious to ask for them). Has anyone succeeded in running this as a non-root user?