projectodd / openwhisk-openshift

Resources necessary for running OpenWhisk on OpenShift
Apache License 2.0

Installation description for OpenShift origin & Install failure due to nginx permissions #22

Closed marziman closed 6 years ago

marziman commented 6 years ago

Hi all,

I failed to install OpenWhisk on OpenShift 3.7.1 and immediately got this error from the nginx pod:

nginx: [alert] could not open error log file: open() "/var/opt/rh/rh-nginx112/log/nginx/error.log" failed (13: Permission denied)
2018/03/27 16:56:28 [emerg] 1#0: mkdir() "/var/opt/rh/rh-nginx112/lib/nginx/tmp/client_body" failed (13: Permission denied)

I think this is related to OpenShift not allowing Docker images that run as the root user, while OpenWhisk is based on the nginx image, which needs root. So I ran the same command that the minishift addon runs:

oc adm policy add-scc-to-group anyuid system:authenticated

But without success. It would be good to have an OpenShift installation README that does not focus only on local development with minishift, since bigger deployments won't use minishift.

Thx & BR Mehmet

jcrossley3 commented 6 years ago

Thanks for the report! I'm hoping @goldmann can explain this.

goldmann commented 6 years ago

I'll take a look at this.

goldmann commented 6 years ago

I cannot really reproduce it. The image we use is prepared to run in an OpenShift environment. You can see the source here: https://github.com/sclorg/nginx-container/blob/13271ab171745e3960ed64745a5aaa0a31d96048/1.12/Dockerfile.

The specific options that talk about the user configuration can be found here: https://github.com/sclorg/nginx-container/blob/13271ab171745e3960ed64745a5aaa0a31d96048/1.12/Dockerfile#L57-L74

The nginx container correctly uses a random user (id 1000080000):

$ oc exec nginx-1053727315-xhh29 ps aux
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
1000080+     1  0.0  0.1  44104  6188 ?        Ss   09:57   0:00 nginx: master process nginx -c /tmp/nginx.conf -g daemon off;
1000080+    14  0.0  0.0  45908  3540 ?        S    09:57   0:00 nginx: worker process
1000080+    32  0.0  0.0  47460  3084 ?        Rs   10:03   0:00 ps aux

The access log file is stored in /logs directory as configured in the template:

$ oc rsh nginx-1053727315-xhh29 ls -hall /logs                             
total 84K
drwxrwsrwx  2 root       1000080000 4.0K Mar 23 15:40 .
drwxr-xr-x 49 root       root       4.0K Mar 28 09:57 ..
-rw-r--r--  1 1000080000 1000080000  70K Mar 28 09:59 nginx_access.log

You mentioned some permission issues in the error log file, but in my case I can see it created and being owned by the correct user in the path you mention:

$ oc rsh nginx-1053727315-xhh29 ls -hall /var/opt/rh/rh-nginx112/log/nginx/
total 8.0K
drwxrwxrwx 2 default    root 4.0K Mar 28 09:57 .
drwxrwxrwx 5 default    root 4.0K Mar 28 09:57 ..
-rw-r--r-- 1 1000080000 root    0 Mar 28 09:57 error.log
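To make the reasoning behind that listing explicit, here is a minimal sketch of the classic Unix write-permission check applied to the directory mode shown above. It is an illustration only: it ignores root, ACLs and SELinux, and it assumes that the "default" user maps to UID 1001 and that the random OpenShift user belongs to group 0. Neither assumption is confirmed in this thread, and `can_write` is a hypothetical helper.

```python
import stat


def can_write(mode: int, owner_uid: int, owner_gid: int, uid: int, gids: set[int]) -> bool:
    """Classic Unix write check: owner bits, then group bits, then other bits.

    Ignores root privileges, ACLs and SELinux labels.
    """
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Directory mode from the listing above: drwxrwxrwx, owned by default:root.
# Assumption: "default" is UID 1001 and the random user is a member of group 0.
print(can_write(0o777, owner_uid=1001, owner_gid=0, uid=1000080000, gids={0}))

# For comparison, a hypothetical directory that is only writable by its owner.
print(can_write(0o755, owner_uid=1001, owner_gid=0, uid=1000080000, gids={0}))
```

With the mode from the listing, the random user gets write access through the group bits, which matches the error.log file being created there. A more restrictive mode such as the hypothetical 0o755 would deny it.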

You say that you cannot install OpenWhisk. Can you let us know exactly what steps you performed to install it? I guess we need more information to be able to help.

I don't think we have any documentation on how to run it in a clustered OpenShift environment, but maybe @jcrossley3 or @bbrowning know more about it. This request should be opened separately as an RFE.

jcrossley3 commented 6 years ago

If you can't give us steps to reproduce this, can you make it happen and then provide us the output of oc describe pod nginx-xxx and any other logs you feel are relevant?

marziman commented 6 years ago

Hi,

here is the output of oc describe pod nginx-xxx:

oc describe pod nginx-1053727315-7rx2m
Name:       nginx-1053727315-7rx2m
Namespace:  faas
Node:       master/192.168.164.3
Start Time: Wed, 28 Mar 2018 12:30:48 +0000
Labels:     name=nginx
        pod-template-hash=1053727315
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"faas","name":"nginx-1053727315","uid":"d835090a-3283-11e8-a3a0-0007cb0b3029","api...
        openshift.io/scc=anyuid
Status:     Running
IP:     10.1xx.x.xx
Created By: ReplicaSet/nginx-1053727315
Controlled By:  ReplicaSet/nginx-1053727315
Init Containers:
  wait-for-controller:
    Container ID:   docker://901c291ca4f6f661d3f63394720abd68e5a59dec0fd94414e4fd89847649aa37
    Image:      busybox
    Image ID:       docker-pullable://docker.io/busybox@sha256:2107a35b58593c58ec5f4e8f2c4a70d195321078aebfadfbfb223a2ff4a4ed21
    Port:       <none>
    Command:
      sh
      -c
      until wget -T 5 --spider http://controller:8080/ping; do echo waiting for controller; sleep 2; done;
    State:      Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Wed, 28 Mar 2018 12:31:00 +0000
      Finished:     Wed, 28 Mar 2018 12:32:43 +0000
    Ready:      True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xxx (ro)
Containers:
  nginx:
    Container ID:   docker://2c1805d398bf146f8293442ff640282f0bb022e947b488446c25984380025971
    Image:      centos/nginx-112-centos7@sha256:42330f7f29ba1ad67819f4ff3ae2472f62de13a827a74736a5098728462212e7
    Image ID:       docker-pullable://docker.io/centos/nginx-112-centos7@sha256:42330f7f29ba1ad67819f4ff3ae2472f62de13a827a74736a5098728462212e7
    Ports:      8080/TCP, 8443/TCP
    Command:
      /bin/bash
      -o
      allexport
      /nginx_config/init
    State:      Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    1
      Started:      Wed, 28 Mar 2018 12:35:46 +0000
      Finished:     Wed, 28 Mar 2018 12:35:46 +0000
    Ready:      False
    Restart Count:  5
    Environment:
      KUBERNETES_NAMESPACE: faas (v1:metadata.namespace)
    Mounts:
      /logs from logs (rw)
      /nginx_config from nginx-conf (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-xxx (ro)
Conditions:
  Type      Status
  Initialized   True 
  Ready     False 
  PodScheduled  True 
Volumes:
  nginx-conf:
    Type:   ConfigMap (a volume populated by a ConfigMap)
    Name:   nginx
    Optional:   false
  logs:
    Type:   EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium: 
  default-token-xxx:
    Type:   Secret (a volume populated by a Secret)
    SecretName: default-token-xxx
    Optional:   false
QoS Class:  BestEffort
Node-Selectors: <none>
Tolerations:    <none>
Events:
  FirstSeen LastSeen    Count   From            SubObjectPath                   Type        Reason          Message
  --------- --------    -----   ----            -------------                   --------    ------          -------
  5m        5m      1   default-scheduler                           Normal      Scheduled       Successfully assigned nginx-1053727315-7rx2m to master
  5m        5m      1   kubelet, master                         Normal      SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "logs" 
  5m        5m      1   kubelet, master                         Normal      SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "default-token-xxx" 
  5m        5m      2   kubelet, master                         Normal      SuccessfulMountVolume   MountVolume.SetUp succeeded for volume "nginx-conf" 
  5m        5m      1   kubelet, master spec.initContainers{wait-for-controller}    Normal      Pulling         pulling image "busybox"
  5m        5m      1   kubelet, master spec.initContainers{wait-for-controller}    Normal      Pulled          Successfully pulled image "busybox"
  5m        5m      1   kubelet, master spec.initContainers{wait-for-controller}    Normal      Created         Created container
  5m        5m      1   kubelet, master spec.initContainers{wait-for-controller}    Normal      Started         Started container
  3m        3m      4   kubelet, master spec.containers{nginx}              Normal      Pulled          Container image "centos/nginx-112-centos7@sha256:42330f7f29ba1ad67819f4ff3ae2472f62de13a827a74736a5098728462212e7" already present on machine
  3m        3m      4   kubelet, master spec.containers{nginx}              Normal      Created         Created container
  3m        3m      4   kubelet, master    spec.containers{nginx}               Normal      Started         Started container
  3m        41s     17  kubelet, master spec.containers{nginx}              Warning     BackOff         Back-off restarting failed container
[root@master ~]#

Is it possible that some image caching is tricking me? Since I gave the anyuid permission already.
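For anyone triaging similar output, the two most telling fields in the describe output above are the openshift.io/scc annotation and the container's waiting reason. The following is a rough sketch that pulls both out of pasted text; `summarize_pod` is a hypothetical helper, and the sample string is a shortened copy of the output above, not a complete pod description.

```python
import re


def summarize_pod(describe_text: str) -> dict:
    """Extract the SCC annotation and any container waiting reasons
    from the text printed by `oc describe pod`."""
    scc = re.search(r"openshift\.io/scc=(\S+)", describe_text)
    waiting = re.findall(
        r"^\s*State:\s+Waiting\s*\n\s*Reason:\s+(\S+)",
        describe_text,
        re.MULTILINE,
    )
    return {"scc": scc.group(1) if scc else None, "waiting": waiting}


# Shortened excerpt of the output pasted above.
sample = """\
Annotations:    openshift.io/scc=anyuid
Containers:
  nginx:
    State:      Waiting
      Reason:       CrashLoopBackOff
"""
print(summarize_pod(sample))
```

The annotation shows that the pod was admitted under the anyuid SCC, so the policy change did take effect. Under anyuid the container runs with the user declared by the image rather than a random project UID, which may differ from the environment where this could not be reproduced.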

These are the steps I ran as cluster-admin:

Thanks & BR

goldmann commented 6 years ago

What version of OpenShift do you use? Is it OCP?

marziman commented 6 years ago

OpenShift Origin 3.7:

oc version
oc v3.7.1+a8deba5-34
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO
Server https://192.168.xx.x:8443
openshift v3.7.1+a8deba5-34
kubernetes v1.7.6+a08f5eeb62

bbrowning commented 6 years ago

Looking at our nginx.conf, it looks like we only redirect access logs to the /logs storage but error logs go to the default location. Try editing the nginx.conf, adding the line error_log /logs/nginx_error.log; directly under the access_log line.

oc edit configmap nginx - add the line and save that file

Afterwards, the nginx pod should redeploy and error logs should get written to a writable place. Then, the underlying error being logged in the first place will hopefully show up.
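As a sketch of the edit described above, the snippet below inserts an error_log directive directly under the first access_log line of a config string. The sample fragment is hypothetical, since the real nginx.conf from the ConfigMap is not shown in this thread, and `add_error_log` is an illustrative helper, not part of the project.

```python
def add_error_log(conf: str, path: str = "/logs/nginx_error.log") -> str:
    """Insert an error_log directive directly under the first access_log line,
    reusing that line's indentation. Returns the config unchanged if no
    access_log line is found."""
    out = []
    done = False
    for line in conf.splitlines():
        out.append(line)
        if not done and line.strip().startswith("access_log"):
            indent = line[: len(line) - len(line.lstrip())]
            out.append(f"{indent}error_log {path};")
            done = True
    return "\n".join(out)


# Hypothetical fragment; only the /logs/nginx_access.log path appears in this thread.
sample = "http {\n    access_log /logs/nginx_access.log;\n}"
print(add_error_log(sample))
```

The same change can be made by hand with oc edit configmap nginx; the function only illustrates where the new line goes.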

marziman commented 6 years ago

Hi @bbrowning ,

I've added the config change and still get this error:

nginx: [alert] could not open error log file: open() "/var/opt/rh/rh-nginx112/log/nginx/error.log" failed (13: Permission denied)
2018/03/28 14:22:50 [emerg] 1#0: mkdir() "/var/opt/rh/rh-nginx112/lib/nginx/tmp/client_body" failed (13: Permission denied)
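
To see at a glance which paths are affected, here is a minimal sketch that extracts the failing call and path from each "Permission denied" line. The regular expression is written against the two lines quoted above and is an assumption about their format, not a general nginx log parser.

```python
import re

# Matches e.g.: mkdir() "/some/path" failed (13: Permission denied)
DENIED = re.compile(r'(\w+)\(\) "([^"]+)" failed \(13: Permission denied\)')


def denied_paths(log_text: str) -> list[tuple[str, str]]:
    """Return (call, path) pairs for every permission-denied failure in the text."""
    return DENIED.findall(log_text)


log = '''nginx: [alert] could not open error log file: open() "/var/opt/rh/rh-nginx112/log/nginx/error.log" failed (13: Permission denied)
2018/03/28 14:22:50 [emerg] 1#0: mkdir() "/var/opt/rh/rh-nginx112/lib/nginx/tmp/client_body" failed (13: Permission denied)'''

for call, path in denied_paths(log):
    print(call, path)
```

Both paths sit under /var/opt/rh/rh-nginx112 rather than /logs, and the second failure is a mkdir for a temporary directory, so redirecting the error log alone would not be expected to clear it.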

marziman commented 6 years ago

Could it be a missing permissions setting in the Docker image you are using? It seems the image can't write to /nginx/tmp/... ?

marziman commented 6 years ago

Hi,

any news or ideas about this? The same setup steps run well on minishift. Could it be related to anyuid?

Thanks & BR Mehmet

jcrossley3 commented 6 years ago

Can you try our previous nginx image on your platform and see if you get the same result? projectodd/whisk_nginx:cc60dfe. Also, can you share the action you're invoking that demonstrates the problem?

marziman commented 6 years ago

@jcrossley3 ,

thanks a lot. Switching to your image worked and all pods are up. I am trying to get the actions into play now. I think the other nginx image needs some modifications.

Thanks for your help and support. BR Mehmet

jcrossley3 commented 6 years ago

I'm ambivalent about that working. :)

jcrossley3 commented 6 years ago

We need to figure this out @goldmann

markito commented 6 years ago

Anything pending on this issue?

jcrossley3 commented 6 years ago

Now that we've reverted ImageStream use, I'm gonna close this, even though I have no idea whether that change is relevant to this. :)

Please re-open or create a new issue with steps to reproduce if the latest templates have the same issue.