abcdesktopio / conf

Sample configuration files, reference files, and install scripts for Docker and Kubernetes
GNU General Public License v2.0

Release 3.0 and Kubernetes v1.23 #4

Open drvgautam opened 11 months ago

drvgautam commented 11 months ago

Dear Alexandre, I needed to migrate my abcdesktop deployment from a DigitalOcean cluster to an OpenStack-Magnum-based cluster with the Kubernetes COE. Unfortunately, my infrastructure does not support cluster templates with Kubernetes version >= 1.24, so I only have Kubernetes v1.23. Sadly, OpenStack Magnum is not known for keeping up with Kubernetes versions.

Based on https://www.abcdesktop.io/2.0/features/, it is not explicitly clear to me whether I can deploy version 3.0 on Kubernetes v1.23 or not. Would you please clarify this?

best regards, Vinay

alexandredevely commented 11 months ago

Hello Vinay,

In release 2.0 an application runs as a Docker container or as a Kubernetes pod, so pyos needs the dockerd socket. In release 3.0 the dependency on dockerd has been removed, and all you need is a Kubernetes cluster. An application runs as an ephemeral container or as a pod. Why did I change that? Because dockershim has been removed from Kubernetes: https://kubernetes.io/blog/2020/12/02/dockershim-faq/

About your case: Ephemeral Containers are available as a beta feature in Kubernetes 1.23, so it should work.
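If you want to double-check before migrating, here is a minimal smoke test with a throwaway pod (hypothetical names, and it assumes kubectl >= 1.23):

# create a scratch pod, then attach an ephemeral debug container to it
kubectl run probe-test --image=busybox:1.36 -- sleep 3600
kubectl debug -it probe-test --image=busybox:1.36 --target=probe-test -- sh
# if the debug shell starts, the API server admits ephemeral containers
kubectl delete pod probe-test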

For your information, per https://endoflife.date/kubernetes, support for Kubernetes 1.24 ended on 28 Jul 2023.

See you,

Alexandre

drvgautam commented 11 months ago

Hi Alexandre,

Thanks for the note. I tried deploying abcdesktop on the openstack-magnum-k8 cluster and it seems to work, except that the nginx-od pod is repeatedly failing.

loginansatt01:~/abc-promo$ kubectl get pods -n abcdesktop
NAME                           READY   STATUS             RESTARTS       AGE
memcached-od-78578c879-bb8qq   1/1     Running            0              164m
mongodb-od-5b4dd4765f-ptw2j    1/1     Running            0              164m
nginx-od-788c97cdc9-b4gbq      0/1     CrashLoopBackOff   36 (57s ago)   164m
openldap-od-65759b74dc-tbvfg   1/1     Running            0              164m
pyos-od-7d5d9457cf-jw6nk       1/1     Running            0              164m
speedtest-od-c94b56c88-48cvq   1/1     Running            0              164m

The log (attached below) shows the error cp: cannot create regular file '/etc/nginx/sites-enabled/default': Read-only file system. I checked the manifest file abcdesktop-3.0.yaml and the mounted container path below. I understand that the mount is set as readOnly: true, which is standard practice to minimize the attack surface for containerized workloads in Kubernetes.

- name: default-config
  mountPath: /etc/nginx/sites-enabled/default
  subPath: default
  readOnly: true
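For reference, the effective mount flags on the live deployment can be listed with something like this (assuming nginx is the first container in the spec):

kubectl get deployment nginx-od -n abcdesktop -o jsonpath='{.spec.template.spec.containers[0].volumeMounts}'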

I could not resolve this issue, and I also remember that there was no such issue in the DO deployment. The logs of nginx-od are attached below. I would appreciate it if you could see what's wrong here.

loginansatt01:~/abc-promo$ kubectl logs nginx-od-788c97cdc9-b4gbq -n abcdesktop
Resolver for nginx is 10.254.0.10
New site configuration
cp: cannot create regular file '/etc/nginx/sites-enabled/default': Read-only file system
lua_package_path "/usr/local/share/lua/5.1/?.lua;;";
types {
    # Web fonts
    application/font-woff2               woff2;
    application/-font-ttf                ttc ttf;
    font/opentype                        otf;
}
server {
    resolver 'kube-dns.kube-system.svc.cluster.local';
    set $my_speedtest 'speedtest.abcdesktop.svc.cluster.local';
    set $my_proxy 'pyos.abcdesktop.svc.cluster.local';
    listen 80;
    server_name _;
    root /var/webModules;
    index index.html index.htm;
    # default abcdesktop.io oc.user tcp port
    set $pulseaudio_http_port               4714;
    set $ws_tcp_bridge_tcp_port             6081;
    set $api_service_tcp_port               8000;
    set $filemanager_bridge_tcp_port        29780;
    set $xterm_tcp_port                     29781;
    set $printerfile_service_tcp_port       29782;
    set $file_service_tcp_port              29783;
    set $broadcast_tcp_port                 29784;
    set $lync_service_tcp_port              29785;
    set $spawner_service_tcp_port           29786;
    set $janus_service_tcp_port     29787; 
    # uncomment to use env var
    # set_by_lua  $filemanager_bridge_tcp_port 'return os.getenv("FILEMANAGER_BRIDGE_TCP_PORT")';
    # set_by_lua  $broadcast_tcp_port 'return os.getenv("BROADCAST_SERVICE_TCP_PORT")';
    # set_by_lua  $ws_tcp_bridge_tcp_port 'return os.getenv("WS_TCP_BRIDGE_SERVICE_TCP_PORT")';
    # set_by_lua  $spawner_service_tcp_port 'return os.getenv("SPAWNER_SERVICE_TCP_PORT")';
    # set_by_lua  $xterm_tcp_port 'return os.getenv("XTERM_TCP_PORT")';
    # set_by_lua  $file_service_tcp_port 'return os.getenv("FILE_SERVICE_TCP_PORT")';
    # set_by_lua  $pulseaudio_http_port 'return os.getenv("PULSEAUDIO_HTTP_PORT")';
    location /nstatus {
             # allow 127.0.0.1;
             # deny all;
             stub_status;
    }

    include route.conf;
}
running standart configuration file
alexandredevely commented 11 months ago

Hi Vinay,

I agree with your security point of view. The nginx configuration is generated to update the resolver IP address, using a simple sed command:

    RESOLVER=$(grep -m 1 nameserver /etc/resolv.conf | awk '{ print $2 }')
    echo Resolver for nginx is $RESOLVER
    cp /etc/nginx/sites-enabled/default /tmp/default
    sed -i "s/127.0.0.11/$RESOLVER/g" /tmp/default
    # sed -i "s/pyos/pyos.abcdesktop.svc.cluster.local./g" /tmp/default
    echo New site configuration
    cp /tmp/default /etc/nginx/sites-enabled/default

Out of the box, nginx doesn't support using environment variables inside most configuration blocks. I will try to use the set_by_lua feature to build the nginx template config.
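Another option, only a sketch with assumed mount paths (not what the current image does), would be to keep the ConfigMap mount read-only but stage it elsewhere, and let the startup script sed into a writable emptyDir:

volumes:
  - name: default-config
    configMap:
      name: nginx-config
  - name: sites-enabled
    emptyDir: {}
containers:
  - name: nginx
    volumeMounts:
      - name: default-config
        mountPath: /etc/nginx/templates      # read-only source for the template
        readOnly: true
      - name: sites-enabled
        mountPath: /etc/nginx/sites-enabled  # writable target for the sed output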

Thank you for your security point of view.

See you

Alexandre

alexandredevely commented 11 months ago

Hi Vinay,

I agree with this issue. If we define the nginx config file as a ConfigMap, we don't need to update it at startup. I've fixed that point in the abcdesktopio/oc.nginx:3.0 and abcdesktopio/oc.nginx:3.1 images.

Please restart the nginx pod or deployment

kubectl rollout restart  deployment nginx-od -n abcdesktop
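To verify that the pod came back with the new image, a check such as:

kubectl get pods -l run=nginx-od -n abcdesktop -o jsonpath='{.items[*].status.containerStatuses[*].image}'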

Let me know if this new image fixes your issue.

Thank you again for your message

best regards,

Alexandre

drvgautam commented 11 months ago

Hi Alexandre,

Thanks for looking into the issue. I no longer see the previous error after restarting, but the nginx pod restarts every minute or so. The related pod's description and information are attached below. The nginx pod has restarted thrice over a period of 13 minutes.

memcached-od-78578c879-67t5v   1/1     Running   0               13m
mongodb-od-5b4dd4765f-dv8kl    1/1     Running   0               13m
nginx-od-788c97cdc9-jvt8d      1/1     Running   3 (4m44s ago)   13m
openldap-od-65759b74dc-5fgtg   1/1     Running   0               13m
pyos-od-7d5d9457cf-cdg9j       1/1     Running   0               13m
speedtest-od-c94b56c88-f7rtr   1/1     Running   0               13m
loginansatt01:~/abc-promo$ kubectl describe pod nginx-od-788c97cdc9-jvt8d -n abcdesktop
Name:             nginx-od-788c97cdc9-jvt8d
Namespace:        abcdesktop
Priority:         0
Service Account:  default
Node:             abc-promo-pr5do2wkkd56-node-0/10.0.0.145
Start Time:       Tue, 10 Oct 2023 14:55:42 +0200
Labels:           name=nginx-od
                  netpol/dns=true
                  netpol/memcached=true
                  netpol/ocuser=true
                  netpol/pyos=true
                  netpol/speedtest=true
                  pod-template-hash=788c97cdc9
                  run=nginx-od
                  type=frontend
Annotations:      cni.projectcalico.org/containerID: 8acd46f6080ec587237d30d7ba51cee3f176505cb966c8095f564fac9ce13e10
                  cni.projectcalico.org/podIP: 10.100.113.201/32
                  cni.projectcalico.org/podIPs: 10.100.113.201/32
                  kubernetes.io/psp: magnum.privileged
Status:           Running
IP:               10.100.113.201
IPs:
  IP:           10.100.113.201
Controlled By:  ReplicaSet/nginx-od-788c97cdc9
Containers:
  nginx:
    Container ID:   containerd://19908bf13a2b3960ae27fc48fd5e809ff03415d305909da5649c8aee34c5f615
    Image:          abcdesktopio/oc.nginx:3.0
    Image ID:       docker.io/abcdesktopio/oc.nginx@sha256:36615278cfdf45a278159b51876ecd4cdbbea73618d0ed6d80b87cb2e4346c4e
    Ports:          80/TCP, 443/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Running
      Started:      Tue, 10 Oct 2023 15:02:22 +0200
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Tue, 10 Oct 2023 14:59:46 +0200
      Finished:     Tue, 10 Oct 2023 15:02:13 +0200
    Ready:          True
    Restart Count:  2
    Limits:
      cpu:     2
      memory:  512Mi
    Requests:
      cpu:     250m
      memory:  64Mi
    Liveness:  http-get http://:80/ delay=0s timeout=1s period=10s #success=1 #failure=1
    Startup:   http-get http://:80/ delay=0s timeout=1s period=10s #success=1 #failure=5
    Environment:
      JWT_DESKTOP_PAYLOAD_PRIVATE_KEY:  /config.payload/abcdesktop_jwt_desktop_payload_private_key.pem
      JWT_DESKTOP_SIGNING_PUBLIC_KEY:   /config.signing/abcdesktop_jwt_desktop_signing_public_key.pem
      NODE_NAME:                         (v1:spec.nodeName)
      POD_NAME:                         nginx-od-788c97cdc9-jvt8d (v1:metadata.name)
      POD_NAMESPACE:                    abcdesktop (v1:metadata.namespace)
      POD_IP:                            (v1:status.podIP)
    Mounts:
      /config.payload from jwtpayloadkeys (ro)
      /config.signing from jwtsigningkeys (ro)
      /etc/nginx/sites-enabled/default from default-config (ro,path="default")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rpzmg (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  jwtsigningkeys:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  abcdesktopjwtdesktopsigning
    Optional:    false
  jwtpayloadkeys:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  abcdesktopjwtdesktoppayload
    Optional:    false
  default-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nginx-config
    Optional:  false
  kube-api-access-rpzmg:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                   From               Message
  ----     ------     ----                  ----               -------
  Normal   Scheduled  8m31s                 default-scheduler  Successfully assigned abcdesktop/nginx-od-788c97cdc9-jvt8d to abc-promo-pr5do2wkkd56-node-0
  Normal   Pulled     8m29s                 kubelet            Successfully pulled image "abcdesktopio/oc.nginx:3.0" in 1.258776329s (1.258805854s including waiting)
  Normal   Pulled     4m28s                 kubelet            Successfully pulled image "abcdesktopio/oc.nginx:3.0" in 1.669156359s (1.669175252s including waiting)
  Normal   Pulling    2m (x3 over 8m30s)    kubelet            Pulling image "abcdesktopio/oc.nginx:3.0"
  Normal   Pulled     114s                  kubelet            Successfully pulled image "abcdesktopio/oc.nginx:3.0" in 6.096894509s (6.096910502s including waiting)
  Normal   Created    112s (x3 over 8m29s)  kubelet            Created container nginx
  Normal   Started    112s (x3 over 8m29s)  kubelet            Started container nginx
  Warning  Unhealthy  36s (x3 over 5m7s)    kubelet            Liveness probe failed: Get "http://10.100.113.201:80/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Normal   Killing    36s (x3 over 5m7s)    kubelet            Container nginx failed liveness probe, will be restarted
drvgautam commented 11 months ago

The following may or may not be a related issue, but the logs of the pyos pod also show a PodSecurityPolicy error, which prevents logging in either anonymously or with the ldap dummy names.

The trailing end of the log is attached below.

loginansatt01:~/abc-promo$ kubectl logs pyos-od-7d5d9457cf-cdg9j -n abcdesktop
2023-10-10 13:15:01 rest [DEBUG  ] kubernetes.client.rest.request:ae2108e2-70e8-461c-a1d5-7000bae6ccef response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"anonymous-0634f\" is forbidden: PodSecurityPolicy: unable to admit pod: []","reason":"Forbidden","details":{"name":"anonymous-0634f","kind":"pods"},"code":403}

2023-10-10 13:15:01 orchestrator [ERROR  ] oc.od.orchestrator.ODOrchestratorKubernetes.createdesktop:ae2108e2-70e8-461c-a1d5-7000bae6ccef (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'bc76bd68-d508-4beb-8b83-ac3a51eb6e10', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Kubernetes-Pf-Flowschema-Uid': '6379e6a8-db71-41bd-bc37-03cc81a549cf', 'X-Kubernetes-Pf-Prioritylevel-Uid': '20c71495-2702-473e-b00d-27b0e473a170', 'Date': 'Tue, 10 Oct 2023 13:15:01 GMT', 'Content-Length': '246'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"anonymous-0634f\" is forbidden: PodSecurityPolicy: unable to admit pod: []","reason":"Forbidden","details":{"name":"anonymous-0634f","kind":"pods"},"code":403}

2023-10-10 13:15:01 composer [DEBUG  ] oc.od.composer.on_desktoplaunchprogress_info:ae2108e2-70e8-461c-a1d5-7000bae6ccef 
2023-10-10 13:15:01 composer [ERROR  ] oc.od.composer.opendesktop:ae2108e2-70e8-461c-a1d5-7000bae6ccef Cannot create a new desktop return desktop=e.Create pod failed Forbidden {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"anonymous-0634f\" is forbidden: PodSecurityPolicy: unable to admit pod: []","reason":"Forbidden","details":{"name":"anonymous-0634f","kind":"pods"},"code":403}

2023-10-10 13:15:01 od [INFO   ] __main__.trace_response:ae2108e2-70e8-461c-a1d5-7000bae6ccef /composer/launchdesktop b'{"status": 500, "result": null, "error": "e.Create pod failed Forbidden {\\"kind\\":\\"Status\\",\\"apiVersion\\":\\"v1\\",\\"metadata\\":{},\\"status\\":\\"Failure\\",\\"message\\":\\"pods \\\\\\"anonymous-0634f\\\\\\" is forbidden: PodSecurityPolicy: unable to admit pod: []\\",\\"reason\\":\\"Forbidden\\",\\"details\\":{\\"name\\":\\"anonymous-0634f\\",\\"kind\\":\\"pods\\"},\\"code\\":403}\\n"}'
2023-10-10 13:15:04 od [INFO   ] __main__.trace_request:anonymous /healthz
2023-10-10 13:15:14 od [INFO   ] __main__.trace_request:anonymous /healthz
2023-10-10 13:15:24 od [INFO   ] __main__.trace_request:anonymous /healthz
2023-10-10 13:15:34 od [INFO   ] __main__.trace_request:anonymous /healthz
2023-10-10 13:15:44 od [INFO   ] __main__.trace_request:anonymous /healthz
alexandredevely commented 11 months ago

Hi Vinay,

Nginx works fine. A PodSecurityPolicy has denied the create pod call.
Could you send me more details than PodSecurityPolicy: unable to admit pod: []","reason":"Forbidden"? In most cases, the error comes from hostPath usage.
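For example, these commands usually show which policy applies; magnum.privileged appears in your pod annotations above, so I use it here as an assumption:

kubectl get psp
kubectl auth can-i use podsecuritypolicies/magnum.privileged --as=system:serviceaccount:abcdesktop:pyos-serviceaccount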

Alexandre

alexandredevely commented 11 months ago

Hello Vinay,

Could you please update the od.config file to set the option desktop.homedirectorytype to None?

# desktop.homedirectorytype: 'persistentVolumeClaim'
# desktop.homedirectorytype: 'hostPath'
# desktop.homedirectorytype: None
desktop.homedirectorytype: None

Setting desktop.homedirectorytype to None creates the home directory volume as an emptyDir.

Then, please update the ConfigMap abcdesktop-config and restart the pyos pod:

kubectl create -n abcdesktop configmap abcdesktop-config --from-file=od.config -o yaml --dry-run=client | kubectl replace -n abcdesktop -f -
kubectl delete pods -l run=pyos-od -n abcdesktop
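Then check that a fresh pyos pod is running; it uses the same run=pyos-od label as the delete command above:

kubectl get pods -l run=pyos-od -n abcdesktop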

I hope it will pass the Kubernetes pod security admission.

See you

Alexandre

drvgautam commented 11 months ago

Hi Alexandre,

Thanks for the help. The pyos pod also crashes after setting desktop.homedirectorytype to None in od.config. In fact, both pods, nginx and pyos, are behaving alike, except that the nginx pod changes to the READY state for some time and then crashes, while pyos never gets READY. Could it be that the Pod Security Policies set by my cluster admins are blocking the pods? I have no idea, but I can ask them if that's the issue.

best regards, Vinay

The logs from pyos and nginx are attached below.

loginansatt01:~/abc-promo$ kubectl logs pyos-od-7d5d9457cf-b2pxw -n abcdesktop

2023-10-11 07:38:36,288 [INFO   ] oc.logging.init_logging: Initializing logging subsystem
2023-10-11 07:38:36,301 [INFO   ] oc.logging.load_config: Reading cherrypy configuration section 'global/logging': path = od.config
2023-10-11 07:38:36,485 [INFO   ] oc.logging.init_logging: Applying configuration
2023-10-11 07:38:36 settings [INFO   ] oc.od.settings.init:internal Init configuration start
2023-10-11 07:38:36 settings [INFO   ] oc.od.settings.load_config:internal Loading configuration file od.config
2023-10-11 07:38:36 settings [DEBUG  ] oc.od.settings.init_dock:internal 
2023-10-11 07:38:36 settings [DEBUG  ] oc.od.settings.init_dock:internal loading dock entry terminal
2023-10-11 07:38:36 settings [DEBUG  ] oc.od.settings.init_dock:internal loading dock entry webshell
2023-10-11 07:38:36 settings [INFO   ] oc.od.settings.init_defaulthostfqdn:internal default_host_url: http://localhost
2023-10-11 07:38:36 settings [INFO   ] oc.od.settings.init_defaulthostfqdn:internal default_server_ipaddr: 127.0.0.1
2023-10-11 07:38:36 settings [INFO   ] oc.od.settings.init_defaulthostfqdn:internal services http request denied: {'autologin': True}
2023-10-11 07:38:36 settings [DEBUG  ] oc.od.settings.parse_provider_configref:internal config planet as use configref_name=ldapconfig
2023-10-11 07:38:36 settings [DEBUG  ] oc.od.settings.parse_provider_configref:internal reading config_ref key planet
2023-10-11 07:38:36 settings [DEBUG  ] oc.od.settings.parse_provider_configref:internal apply update config to planet
2023-10-11 07:38:36 settings [DEBUG  ] oc.od.settings.init_config_stack:internal kubernetes default domain svc.cluster.local=svc.cluster.local
2023-10-11 07:38:36 settings [DEBUG  ] oc.od.settings.init_config_stack:internal abcdesktop domain=abcdesktop.svc.cluster.local
2023-10-11 07:38:36 settings [DEBUG  ] oc.od.settings.get_mongodburl:internal mongodburl is read as mongodb://pyos:Az4MeYWUjZDg4Zjhk@mongodb.abcdesktop.svc.cluster.local
2023-10-11 07:38:36 settings [DEBUG  ] oc.od.settings._resolv:internal trying to gethostbyname mongodb.abcdesktop.svc.cluster.local
2023-10-11 07:38:37 settings [INFO   ] oc.od.settings.get_mongodburl:internal host mongodb.abcdesktop.svc.cluster.local resolved as 10.254.4.250
2023-10-11 07:38:37 settings [INFO   ] oc.od.settings.get_mongodburl:internal mongodburl is set to mongodb://pyos:Az4MeYWUjZDg4Zjhk@mongodb.abcdesktop.svc.cluster.local
2023-10-11 07:38:37 settings [INFO   ] oc.od.settings.init_config_mongodb:internal MongoDB url: mongodb://pyos:Az4MeYWUjZDg4Zjhk@mongodb.abcdesktop.svc.cluster.local
2023-10-11 07:38:37 settings [INFO   ] oc.od.settings.init_config_fail2ban:internal Fail2ban config: {'enable': False, 'banexpireafterseconds': 600, 'failsbeforeban': 5, 'protectednetworks': ['192.168.1.0/24']}
2023-10-11 07:38:37 settings [DEBUG  ] oc.od.settings.init_config_memcached:internal memcachedserver is read as memcached.abcdesktop.svc.cluster.local
2023-10-11 07:38:37 settings [DEBUG  ] oc.od.settings._resolv:internal trying to gethostbyname memcached.abcdesktop.svc.cluster.local
2023-10-11 07:38:37 settings [INFO   ] oc.od.settings.init_config_memcached:internal host memcached.abcdesktop.svc.cluster.local resolved as 10.254.223.103
2023-10-11 07:38:37 settings [INFO   ] oc.od.settings.init_config_memcached:internal memcachedserver is set to memcached.abcdesktop.svc.cluster.local
2023-10-11 07:38:37 settings [INFO   ] oc.od.settings.init_config_memcached:internal memcached connection string is set to memcached.abcdesktop.svc.cluster.local:11211
2023-10-11 07:38:37 settings [DEBUG  ] oc.od.settings.init_desktop:internal 
2023-10-11 07:38:37 settings [INFO   ] oc.od.settings.init_websocketrouting:internal mode is http_origin
2023-10-11 07:38:37 settings [INFO   ] oc.od.settings.init:internal Init configuration done.
2023-10-11 07:38:37 services [INFO   ] oc.od.services.ODServices.init_messageinfo:internal 
2023-10-11 07:38:37 services [INFO   ] oc.od.services.ODServices.init_accounting:internal 
2023-10-11 07:38:37 services [INFO   ] oc.od.services.ODServices.init_datastore:internal 
2023-10-11 07:38:37 services [INFO   ] oc.od.services.ODServices.init_datacache:internal 
2023-10-11 07:38:37 services [INFO   ] oc.od.services.ODServices.init_auth:internal 
2023-10-11 07:38:37 authservice [INFO   ] oc.auth.authservice.ODAuthTool.__init__:internal Adding Auth manager external
2023-10-11 07:38:37 authservice [INFO   ] oc.auth.authservice.ODAuthTool.createmanager:internal createmanager name=external <class 'oc.auth.authservice.ODExternalAuthManager'>
2023-10-11 07:38:37 authservice [INFO   ] oc.auth.authservice.ODAuthTool.__init__:internal Adding Auth manager metaexplicit
2023-10-11 07:38:37 authservice [INFO   ] oc.auth.authservice.ODAuthTool.createmanager:internal createmanager name=metaexplicit <class 'oc.auth.authservice.ODExplicitMetaAuthManager'>
2023-10-11 07:38:37 authservice [INFO   ] oc.auth.authservice.ODAuthTool.__init__:internal Adding Auth manager explicit
2023-10-11 07:38:37 authservice [INFO   ] oc.auth.authservice.ODAuthTool.createmanager:internal createmanager name=explicit <class 'oc.auth.authservice.ODExplicitAuthManager'>
2023-10-11 07:38:37 authservice [INFO   ] oc.auth.authservice.ODExplicitAuthManager.initproviders:internal Adding provider name planet 
2023-10-11 07:38:37 authservice [DEBUG  ] oc.auth.authservice.ODExplicitAuthManager.createprovider:internal {'self': <oc.auth.authservice.ODExplicitAuthManager object at 0x7f91484d9d00>, 'name': 'planet', 'config': {'config_ref': 'ldapconfig', 'enabled': True, 'default': True, 'ldap_timeout': 15, 'ldap_protocol': 'ldap', 'ldap_basedn': 'ou=people,dc=planetexpress,dc=com', 'servers': ['openldap.abcdesktop.svc.cluster.local'], 'secure': False, 'serviceaccount': {'login': 'cn=admin,dc=planetexpress,dc=com', 'password': 'GoodNewsEveryone'}, 'policies': {'acls': None, 'rules': {'rule-dummy': {'conditions': [{'boolean': True, 'expected': True}], 'expected': True, 'label': 'labeltrue'}, 'rule-ship': {'conditions': [{'memberOf': 'cn=ship_crew,ou=people,dc=planetexpress,dc=com', 'expected': True}], 'expected': True, 'label': 'shipcrew'}, 'rule-test': {'conditions': [{'memberOf': 'cn=admin_staff,ou=people,dc=planetexpress,dc=com', 'expected': True}], 'expected': True, 'label': 'adminstaff'}}}}}
2023-10-11 07:38:37 authservice [INFO   ] oc.auth.authservice.ODLdapAuthProvider.__init__:internal 
2023-10-11 07:38:37 authservice [INFO   ] oc.auth.authservice.ODAuthTool.__init__:internal Adding Auth manager implicit
2023-10-11 07:38:37 authservice [INFO   ] oc.auth.authservice.ODAuthTool.createmanager:internal createmanager name=implicit <class 'oc.auth.authservice.ODImplicitAuthManager'>
2023-10-11 07:38:37 authservice [INFO   ] oc.auth.authservice.ODImplicitAuthManager.initproviders:internal Adding provider name anonymous 
2023-10-11 07:38:37 authservice [DEBUG  ] oc.auth.authservice.ODImplicitAuthManager.createprovider:internal {'self': <oc.auth.authservice.ODImplicitAuthManager object at 0x7f9148558370>, 'name': 'anonymous', 'config': {'displayname': 'Anonymous', 'caption': 'Have a look !', 'userid': 'anonymous', 'username': 'Anonymous', 'policies': {'acl': {'permit': ['all']}, 'rules': {'rule-net-home': {'conditions': [{'network': '10.0.0.0/8', 'expected': True}], 'expected': True, 'label': 'tennetwork'}}}}}
2023-10-11 07:38:37 services [INFO   ] oc.od.services.ODServices.init_internaldns:internal 
2023-10-11 07:38:37 services [INFO   ] oc.od.services.ODServices.init_jwtdesktop:internal 
2023-10-11 07:38:38 services [INFO   ] oc.od.services.ODServices.init_locator:internal 
Error Opening file /usr/share/GeoIP/GeoIPCity.dat
2023-10-11 07:38:38 locator [ERROR  ] oc.od.locator.__init__:internal GeoIP error in open file /usr/share/GeoIP/GeoIPCity.dat
2023-10-11 07:38:38 locator [ERROR  ] oc.od.locator.__init__:internal [Errno 2] No such file or directory: '/usr/share/GeoIP/GeoIPCity.dat'
2023-10-11 07:38:38 services [INFO   ] oc.od.services.ODServices.init_webrtc:internal 
2023-10-11 07:38:38 services [INFO   ] oc.od.services.ODServices.init_prelogin:internal 
2023-10-11 07:38:38 services [INFO   ] oc.od.services.ODServices.init_logmein:internal 
2023-10-11 07:38:38 fail2ban [DEBUG  ] oc.od.fail2ban.ODFail2ban.init_collection:internal fail2ban ipaddr
2023-10-11 07:38:38 datastore [DEBUG  ] oc.datastore.ODMongoDatastoreClient.createclient:internal databasename=fail2ban
2023-10-11 07:38:38 datastore [DEBUG  ] oc.datastore.ODMongoDatastoreClient.createclient:internal createclient MongoClient mongodb://pyos:Az4MeYWUjZDg4Zjhk@mongodb.abcdesktop.svc.cluster.local/fail2ban?authSource=fail2ban
2023-10-11 07:38:38 fail2ban [DEBUG  ] oc.od.fail2ban.ODFail2ban.init_collection:internal fail2ban login
2023-10-11 07:38:38 datastore [DEBUG  ] oc.datastore.ODMongoDatastoreClient.createclient:internal databasename=fail2ban
2023-10-11 07:38:38 datastore [DEBUG  ] oc.datastore.ODMongoDatastoreClient.createclient:internal createclient MongoClient mongodb://pyos:Az4MeYWUjZDg4Zjhk@mongodb.abcdesktop.svc.cluster.local/fail2ban?authSource=fail2ban
2023-10-11 07:38:38 services [INFO   ] oc.od.services.init_infra:internal 
2023-10-11 07:38:38 rest [DEBUG  ] kubernetes.client.rest.request:internal response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"nodes is forbidden: User \"system:serviceaccount:abcdesktop:pyos-serviceaccount\" cannot list resource \"nodes\" in API group \"\" at the cluster scope","reason":"Forbidden","details":{"kind":"nodes"},"code":403}

2023-10-11 07:38:38 orchestrator [ERROR  ] oc.od.orchestrator.ODOrchestratorKubernetes.is_list_node_enabled:internal (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'cf86e854-3930-43c5-b8fb-5045892e7e9d', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '6379e6a8-db71-41bd-bc37-03cc81a549cf', 'X-Kubernetes-Pf-Prioritylevel-Uid': '20c71495-2702-473e-b00d-27b0e473a170', 'Date': 'Wed, 11 Oct 2023 07:38:38 GMT', 'Content-Length': '292'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"nodes is forbidden: User \"system:serviceaccount:abcdesktop:pyos-serviceaccount\" cannot list resource \"nodes\" in API group \"\" at the cluster scope","reason":"Forbidden","details":{"kind":"nodes"},"code":403}

2023-10-11 07:38:38 services [WARNING] oc.od.services.init_infra:internal Kubernetes service account can NOT query list_node
2023-10-11 07:38:38 services [INFO   ] oc.od.services.ODServices.init_applist:internal 
2023-10-11 07:38:38 datastore [DEBUG  ] oc.datastore.ODMongoDatastoreClient.createclient:internal databasename=applications
2023-10-11 07:38:38 datastore [DEBUG  ] oc.datastore.ODMongoDatastoreClient.createclient:internal createclient MongoClient mongodb://pyos:Az4MeYWUjZDg4Zjhk@mongodb.abcdesktop.svc.cluster.local/applications?authSource=applications
2023-10-11 07:38:38 apps [DEBUG  ] oc.od.apps.ODApps.cached_applist:internal 
2023-10-11 07:38:38 apps [DEBUG  ] oc.od.apps.ODApps.build_applist:internal 
2023-10-11 07:38:38 datastore [DEBUG  ] oc.datastore.ODMongoDatastoreClient.createclient:internal databasename=applications
2023-10-11 07:38:38 datastore [DEBUG  ] oc.datastore.ODMongoDatastoreClient.createclient:internal createclient MongoClient mongodb://pyos:Az4MeYWUjZDg4Zjhk@mongodb.abcdesktop.svc.cluster.local/applications?authSource=applications
2023-10-11 07:38:38 services [INFO   ] oc.od.services.ODServices.init_kuberneteswatcher:internal 
2023-10-11 07:38:38 kuberneteswatcher [DEBUG  ] oc.od.kuberneteswatcher.ODKubernetesWatcher.__init__:internal ODKubernetesWatcher use namespace=abcdesktop
2023-10-11 07:38:38 od [INFO   ] __main__.run_server:internal Starting cherrypy service...
2023-10-11 07:38:38 pyutils [DEBUG  ] oc.pyutils.import_classes:internal Loading module in directory ['/var/pyos/controllers']
2023-10-11 07:38:38 pyutils [DEBUG  ] oc.pyutils.import_classes:internal Importing module 'controllers.accounting_controller'
2023-10-11 07:38:39 pyutils [DEBUG  ] oc.pyutils.import_classes:internal Importing module 'controllers.auth_controller'
2023-10-11 07:38:42 pyutils [DEBUG  ] oc.pyutils.import_classes:internal Importing module 'controllers.composer_controller'
2023-10-11 07:38:43 pyutils [DEBUG  ] oc.pyutils.import_classes:internal Importing module 'controllers.core_controller'
2023-10-11 07:38:45 pyutils [DEBUG  ] oc.pyutils.import_classes:internal Importing module 'controllers.key_controller'
2023-10-11 07:38:45 pyutils [DEBUG  ] oc.pyutils.import_classes:internal Importing module 'controllers.manager_controller'
2023-10-11 07:38:46 pyutils [DEBUG  ] oc.pyutils.import_classes:internal Importing module 'controllers.store_controller'
2023-10-11 07:38:46 pyutils [DEBUG  ] oc.pyutils.import_classes:internal Importing module 'controllers.tracker_controller'
2023-10-11 07:38:46 pyutils [DEBUG  ] oc.pyutils.import_classes:internal Importing module 'controllers.user_controller'
2023-10-11 07:38:46 pyutils [DEBUG  ] oc.pyutils.import_classes:internal Importing module 'controllers.webrtc_controller'
2023-10-11 07:38:46 cherrypy [DEBUG  ] oc.cherrypy.create_controllers:internal instancing class AccountingController
2023-10-11 07:38:46 cherrypy [DEBUG  ] oc.cherrypy.create_controllers:internal instancing class AuthController
2023-10-11 07:38:46 auth_controller [INFO   ] controllers.auth_controller.AuthController.__init__:internal config_controller=None
2023-10-11 07:38:46 cherrypy [DEBUG  ] oc.cherrypy.create_controllers:internal instancing class ComposerController
2023-10-11 07:38:46 cherrypy [DEBUG  ] oc.cherrypy.create_controllers:internal instancing class CoreController
2023-10-11 07:38:46 cherrypy [DEBUG  ] oc.cherrypy.create_controllers:internal instancing class KeyController
2023-10-11 07:38:46 cherrypy [DEBUG  ] oc.cherrypy.create_controllers:internal instancing class ManagerController
2023-10-11 07:38:46 cherrypy [DEBUG  ] oc.cherrypy.create_controllers:internal instancing class StoreController
2023-10-11 07:38:46 cherrypy [DEBUG  ] oc.cherrypy.create_controllers:internal instancing class TrackerController
2023-10-11 07:38:46 cherrypy [DEBUG  ] oc.cherrypy.create_controllers:internal instancing class UserController
2023-10-11 07:38:46 cherrypy [DEBUG  ] oc.cherrypy.create_controllers:internal instancing class WebRTCController
2023-10-11 07:38:46 _cplogging [INFO   ] cherrypy.error.error:internal  ENGINE Bus STARTING
2023-10-11 07:38:46 od [DEBUG  ] __main__.start:anonymous ODCherryWatcher start events, skipping
2023-10-11 07:38:46 services [DEBUG  ] oc.od.services.ODServices.start:anonymous 
2023-10-11 07:38:46 kuberneteswatcher [DEBUG  ] oc.od.kuberneteswatcher.ODKubernetesWatcher.start:anonymous starting watcher thread
2023-10-11 07:38:46 kuberneteswatcher [DEBUG  ] oc.od.kuberneteswatcher.ODKubernetesWatcher.loopforevent:anonymous 
2023-10-11 07:38:46 kuberneteswatcher [DEBUG  ] oc.od.kuberneteswatcher.ODKubernetesWatcher.loopforevent:anonymous loopforevent start inifity loop
2023-10-11 07:38:46 _cplogging [INFO   ] cherrypy.error.error:anonymous  ENGINE Started monitor thread 'Autoreloader'.
2023-10-11 07:38:47 _cplogging [INFO   ] cherrypy.error.error:anonymous  ENGINE Serving on http://127.0.0.1:8080
2023-10-11 07:38:47 _cplogging [INFO   ] cherrypy.error.error:anonymous  ENGINE Serving on http://0.0.0.0:8000
2023-10-11 07:38:47 _cplogging [INFO   ] cherrypy.error.error:anonymous  ENGINE Bus STARTED
2023-10-11 07:38:47 od [INFO   ] __main__.run_server:anonymous Waiting for requests.

From the nginx pod:


loginansatt01:~/abc-promo$ kubectl describe pod nginx-od-788c97cdc9-jvt8d -n abcdesktop
Name:             nginx-od-788c97cdc9-jvt8d
Namespace:        abcdesktop
Priority:         0
Service Account:  default
Node:             abc-promo-pr5do2wkkd56-node-0/10.0.0.145
Start Time:       Tue, 10 Oct 2023 14:55:42 +0200
Labels:           name=nginx-od
                  netpol/dns=true
                  netpol/memcached=true
                  netpol/ocuser=true
                  netpol/pyos=true
                  netpol/speedtest=true
                  pod-template-hash=788c97cdc9
                  run=nginx-od
                  type=frontend
Annotations:      cni.projectcalico.org/containerID: 8acd46f6080ec587237d30d7ba51cee3f176505cb966c8095f564fac9ce13e10
                  cni.projectcalico.org/podIP: 10.100.113.201/32
                  cni.projectcalico.org/podIPs: 10.100.113.201/32
                  kubernetes.io/psp: magnum.privileged
Status:           Running
IP:               10.100.113.201
IPs:
  IP:           10.100.113.201
Controlled By:  ReplicaSet/nginx-od-788c97cdc9
Containers:
  nginx:
    Container ID:   containerd://11e1aa6934765f8c508861319c15491230e717bec0f5e2aef235c9f50e2971b4
    Image:          abcdesktopio/oc.nginx:3.0
    Image ID:       docker.io/abcdesktopio/oc.nginx@sha256:36615278cfdf45a278159b51876ecd4cdbbea73618d0ed6d80b87cb2e4346c4e
    Ports:          80/TCP, 443/TCP
    Host Ports:     0/TCP, 0/TCP
    State:          Running
      Started:      Wed, 11 Oct 2023 09:38:06 +0200
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Wed, 11 Oct 2023 09:36:41 +0200
      Finished:     Wed, 11 Oct 2023 09:38:02 +0200
    Ready:          True
    Restart Count:  217
    Limits:
      cpu:     2
      memory:  512Mi
    Requests:
      cpu:     250m
      memory:  64Mi
    Liveness:  http-get http://:80/ delay=0s timeout=1s period=10s #success=1 #failure=1
    Startup:   http-get http://:80/ delay=0s timeout=1s period=10s #success=1 #failure=5
    Environment:
      JWT_DESKTOP_PAYLOAD_PRIVATE_KEY:  /config.payload/abcdesktop_jwt_desktop_payload_private_key.pem
      JWT_DESKTOP_SIGNING_PUBLIC_KEY:   /config.signing/abcdesktop_jwt_desktop_signing_public_key.pem
      NODE_NAME:                         (v1:spec.nodeName)
      POD_NAME:                         nginx-od-788c97cdc9-jvt8d (v1:metadata.name)
      POD_NAMESPACE:                    abcdesktop (v1:metadata.namespace)
      POD_IP:                            (v1:status.podIP)
    Mounts:
      /config.payload from jwtpayloadkeys (ro)
      /config.signing from jwtsigningkeys (ro)
      /etc/nginx/sites-enabled/default from default-config (ro,path="default")
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-rpzmg (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             True 
  ContainersReady   True 
  PodScheduled      True 
Volumes:
  jwtsigningkeys:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  abcdesktopjwtdesktopsigning
    Optional:    false
  jwtpayloadkeys:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  abcdesktopjwtdesktoppayload
    Optional:    false
  default-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      nginx-config
    Optional:  false
  kube-api-access-rpzmg:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   Burstable
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                     From     Message
  ----     ------     ----                    ----     -------
  Warning  Unhealthy  13m (x215 over 18h)     kubelet  Liveness probe failed: Get "http://10.100.113.201:80/": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Warning  BackOff    8m45s (x2034 over 18h)  kubelet  Back-off restarting failed container
alexandredevely commented 11 months ago

Hello Vinay,

The pyos pod says it's OK:

2023-10-11 07:38:47 od [INFO   ] __main__.run_server:anonymous Waiting for requests.

The nginx pod crashes, and it should not.

This can occur if the resolver FQDN kube-dns.kube-system.svc.cluster.local can't be resolved.
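A quick way to test that resolution from inside the cluster is a throwaway busybox pod; note that your network policies may require the netpol/dns=true label that the nginx pod carries, so I add it here as an assumption:

kubectl run dnstest -it --rm --restart=Never -n abcdesktop --labels=netpol/dns=true --image=busybox:1.36 -- nslookup kube-dns.kube-system.svc.cluster.local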

The ConfigMap should be:

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config
  namespace: abcdesktop
  labels:
    abcdesktop/role: nginx-config
data:
  default: | 
    lua_package_path "/usr/local/share/lua/5.1/?.lua;;";
    types {
        # Web fonts
        application/font-woff2               woff2;
        application/-font-ttf                ttc ttf;
        font/opentype                        otf;
    }
    server {
        resolver 'kube-dns.kube-system.svc.cluster.local';
        set $my_speedtest 'speedtest.abcdesktop.svc.cluster.local';
        set $my_proxy 'pyos.abcdesktop.svc.cluster.local';
        listen 80;
        server_name _;
        root /var/webModules;
        index index.html index.htm;
        # default abcdesktop.io oc.user tcp port
        set $pulseaudio_http_port               4714;
        set $ws_tcp_bridge_tcp_port             6081;
        set $api_service_tcp_port               8000;
        set $filemanager_bridge_tcp_port        29780;
        set $xterm_tcp_port                     29781;
        set $printerfile_service_tcp_port       29782;
        set $file_service_tcp_port              29783;
        set $broadcast_tcp_port                 29784;
        set $lync_service_tcp_port              29785;
        set $spawner_service_tcp_port           29786;
        set $janus_service_tcp_port     29787; 
        # uncomment to use env var
        # set_by_lua  $filemanager_bridge_tcp_port 'return os.getenv("FILEMANAGER_BRIDGE_TCP_PORT")';
        # set_by_lua  $broadcast_tcp_port 'return os.getenv("BROADCAST_SERVICE_TCP_PORT")';
        # set_by_lua  $ws_tcp_bridge_tcp_port 'return os.getenv("WS_TCP_BRIDGE_SERVICE_TCP_PORT")';
        # set_by_lua  $spawner_service_tcp_port 'return os.getenv("SPAWNER_SERVICE_TCP_PORT")';
        # set_by_lua  $xterm_tcp_port 'return os.getenv("XTERM_TCP_PORT")';
        # set_by_lua  $file_service_tcp_port 'return os.getenv("FILE_SERVICE_TCP_PORT")';
        # set_by_lua  $pulseaudio_http_port 'return os.getenv("PULSEAUDIO_HTTP_PORT")';
        location /nstatus {
                 # allow 127.0.0.1;
                 # deny all;
                 stub_status;
        }

        include route.conf;
    }
---

Thank you so much for your feedback and for these troubleshooting commands.

Alexandre

drvgautam commented 11 months ago

Hi Alexandre,

Thanks for the help in troubleshooting the issue. I checked kube-dns.kube-system.svc.cluster.local as below.

> nslookup kube-dns.kube-system.svc.cluster.local
Server:     10.254.0.10
Address:    10.254.0.10#53

I also checked the CoreDNS configuration; it seems to be correct and is set up to handle DNS queries within the cluster.

loginansatt01:~/abc-promo$ kubectl get configmap coredns -n kube-system -o yaml
apiVersion: v1
data:
  Corefile: |
    .:53 {
        errors
        log stdout
        health
        kubernetes cluster.local 10.254.0.0/16 10.100.0.0/16 {
           pods verified
           fallthrough in-addr.arpa ip6.arpa
        }
        prometheus :9153
        forward . /etc/resolv.conf
        cache 30
        loop
        reload
        loadbalance
    }
kind: ConfigMap
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","data":{"Corefile":".:53 {\n    errors\n    log stdout\n    health\n    kubernetes cluster.local 10.254.0.0/16 10.100.0.0/16 {\n       pods verified\n       fallthrough in-addr.arpa ip6.arpa\n    }\n    prometheus :9153\n    forward . /etc/resolv.conf\n    cache 30\n    loop\n    reload\n    loadbalance\n}\n"},"kind":"ConfigMap","metadata":{"annotations":{},"name":"coredns","namespace":"kube-system"}}
  creationTimestamp: "2023-10-06T11:22:54Z"
  name: coredns
  namespace: kube-system
  resourceVersion: "301"
  uid: 647dd8ed-5ed1-4872-9819-47b3335bef16

My nginx ConfigMap is attached below.

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-config
  namespace: abcdesktop
  labels:
    abcdesktop/role: nginx-config
data:
  default: |
    lua_package_path "/usr/local/share/lua/5.1/?.lua;;";
    types {
        # Web fonts
        application/font-woff2               woff2;
        application/-font-ttf                ttc ttf;
        font/opentype                        otf;
    }
    server {
        resolver 'kube-dns.kube-system.svc.cluster.local';
        set $my_speedtest 'speedtest.abcdesktop.svc.cluster.local';
        set $my_proxy 'pyos.abcdesktop.svc.cluster.local';
        listen 80;
        server_name _;
        root /var/webModules;
        index index.html index.htm;
        # default abcdesktop.io oc.user tcp port
        set $pulseaudio_http_port               4714;
        set $ws_tcp_bridge_tcp_port             6081;
        set $api_service_tcp_port               8000;
        set $filemanager_bridge_tcp_port        29780;
        set $xterm_tcp_port                     29781;
        set $printerfile_service_tcp_port       29782;
        set $file_service_tcp_port              29783;
        set $broadcast_tcp_port                 29784;
        set $lync_service_tcp_port              29785;
        set $spawner_service_tcp_port           29786;
        set $janus_service_tcp_port             29787;
        # uncomment to use env var
        # set_by_lua  $filemanager_bridge_tcp_port 'return os.getenv("FILEMANAGER_BRIDGE_TCP_PORT")';
        # set_by_lua  $broadcast_tcp_port 'return os.getenv("BROADCAST_SERVICE_TCP_PORT")';
        # set_by_lua  $ws_tcp_bridge_tcp_port 'return os.getenv("WS_TCP_BRIDGE_SERVICE_TCP_PORT")';
        # set_by_lua  $spawner_service_tcp_port 'return os.getenv("SPAWNER_SERVICE_TCP_PORT")';
        # set_by_lua  $xterm_tcp_port 'return os.getenv("XTERM_TCP_PORT")';
        # set_by_lua  $file_service_tcp_port 'return os.getenv("FILE_SERVICE_TCP_PORT")';
        # set_by_lua  $pulseaudio_http_port 'return os.getenv("PULSEAUDIO_HTTP_PORT")';
        location /nstatus {
                 # allow 127.0.0.1;
                 # deny all;
                 stub_status;
        }

        include route.conf;
    }
alexandredevely commented 11 months ago

Hi Vinay,

I've checked your configuration file, but I don't know what's going wrong with the openstack-magnum-k8 cluster. So I've written a new dedicated troubleshooting page; it's here: https://www.abcdesktop.io/3.0/setup/troubleshooting_core_services/

If the command kubectl logs -l run=nginx-od -n abcdesktop doesn't give you usable information, we update the container to run a sleep 1d command and then start nginx by hand.

Running nginx should give us more details: https://www.abcdesktop.io/3.0/setup/troubleshooting_core_services/#start-the-pod-by-hands

The error.log for nginx is located at /var/log/nginx/error.log.
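For example, once the debug pod is running, something like:

kubectl exec -n abcdesktop deploy/nginx-od -- tail -n 50 /var/log/nginx/error.log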

I hope this troubleshooting page will give you enough detail to dig deeper into the openstack-magnum-k8 issue.

Thank you for your help,

Alexandre

drvgautam commented 11 months ago

Hi Alexandre, Thanks for the note. I sought help from the admins of my OpenStack Magnum installation.
The following quick fix, setting up pyos-role to use the 'magnum.privileged' PodSecurityPolicy that Magnum creates automatically, allowed me to log in anonymously:

kubectl patch role pyos-role -n abcdesktop --type='json' -p='[{"op": "add", "path": "/rules/-", "value": {"apiGroups": ["policy"],"resources": ["podsecuritypolicies"],"verbs": ["use"],"resourceNames": ["magnum.privileged"]} }]'
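For reference, the patched rule can be inspected with:

kubectl get role pyos-role -n abcdesktop -o yaml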

The nginx logs (as described here: https://www.abcdesktop.io/3.0/setup/troubleshooting_core_services/#start-the-pod-by-hands) give the error: "nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)"

regards, Vinay

alexandredevely commented 11 months ago

Hi Vinay,

Thank you for reading the troubleshooting page. If TCP port 80 is in use, it seems that more than one nginx process is running, or we are asking nginx to start in a non-debug deployment.
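A quick check of what is already running inside the container would be (assuming ps is available in the image):

kubectl exec -n abcdesktop deploy/nginx-od -- ps aux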

I have rewritten the page with more steps and comments (I'm sorry for the confusion). All new steps and commands are in bold.

I've replayed these steps on my own server, and I can confirm that you get a ready-to-run nginx pod.

Let's have a look

You patched deployment.apps/nginx-od with debug/nginx-3.0.yaml:

kubectl apply -f https://raw.githubusercontent.com/abcdesktopio/conf/main/kubernetes/debug/nginx-3.0.yaml
deployment.apps/nginx-od configured

At this point, you can delete all nginx pods to make sure that none is still running:

kubectl delete pods -l run=nginx-od -n abcdesktop
pod "nginx-od-666df64f4-whtng" deleted

More than one pod may be deleted.

A new one is created by Kubernetes after a few seconds.

Then check that the nginx pod has been updated and that its status is Running.

The AGE should be recent (a few seconds).

A few seconds is really important ;-)

kubectl get pods  -l run=nginx-od -n abcdesktop
NAME                       READY   STATUS    RESTARTS   AGE
nginx-od-666df64f4-whtng   1/1     Running   0         26s

Now, you can continue and run the command to start nginx manually.

The nginx web service is not started inside the container; only the pod is started, so TCP port 80 is free. We need to get a shell inside the container to start the nginx web service by hand.

Run the command /usr/local/openresty/nginx/sbin/nginx -p /etc/nginx -c nginx.conf -e /var/log/nginx/error.log

kubectl exec -n abcdesktop -it deployment/nginx-od -- bash
root@nginx-od-666df64f4-whtng:/# /usr/local/openresty/nginx/sbin/nginx -p /etc/nginx -c nginx.conf -e /var/log/nginx/error.log

At this point you should see the nginx error message; this is what we need to fix the issue.

Thank you again

Alexandre

drvgautam commented 11 months ago

Hi Alexandre, Thanks for adding more clarity to the previous note. It seems something strange is happening on my side. I still do not get the error.log from the debug/nginx-3.0.yaml.

loginansatt01:~/abcd-promo$ kubectl apply -f https://raw.githubusercontent.com/abcdesktopio/conf/main/kubernetes/debug/nginx-3.0.yaml
deployment.apps/nginx-od configured
loginansatt01:~/abcd-promo$ kubectl delete pods -l run=nginx-od -n abcdesktop
pod "nginx-od-788c97cdc9-kqsmr" deleted
pod "nginx-od-7dfb648c6c-lfbm2" deleted
loginansatt01:~/abcd-promo$ kubectl get pods -n abcdesktop
NAME                           READY   STATUS        RESTARTS   AGE
memcached-od-78578c879-h4xt2   1/1     Running       0          70m
mongodb-od-5b4dd4765f-ffdpx    1/1     Running       0          70m
nginx-od-788c97cdc9-bp9gk      0/1     Terminating   0          118s
nginx-od-7dfb648c6c-n56sg      1/1     Running       0          118s
openldap-od-65759b74dc-mbtqs   1/1     Running       0          70m
pyos-od-7d5d9457cf-8hw86       1/1     Running       0          70m
speedtest-od-c94b56c88-wzbx9   1/1     Running       0          70m
loginansatt01:~/abcd-promo$ kubectl get pods  -l run=nginx-od -n abcdesktop
NAME                        READY   STATUS    RESTARTS   AGE
nginx-od-7dfb648c6c-n56sg   1/1     Running   0          2m35s
loginansatt01:~/abcd-promo$ kubectl exec -n abcdesktop -it deployment/nginx-od -- bash
root@nginx-od-7dfb648c6c-n56sg:/# /usr/local/openresty/nginx/sbin/nginx -p /etc/nginx -c nginx.conf -e /var/log/nginx/error.log

--> There is no output here! When I stop the process with CTRL-Z (which suspends it rather than killing it) and run the command again, I get the following error message...


^Z
[1]+  Stopped                 /usr/local/openresty/nginx/sbin/nginx -p /etc/nginx -c nginx.conf -e /var/log/nginx/error.log
root@nginx-od-7dfb648c6c-n56sg:/# /usr/local/openresty/nginx/sbin/nginx -p /etc/nginx -c nginx.conf -e /var/log/nginx/error.log
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] bind() to 0.0.0.0:80 failed (98: Address already in use)
nginx: [emerg] still could not bind()

Further, to debug:

root@nginx-od-7dfb648c6c-n56sg:/# ps -aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root           1  0.0  0.0   2788  1012 ?        Ss   14:41   0:00 /usr/bin/sleep 1d
root           7  0.0  0.0   4624  3840 pts/0    Ss   14:44   0:00 bash
root          15  0.0  0.2  22648  9332 pts/0    T    14:44   0:00 nginx: master process /usr/local/openresty/nginx/sbin/nginx -p /etc/nginx -c nginx.conf -e /var/log/nginx/error.lo
www-data      16  0.0  0.0  23032  3620 pts/0    T    14:44   0:00 nginx: worker process
www-data      17  0.0  0.0  23032  3620 pts/0    T    14:44   0:00 nginx: worker process
root          19  0.0  0.0   7064  1568 pts/0    R+   14:54   0:00 ps -aux
root@nginx-od-7dfb648c6c-n56sg:/#
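Looking at the STAT column, the nginx master from my first attempt is in state T (stopped by CTRL-Z), so presumably it still holds port 80. I guess resuming or ending that stopped job would free the port before retrying:

fg          # resume the suspended nginx master in the foreground, or
kill -9 %1  # force-kill the stopped job so port 80 is released, then re-run nginx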