eclipse-che / che

Kubernetes based Cloud Development Environments for Enterprise Teams
http://eclipse.org/che
Eclipse Public License 2.0

che workspace can't start successfully #11107

Closed lf1029698952 closed 6 years ago

lf1029698952 commented 6 years ago

Description

Hello, I installed Eclipse Che v6.10.0 for the first time on my Kubernetes v1.7.11 cluster using Helm v2.9.1. When I start a workspace, the following error occurs:

image

image

Error logs:

2018-09-07 06:39:28,424[default.svc/...]  [ERROR] [.i.k.KubernetesInternalRuntime 736]  - Unrecoverable event occurred during workspace 'workspace6h8jvqf539ve9cmy' startup: FailedScheduling, PersistentVolumeClaim is not bound: "claim-che-workspace" (repeated 44 times), workspace6h8jvqf539ve9cmy.dockerimage-3893794495-zzhkz
2018-09-07 06:39:28,427[aceSharedPool-6]  [WARN ] [.i.k.KubernetesInternalRuntime 194]  - Failed to start Kubernetes runtime of workspace workspace6h8jvqf539ve9cmy. Cause: Unrecoverable event occurred: 'FailedScheduling', 'PersistentVolumeClaim is not bound: "claim-che-workspace" (repeated 44 times)', 'workspace6h8jvqf539ve9cmy.dockerimage-3893794495-zzhkz'
2018-09-07 06:39:28,856[aceSharedPool-6]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 378]   - Workspace 'che:wksp-za9f' with id 'workspace6h8jvqf539ve9cmy' start failed

My k8s cluster has two StorageClasses but no default StorageClass:

NAME       PROVISIONER         AGE
ceph-rbd   kubernetes.io/rbd   51d
cephfs     ceph.com/cephfs     51d
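
As an aside, a hedged sketch of one common way to get dynamically provisioned claims bound when there is no default StorageClass: mark one class as the default, so that PVCs created without an explicit storageClassName (which appears to be the case for the Pending claim here) are provisioned automatically, provided the DefaultStorageClass admission controller is enabled:

# Mark ceph-rbd as the default StorageClass (beta annotation for k8s 1.7;
# newer clusters use storageclass.kubernetes.io/is-default-class)
kubectl patch storageclass ceph-rbd -p '{"metadata":{"annotations":{"storageclass.beta.kubernetes.io/is-default-class":"true"}}}'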

and I created a PV:

NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM                                             STORAGECLASS   REASON    AGE
claim-che-workspace-pv                     40Gi       RWO            Delete           Available                                                     ceph-rbd                 15m

I set CHE_INFRA_KUBERNETES_PVC_STRATEGY: common. How can I get the PVC claim-che-workspace bound for the workspace? Thanks!

sleshchenko commented 6 years ago

Che Server expects the PVC to already exist; otherwise it creates a new one without specifying a PV name. So, unfortunately, there is no way to configure Che to use an existing PV.

I can propose that you configure CHE_INFRA_KUBERNETES_NAMESPACE so that all workspace objects are created there. Then you can manually create the needed PVC in the configured namespace. The default PVC name that Che Server expects is claim-che-workspace, but you can configure another one with CHE_INFRA_KUBERNETES_PVC_NAME.
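
For example, a minimal sketch of pre-creating such a PVC (the eclipse namespace and the ceph-rbd StorageClass are assumptions taken from this thread, not values Che mandates):

# Namespace that CHE_INFRA_KUBERNETES_NAMESPACE will point to
kubectl create namespace eclipse
# Pre-create the PVC under the name Che expects by default
cat <<EOF | kubectl apply -n eclipse -f -
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-che-workspace
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ceph-rbd
  resources:
    requests:
      storage: 10Gi
EOF

To bind against the manually created claim-che-workspace-pv instead of dynamic provisioning, the PV's storageClassName, access mode, and capacity would have to match this claim.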

lf1029698952 commented 6 years ago

With my env set to CHE_INFRA_KUBERNETES_NAMESPACE: "", it creates the PVC in a separate namespace per workspace:

workspace0osnn9v6v81bh0cg   claim-che-workspace                       Bound         claim-che-workspace-pv                     40Gi       RWO            ceph-rbd       23m
workspace6h8jvqf539ve9cmy   claim-che-workspace                       Pending                                                                                           8h
workspaceoi9qf7hstb4ov3ji   claim-che-workspace                       Bound         pvc-f5732d4b-b283-11e8-b3e0-fa163e3fc856   10Gi       RWO            ceph-rbd       18m

and it logs: can't get secret workspace6h8jvqf539ve9cmy/ceph-secret. Since I use the ceph-rbd StorageClass as the PVC provisioner, I need to set the env CHE_INFRA_KUBERNETES_NAMESPACE: "che":

eclipse     claim-che-workspace-lhj7wttg              Bound         pvc-6dc77976-b289-11e8-b9bb-fa163e771e20   10Gi       RWO            ceph-rbd       22s
eclipse     claim-che-workspace-vgw2plwx              Bound         pvc-6dca8a00-b289-11e8-b9bb-fa163e771e20   10Gi       RWO            ceph-rbd       22s
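
A minimal sketch of pinning that setting on the Che deployment (the ConfigMap name che, its namespace eclipse, and the app=che label are assumptions based on the rest of this thread; substitute whichever fixed namespace you choose):

# Point all workspace objects at a single, pre-existing namespace
kubectl -n eclipse patch configmap che --type merge \
  -p '{"data":{"CHE_INFRA_KUBERNETES_NAMESPACE":"eclipse"}}'
# Recreate the Che Server pod so it picks up the updated environment
kubectl -n eclipse delete pod -l app=che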

and the workspace pods are running:

NAME                                                     READY     STATUS    RESTARTS   AGE       IP               NODE
che-3790992006-z88cb                                     1/1       Running   0          6m        172.29.178.24    172.26.5.94
workspace6h8jvqf539ve9cmy.dockerimage-3403373004-kzbdz   1/1       Running   0          3m        172.29.13.162    172.20.40.142
workspacefgqpfaims2fwnbb7.dockerimage-33172705-64p14     1/1       Running   0          12m       172.29.175.209   172.26.4.136

Thanks

sleshchenko commented 6 years ago

@lf1029698952 As far as I understand, your issue is solved. Could you close the issue if it is?

lf1029698952 commented 6 years ago

Thanks, but I have another problem: my workspace can't start successfully. Logs:

2018-09-10 09:02:48,229[nio-8080-exec-5]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 317]   - Starting workspace 'che/wksp-m37n' with id 'workspacefgqpfaims2fwnbb7' by user 'che'
2018-09-10 09:11:27,590[aceSharedPool-3]  [WARN ] [.i.k.KubernetesInternalRuntime 194]  - Failed to start Kubernetes runtime of workspace workspacefgqpfaims2fwnbb7. Cause: null
2018-09-10 09:11:28,119[aceSharedPool-3]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 378]   - Workspace 'che:wksp-m37n' with id 'workspacefgqpfaims2fwnbb7' start failed
2018-09-10 09:11:28,120[aceSharedPool-3]  [ERROR] [o.e.c.a.w.s.WorkspaceRuntimes 388]   - null
org.eclipse.che.api.workspace.server.spi.InternalInfrastructureException: null
    at org.eclipse.che.workspace.infrastructure.kubernetes.StartSynchronizer.getStartFailureNow(StartSynchronizer.java:275)
    at org.eclipse.che.workspace.infrastructure.kubernetes.KubernetesInternalRuntime.internalStart(KubernetesInternalRuntime.java:189)
    at org.eclipse.che.api.workspace.server.spi.InternalRuntime.start(InternalRuntime.java:146)
    at org.eclipse.che.api.workspace.server.WorkspaceRuntimes$StartRuntimeTask.run(WorkspaceRuntimes.java:354)
    at org.eclipse.che.commons.lang.concurrent.CopyThreadLocalRunnable.run(CopyThreadLocalRunnable.java:38)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.TimeoutException: null
    at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
    at org.eclipse.che.workspace.infrastructure.kubernetes.KubernetesInternalRuntime.waitMachines(KubernetesInternalRuntime.java:254)
    at org.eclipse.che.workspace.infrastructure.kubernetes.KubernetesInternalRuntime.internalStart(KubernetesInternalRuntime.java:186)
    ... 7 common frames omitted
2018-09-10 09:11:28,126[aceSharedPool-3]  [WARN ] [o.e.c.a.w.s.WorkspaceManager 433]    - Cannot set error status of the workspace workspacefgqpfaims2fwnbb7. Error is: null

2018-09-10 10:20:32,884[nio-8080-exec-9]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-10 10:20:38,760[io-8080-exec-11]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,760[nio-8080-exec-7]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,760[io-8080-exec-11]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,760[nio-8080-exec-7]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,765[nio-8080-exec-2]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,766[nio-8080-exec-2]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,767[io-8080-exec-10]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,767[nio-8080-exec-6]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,767[nio-8080-exec-4]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,767[io-8080-exec-10]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,767[nio-8080-exec-6]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,767[nio-8080-exec-4]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,773[io-8080-exec-11]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,773[nio-8080-exec-9]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,773[nio-8080-exec-7]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,773[nio-8080-exec-3]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,774[io-8080-exec-11]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,774[nio-8080-exec-1]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,774[nio-8080-exec-7]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,774[nio-8080-exec-3]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,774[nio-8080-exec-2]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,774[nio-8080-exec-1]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,774[nio-8080-exec-2]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,774[nio-8080-exec-9]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,777[nio-8080-exec-5]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,777[nio-8080-exec-4]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-10 10:20:38,777[nio-8080-exec-5]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:38,778[nio-8080-exec-4]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-10 10:20:52,563[nio-8080-exec-4]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-10 10:21:02,564[nio-8080-exec-7]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-10 10:21:12,566[io-8080-exec-10]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-10 10:21:22,566[nio-8080-exec-6]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-10 10:21:32,570[nio-8080-exec-3]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-10 10:21:42,571[nio-8080-exec-2]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-10 10:21:52,568[nio-8080-exec-5]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session

image

sleshchenko commented 6 years ago

@lf1029698952 Could you provide the Kubernetes events from the che namespace?
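
For reference, a quick way to collect those events (the namespace should be the one the workspace objects are created in, eclipse in this install):

# Dump events from the workspace namespace, oldest first
kubectl get events -n eclipse --sort-by=.metadata.creationTimestamp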

lf1029698952 commented 6 years ago

LAST SEEN   FIRST SEEN   COUNT     NAME                                                                      KIND                    SUBOBJECT                                    TYPE      REASON                  SOURCE                        MESSAGE
44m         44m          1         che-3790992006-b60rm.155301b3dadd88ca                                     Pod                     spec.containers{che}                         Normal    Killing                 kubelet, 172.20.40.141        Killing container with id docker://che:Need to kill Pod
44m         44m          1         che-3790992006.155301b2851fd1f0                                           ReplicaSet                                                           Normal    SuccessfulDelete        replicaset-controller         Deleted pod: che-3790992006-b60rm
44m         44m          1         che-835852772-ftgj7.155301ba114bbddf                                      Pod                                                                  Normal    Scheduled               default-scheduler             Successfully assigned che-835852772-ftgj7 to 172.20.40.141
44m         44m          1         che-835852772-ftgj7.155301bc38393d97                                      Pod                                                                  Normal    SuccessfulMountVolume   kubelet, 172.20.40.141        MountVolume.SetUp succeeded for volume "che-token-vnthd" 
43m         43m          1         che-835852772-ftgj7.155301c03f0fca0b                                      Pod                                                                  Normal    SuccessfulMountVolume   kubelet, 172.20.40.141        MountVolume.SetUp succeeded for volume "pvc-71610c46-b1be-11e8-b3e0-fa163e3fc856" 
43m         43m          1         che-835852772-ftgj7.155301c079a58407                                      Pod                                                                  Warning   FailedSync              kubelet, 172.20.40.141        Error syncing pod
43m         43m          1         che-835852772-ftgj7.155301c0e065c758                                      Pod                                                                  Normal    SandboxChanged          kubelet, 172.20.40.141        Pod sandbox changed, it will be killed and re-created.
43m         43m          1         che-835852772-ftgj7.155301c41a809335                                      Pod                     spec.initContainers{fmp-volume-permission}   Normal    Pulling                 kubelet, 172.20.40.141        pulling image "busybox"
43m         43m          1         che-835852772-ftgj7.155301c5548cafd3                                      Pod                     spec.initContainers{fmp-volume-permission}   Normal    Pulled                  kubelet, 172.20.40.141        Successfully pulled image "busybox"
43m         43m          1         che-835852772-ftgj7.155301c5608dbed2                                      Pod                     spec.initContainers{fmp-volume-permission}   Normal    Created                 kubelet, 172.20.40.141        Created container
43m         43m          1         che-835852772-ftgj7.155301c640994fcd                                      Pod                     spec.containers{che}                         Normal    Pulling                 kubelet, 172.20.40.141        pulling image "eclipse/che-server:latest"
43m         43m          1         che-835852772-ftgj7.155301c82ac2cee8                                      Pod                     spec.containers{che}                         Normal    Pulled                  kubelet, 172.20.40.141        Successfully pulled image "eclipse/che-server:latest"
43m         43m          1         che-835852772-ftgj7.155301c87a100f71                                      Pod                     spec.containers{che}                         Normal    Created                 kubelet, 172.20.40.141        Created container
43m         43m          1         che-835852772-ftgj7.155301c89d19aa3d                                      Pod                     spec.containers{che}                         Normal    Started                 kubelet, 172.20.40.141        Started container
44m         44m          1         che-835852772.155301ba10fffa60                                            ReplicaSet                                                           Normal    SuccessfulCreate        replicaset-controller         Created pod: che-835852772-ftgj7
44m         44m          1         che.155301b284c7c584                                                      Deployment                                                           Normal    ScalingReplicaSet       deployment-controller         Scaled down replica set che-3790992006 to 0
44m         44m          1         che.155301ba10643a4f                                                      Deployment                                                           Normal    ScalingReplicaSet       deployment-controller         Scaled up replica set che-835852772 to 1
35m         35m          1         claim-che-workspace-vl6fuqlo.155302328f8c2db3                             PersistentVolumeClaim                                                Normal    ProvisioningSucceeded   persistentvolume-controller   Successfully provisioned volume pvc-f0fc9af2-b4e1-11e8-b9bb-fa163e771e20 using kubernetes.io/rbd
35m         35m          1         claim-che-workspace-ybrs372r.155302328e1c7e60                             PersistentVolumeClaim                                                Normal    ProvisioningSucceeded   persistentvolume-controller   Successfully provisioned volume pvc-f0fb120f-b4e1-11e8-b9bb-fa163e771e20 using kubernetes.io/rbd
4m          35m          5         ingress0e7czu0s.155302328bd31e9a                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingress0e7czu0s
35m         35m          1         ingress0e7czu0s.1553023a2e10504f                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingress0e7czu0s
4m          27m          5         ingress0e7czu0s.155302aa2ae14fd0                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingress0e7czu0s
37m         37m          1         ingress2686cqqe.1553021a79d58055                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingress2686cqqe
37m         37m          1         ingress2686cqqe.1553021e3effb20f                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingress2686cqqe
29m         29m          1         ingress2686cqqe.1553028e630956ca                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingress2686cqqe
37m         37m          1         ingress5enmenj8.1553021a780be6f7                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingress5enmenj8
37m         37m          1         ingress5enmenj8.1553021e3f117eed                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingress5enmenj8
29m         29m          1         ingress5enmenj8.1553028e63bcd6f7                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingress5enmenj8
4m          35m          5         ingress66h0ntm0.155302328962a7b3                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingress66h0ntm0
35m         35m          1         ingress66h0ntm0.1553023a2e11c35c                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingress66h0ntm0
4m          27m          5         ingress66h0ntm0.155302aa2b7bc23e                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingress66h0ntm0
37m         37m          1         ingress6d5w7x77.1553021a787d543a                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingress6d5w7x77
37m         37m          1         ingress6d5w7x77.1553021e3eed6aaf                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingress6d5w7x77
29m         29m          1         ingress6d5w7x77.1553028e647b13ba                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingress6d5w7x77
37m         37m          1         ingressa5pak1pr.1553021a780e208d                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingressa5pak1pr
37m         37m          1         ingressa5pak1pr.1553021e3f0dd970                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingressa5pak1pr
29m         29m          1         ingressa5pak1pr.1553028e650c907c                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingressa5pak1pr
35m         35m          1         ingressgzy0g5qu.1553023289fe79ba                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingressgzy0g5qu
35m         35m          1         ingressgzy0g5qu.1553023a2e088150                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingressgzy0g5qu
27m         27m          1         ingressgzy0g5qu.155302aa2c186927                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingressgzy0g5qu
4m          35m          5         ingressnhtj0x0m.155302328b2fec5e                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingressnhtj0x0m
35m         35m          1         ingressnhtj0x0m.1553023a2e0df79b                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingressnhtj0x0m
4m          27m          5         ingressnhtj0x0m.155302aa2cc2312f                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingressnhtj0x0m
35m         35m          1         ingressrgnyexq3.1553023285f373a7                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingressrgnyexq3
35m         35m          1         ingressrgnyexq3.1553023a2d8db459                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingressrgnyexq3
27m         27m          1         ingressrgnyexq3.155302aa2d51bc99                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingressrgnyexq3
35m         35m          1         ingresss3iyhx0m.1553023287cde4d4                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingresss3iyhx0m
35m         35m          1         ingresss3iyhx0m.1553023a2e0b06b1                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingresss3iyhx0m
27m         27m          1         ingresss3iyhx0m.155302aa2de04d79                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingresss3iyhx0m
35m         35m          1         ingresst0jtz18x.155302328a90eb7d                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingresst0jtz18x
35m         35m          1         ingresst0jtz18x.1553023a2e0c9491                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingresst0jtz18x
27m         27m          1         ingresst0jtz18x.155302aa2e76a9ee                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingresst0jtz18x
37m         37m          1         ingresstihcwba7.1553021a780fde9e                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingresstihcwba7
37m         37m          1         ingresstihcwba7.1553021e3efdeb7e                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingresstihcwba7
29m         29m          1         ingresstihcwba7.1553028e65940ed9                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingresstihcwba7
37m         37m          1         ingressuv5x75ys.1553021a753f6cca                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingressuv5x75ys
37m         37m          1         ingressuv5x75ys.1553021e3f141a86                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingressuv5x75ys
29m         29m          1         ingressuv5x75ys.1553028e66327fd1                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingressuv5x75ys
37m         37m          1         ingressxz4rxvz1.1553021a79456dda                                          Ingress                                                              Normal    CREATE                  nginx-ingress-controller      Ingress eclipse/ingressxz4rxvz1
37m         37m          1         ingressxz4rxvz1.1553021e3f0f6679                                          Ingress                                                              Normal    UPDATE                  nginx-ingress-controller      Ingress eclipse/ingressxz4rxvz1
29m         29m          1         ingressxz4rxvz1.1553028e66c9dfbf                                          Ingress                                                              Normal    DELETE                  nginx-ingress-controller      Ingress eclipse/ingressxz4rxvz1
35m         35m          1         workspacej14tq05vdvio8t4f.dockerimage-1944555896-hgkb2.1553023a6059c832   Pod                                                                  Normal    Scheduled               default-scheduler             Successfully assigned workspacej14tq05vdvio8t4f.dockerimage-1944555896-hgkb2 to 172.26.5.94
35m         35m          1         workspacej14tq05vdvio8t4f.dockerimage-1944555896-hgkb2.1553023a73682dc9   Pod                                                                  Normal    SuccessfulMountVolume   kubelet, 172.26.5.94          MountVolume.SetUp succeeded for volume "default-token-j290p" 
35m         35m          1         workspacej14tq05vdvio8t4f.dockerimage-1944555896-hgkb2.1553023baec114d4   Pod                                                                  Normal    SuccessfulMountVolume   kubelet, 172.26.5.94          MountVolume.SetUp succeeded for volume "pvc-f0fc9af2-b4e1-11e8-b9bb-fa163e771e20" 
35m         35m          2         workspacej14tq05vdvio8t4f.dockerimage-1944555896-hgkb2.1553023baee8b900   Pod                                                                  Normal    SuccessfulMountVolume   kubelet, 172.26.5.94          MountVolume.SetUp succeeded for volume "pvc-f0fb120f-b4e1-11e8-b9bb-fa163e771e20" 
35m         35m          1         workspacej14tq05vdvio8t4f.dockerimage-1944555896-hgkb2.1553023c0ac33eb0   Pod                     spec.containers{container}                   Normal    Pulling                 kubelet, 172.26.5.94          pulling image "eclipse/ubuntu_jdk8"
33m         33m          1         workspacej14tq05vdvio8t4f.dockerimage-1944555896-hgkb2.155302517d321d7d   Pod                     spec.containers{container}                   Normal    Pulled                  kubelet, 172.26.5.94          Successfully pulled image "eclipse/ubuntu_jdk8"
33m         33m          1         workspacej14tq05vdvio8t4f.dockerimage-1944555896-hgkb2.1553025189b9016a   Pod                     spec.containers{container}                   Normal    Created                 kubelet, 172.26.5.94          Created container
33m         33m          1         workspacej14tq05vdvio8t4f.dockerimage-1944555896-hgkb2.155302519b459e8a   Pod                     spec.containers{container}                   Normal    Started                 kubelet, 172.26.5.94          Started container
27m         27m          2         workspacej14tq05vdvio8t4f.dockerimage-1944555896-hgkb2.155302aa4a818989   Pod                     spec.containers{container}                   Normal    Killing                 kubelet, 172.26.5.94          Killing container with id docker://container:Need to kill Pod
35m         35m          1         workspacej14tq05vdvio8t4f.dockerimage-1944555896.1553023a5fbfd2eb         ReplicaSet                                                           Normal    SuccessfulCreate        replicaset-controller         Created pod: workspacej14tq05vdvio8t4f.dockerimage-1944555896-hgkb2
27m         27m          1         workspacej14tq05vdvio8t4f.dockerimage-1944555896.155302aa3a0ce135         ReplicaSet                                                           Normal    SuccessfulDelete        replicaset-controller         Deleted pod: workspacej14tq05vdvio8t4f.dockerimage-1944555896-hgkb2
35m         35m          1         workspacej14tq05vdvio8t4f.dockerimage.1553023a5b580d94                    Deployment                                                           Normal    ScalingReplicaSet       deployment-controller         Scaled up replica set workspacej14tq05vdvio8t4f.dockerimage-1944555896 to 1
27m         27m          1         workspacej14tq05vdvio8t4f.dockerimage.155302aa39789c37                    Deployment                                                           Normal    ScalingReplicaSet       deployment-controller         Scaled down replica set workspacej14tq05vdvio8t4f.dockerimage-1944555896 to 0
37m         37m          1         workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197-fkfvr.1553021e8e08aad1   Pod                                                                  Normal    Scheduled               default-scheduler             Successfully assigned workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197-fkfvr to 172.20.40.104
36m         36m          1         workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197-fkfvr.155302207f826fb5   Pod                                                                  Normal    SuccessfulMountVolume   kubelet, 172.20.40.104        MountVolume.SetUp succeeded for volume "default-token-j290p" 
36m         36m          1         workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197-fkfvr.15530223809ef30a   Pod                                                                  Normal    SuccessfulMountVolume   kubelet, 172.20.40.104        MountVolume.SetUp succeeded for volume "pvc-331b7baa-b28a-11e8-b9bb-fa163e771e20" 
36m         36m          1         workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197-fkfvr.15530223f5c619d5   Pod                                                                  Normal    SuccessfulMountVolume   kubelet, 172.20.40.104        MountVolume.SetUp succeeded for volume "pvc-331cc24f-b28a-11e8-b9bb-fa163e771e20" 
36m         36m          1         workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197-fkfvr.155302262f6bd990   Pod                     spec.containers{container}                   Normal    Pulling                 kubelet, 172.20.40.104        pulling image "eclipse/ubuntu_jdk8"
35m         35m          1         workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197-fkfvr.155302355193d0f2   Pod                     spec.containers{container}                   Normal    Pulled                  kubelet, 172.20.40.104        Successfully pulled image "eclipse/ubuntu_jdk8"
35m         35m          1         workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197-fkfvr.155302357334cb26   Pod                     spec.containers{container}                   Normal    Created                 kubelet, 172.20.40.104        Created container
29m         29m          1         workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197-fkfvr.1553028f1485c323   Pod                     spec.containers{container}                   Normal    Killing                 kubelet, 172.20.40.104        Killing container with id docker://container:Need to kill Pod
37m         37m          1         workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197.1553021e8db42e9b         ReplicaSet                                                           Normal    SuccessfulCreate        replicaset-controller         Created pod: workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197-fkfvr
29m         29m          1         workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197.1553028e7b01f6fb         ReplicaSet                                                           Normal    SuccessfulDelete        replicaset-controller         Deleted pod: workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197-fkfvr
37m         37m          1         workspaceucjqlxqqi9v5kxg3.dockerimage.1553021e89afcd8d                    Deployment                                                           Normal    ScalingReplicaSet       deployment-controller         Scaled up replica set workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197 to 1
29m         29m          1         workspaceucjqlxqqi9v5kxg3.dockerimage.1553028e7a8dbd67                    Deployment                                                           Normal    ScalingReplicaSet       deployment-controller         Scaled down replica set workspaceucjqlxqqi9v5kxg3.dockerimage-3972290197 to 0

sleshchenko commented 6 years ago

@lf1029698952 It is still unclear why the error happens =(

@eivantsov Could you please take a look? Maybe you have an idea of what is wrong.

Debugging could help, or the Che Server logs may contain enough information once debug logging is added for each step of the K8s/OS runtime start: https://github.com/eclipse/che/issues/10408
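
Until then, the same ConfigMap-patch-and-restart pattern sketched earlier can raise the server log level (CHE_LOG_LEVEL is the key shown in the ConfigMap later in this thread; DEBUG is an illustrative value):

# Switch the Che Server to debug logging, then restart it to apply
kubectl -n eclipse patch configmap che --type merge -p '{"data":{"CHE_LOG_LEVEL":"DEBUG"}}'
kubectl -n eclipse delete pod -l app=che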

lf1029698952 commented 6 years ago

@sleshchenko Thanks. When I set multiuser: true, three pods are running:

NAME                        READY     STATUS    RESTARTS   AGE       IP               NODE
che-1481075420-kpg16        1/1       Running   0          18m       172.29.177.224   172.20.40.103
keycloak-3433983536-fj08s   1/1       Running   0          2m        172.29.202.237   172.20.40.104
postgres-1359352287-mkc7n   1/1       Running   0          26m       172.29.202.94    172.20.40.104

The dashboard shows: "Authorization token is missed, Click here to reload page."

image

The che and postgres pod logs look normal; keycloak logs:

09:07:50,127 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 53) WFLYCLINF0002: Started offlineSessions cache from keycloak container
09:07:50,127 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 51) WFLYCLINF0002: Started sessions cache from keycloak container
09:07:50,127 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 54) WFLYCLINF0002: Started clientSessions cache from keycloak container
09:07:50,128 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 49) WFLYCLINF0002: Started realms cache from keycloak container
09:07:50,128 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 58) WFLYCLINF0002: Started client-mappings cache from ejb container
09:07:50,242 ERROR [io.undertow] (MSC service thread 1-4) UT005024: Could not register resource change listener for caching resource manager, automatic invalidation of cached resource will not work: java.lang.RuntimeException: java.io.IOException: User limit of inotify instances reached or too many open files
    at org.xnio.nio.WatchServiceFileSystemWatcher.<init>(WatchServiceFileSystemWatcher.java:75)
    at org.xnio.nio.NioXnio.createFileSystemWatcher(NioXnio.java:236)
    at io.undertow.server.handlers.resource.PathResourceManager.registerResourceChangeListener(PathResourceManager.java:235)
    at org.wildfly.extension.undertow.deployment.ServletResourceManager.registerResourceChangeListener(ServletResourceManager.java:103)
    at io.undertow.server.handlers.resource.CachingResourceManager.<init>(CachingResourceManager.java:64)
    at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService.createServletConfig(UndertowDeploymentInfoService.java:580)
    at org.wildfly.extension.undertow.deployment.UndertowDeploymentInfoService.start(UndertowDeploymentInfoService.java:273)
    at org.jboss.msc.service.ServiceControllerImpl$StartTask.startService(ServiceControllerImpl.java:2032)
    at org.jboss.msc.service.ServiceControllerImpl$StartTask.run(ServiceControllerImpl.java:1955)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: User limit of inotify instances reached or too many open files
    at sun.nio.fs.LinuxWatchService.<init>(LinuxWatchService.java:64)
    at sun.nio.fs.LinuxFileSystem.newWatchService(LinuxFileSystem.java:47)
    at org.xnio.nio.WatchServiceFileSystemWatcher.<init>(WatchServiceFileSystemWatcher.java:73)
    ... 11 more

09:07:51,049 INFO  [org.keycloak.services] (ServerService Thread Pool -- 56) KC-SERVICES0001: Loading config from standalone.xml or domain.xml
09:07:51,449 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 56) WFLYCLINF0002: Started realmRevisions cache from keycloak container
09:07:51,453 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 56) WFLYCLINF0002: Started userRevisions cache from keycloak container
09:07:51,469 INFO  [org.jboss.as.clustering.infinispan] (ServerService Thread Pool -- 56) WFLYCLINF0002: Started authorizationRevisions cache from keycloak container
09:07:51,470 INFO  [org.keycloak.connections.infinispan.DefaultInfinispanConnectionProviderFactory] (ServerService Thread Pool -- 56) Node name: keycloak-3433983536-fj08s, Site name: null
09:07:57,513 INFO  [org.hibernate.jpa.internal.util.LogHelper] (ServerService Thread Pool -- 56) HHH000204: Processing PersistenceUnitInfo [
    name: keycloak-default
    ...]
09:07:57,568 INFO  [org.hibernate.Version] (ServerService Thread Pool -- 56) HHH000412: Hibernate Core {5.1.10.Final}
09:07:57,570 INFO  [org.hibernate.cfg.Environment] (ServerService Thread Pool -- 56) HHH000206: hibernate.properties not found
09:07:57,572 INFO  [org.hibernate.cfg.Environment] (ServerService Thread Pool -- 56) HHH000021: Bytecode provider name : javassist
09:07:57,603 INFO  [org.hibernate.annotations.common.Version] (ServerService Thread Pool -- 56) HCANN000001: Hibernate Commons Annotations {5.0.1.Final}
09:07:57,740 INFO  [org.hibernate.dialect.Dialect] (ServerService Thread Pool -- 56) HHH000400: Using dialect: org.hibernate.dialect.H2Dialect
09:07:57,746 WARN  [org.hibernate.dialect.H2Dialect] (ServerService Thread Pool -- 56) HHH000431: Unable to determine H2 database version, certain features may not work
09:07:57,792 INFO  [org.hibernate.envers.boot.internal.EnversServiceImpl] (ServerService Thread Pool -- 56) Envers integration enabled? : true
09:07:58,384 INFO  [org.hibernate.validator.internal.util.Version] (ServerService Thread Pool -- 56) HV000001: Hibernate Validator 5.3.5.Final
09:07:59,201 INFO  [org.hibernate.hql.internal.QueryTranslatorFactoryInitiator] (ServerService Thread Pool -- 56) HHH000397: Using ASTQueryTranslatorFactory
09:08:00,284 INFO  [org.keycloak.exportimport.dir.DirImportProvider] (ServerService Thread Pool -- 56) Importing from directory /scripts
09:08:00,553 INFO  [org.keycloak.services] (ServerService Thread Pool -- 56) KC-SERVICES0030: Full model import requested. Strategy: IGNORE_EXISTING
09:08:00,713 INFO  [org.keycloak.exportimport.util.ImportUtils] (ServerService Thread Pool -- 56) Realm 'che' already exists. Import skipped
09:08:00,736 INFO  [org.keycloak.services] (ServerService Thread Pool -- 56) KC-SERVICES0032: Import finished successfully
09:08:00,736 INFO  [org.keycloak.services] (ServerService Thread Pool -- 56) KC-SERVICES0006: Importing users from '/opt/jboss/keycloak/standalone/configuration/keycloak-add-user.json'
09:08:00,772 WARN  [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (ServerService Thread Pool -- 56) SQL Error: 23505, SQLState: 23505
09:08:00,773 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (ServerService Thread Pool -- 56) Unique index or primary key violation: "UK_RU8TT6T700S9V50BU18WS5HA6_INDEX_B ON PUBLIC.USER_ENTITY(REALM_ID, USERNAME) VALUES ('master', 'admin', 4)"; SQL statement:
insert into USER_ENTITY (CREATED_TIMESTAMP, EMAIL, EMAIL_CONSTRAINT, EMAIL_VERIFIED, ENABLED, FEDERATION_LINK, FIRST_NAME, LAST_NAME, NOT_BEFORE, REALM_ID, SERVICE_ACCOUNT_CLIENT_LINK, USERNAME, ID) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?) [23505-193]
09:08:00,775 INFO  [org.hibernate.engine.jdbc.batch.internal.AbstractBatchImpl] (ServerService Thread Pool -- 56) HHH000010: On release of batch it still contained JDBC statements
09:08:00,780 ERROR [org.keycloak.services] (ServerService Thread Pool -- 56) KC-SERVICES0010: Failed to add user 'admin' to realm 'master': user with username exists
09:08:00,837 INFO  [org.jboss.resteasy.resteasy_jaxrs.i18n] (ServerService Thread Pool -- 56) RESTEASY002225: Deploying javax.ws.rs.core.Application: class org.keycloak.services.resources.KeycloakApplication

Is something wrong with Keycloak?

lf1029698952 commented 6 years ago

The websocket connection failed because of: "Keycloak initialization failed with error: Error loading script. HTTP Authentication failed; no valid credentials available". How can I fix it? @sleshchenko @eivantsov Thanks!

I got it; it was caused by the Keycloak auth URL problem, #8846.

lf1029698952 commented 6 years ago

The workspace still can't start. How should we solve this problem?

Failed to start Kubernetes runtime of workspace workspace89l1ds8lvamej0ut. Cause: null

My Che env ConfigMap:

apiVersion: v1
data:
  CHE_API: http://che-eclipse.zhubajie.la/api
  CHE_DEBUG_SERVER: "true"
  CHE_HOST: che-eclipse.zhubajie.la
  CHE_INFRA_KUBERNETES_BOOTSTRAPPER_BINARY__URL: http://che-eclipse.zhubajie.la/agent-binaries/linux_amd64/bootstrapper/bootstrapper
  CHE_INFRA_KUBERNETES_INGRESS_ANNOTATIONS__JSON: '{"kubernetes.io/ingress.class":
    "nginx", "nginx.ingress.kubernetes.io/rewrite-target": "/","nginx.ingress.kubernetes.io/ssl-redirect":
    "false","nginx.ingress.kubernetes.io/proxy-connect-timeout": "3600","nginx.ingress.kubernetes.io/proxy-read-timeout":
    "3600"}'
  CHE_INFRA_KUBERNETES_INGRESS_DOMAIN: zhubajie.la
  CHE_INFRA_KUBERNETES_MACHINE__START__TIMEOUT__MIN: "5"
  CHE_INFRA_KUBERNETES_MASTER__URL: ""
  CHE_INFRA_KUBERNETES_NAMESPACE: eclipse
  CHE_INFRA_KUBERNETES_POD_SECURITY__CONTEXT_FS__GROUP: "0"
  CHE_INFRA_KUBERNETES_POD_SECURITY__CONTEXT_RUN__AS__USER: "0"
  CHE_INFRA_KUBERNETES_PVC_PRECREATE__SUBPATHS: "false"
  CHE_INFRA_KUBERNETES_PVC_QUANTITY: 10Gi
  CHE_INFRA_KUBERNETES_PVC_STRATEGY: unique
  CHE_INFRA_KUBERNETES_SERVER__STRATEGY: multi-host
  CHE_INFRA_KUBERNETES_TLS__ENABLED: "false"
  CHE_INFRA_KUBERNETES_TLS__SECRET: ""
  CHE_INFRA_KUBERNETES_TRUST__CERTS: "false"
  CHE_INFRASTRUCTURE_ACTIVE: kubernetes
  CHE_KEYCLOAK_AUTH__SERVER__URL: http://keycloak-eclipse.zhubajie.la/auth
  CHE_KEYCLOAK_CLIENT__ID: che-public
  CHE_KEYCLOAK_REALM: che
  CHE_LIMITS_WORKSPACE_IDLE_TIMEOUT: "-1"
  CHE_LOCAL_CONF_DIR: /etc/conf
  CHE_LOG_LEVEL: INFO
  CHE_LOGS_APPENDERS_IMPL: plaintext
  CHE_LOGS_DIR: /data/logs
  CHE_MULTIUSER: "true"
  CHE_OAUTH_GITHUB_CLIENTID: ""
  CHE_OAUTH_GITHUB_CLIENTSECRET: ""
  CHE_PORT: "8080"
  CHE_PREDEFINED_STACKS_RELOAD__ON__START: "false"
  CHE_WEBSOCKET_ENDPOINT: ws://che-eclipse.zhubajie.la/api/websocket
  CHE_WORKSPACE_AUTO_START: "false"
  CHE_WORKSPACE_HTTP__PROXY: ""
  CHE_WORKSPACE_HTTPS__PROXY: ""
  CHE_WORKSPACE_NO__PROXY: ""
  JAVA_OPTS: '-XX:MaxRAMFraction=2 -XX:+UseParallelGC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20
    -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -XX:+UnlockExperimentalVMOptions
    -XX:+UseCGroupMemoryLimitForHeap -Dsun.zip.disableMemoryMapping=true -Xms20m '
kind: ConfigMap
metadata:
  creationTimestamp: 2018-09-11T08:38:46Z
  labels:
    app: che
  name: che
  namespace: eclipse

sleshchenko commented 6 years ago

The workspace still can't start. How should we solve this problem? Failed to start Kubernetes runtime of workspace workspace89l1ds8lvamej0ut. Cause: null

Is that all there is about the error in the Che Server logs?

ghost commented 6 years ago

Keycloak definitely does not look healthy

Caused by: java.io.IOException: User limit of inotify instances reached or too many open files

Did you increase this limit on the host?
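
For reference, a sketch of inspecting and raising that limit on the node where the Keycloak pod runs (512 is an illustrative value):

# Check the per-user inotify instance limit on the node
sysctl fs.inotify.max_user_instances
# Raise it for the running kernel; persist it via /etc/sysctl.d to survive reboots
sudo sysctl -w fs.inotify.max_user_instances=512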

Also, can you try to deploy a single user flavor of Che?

lf1029698952 commented 6 years ago

Yes. My Kubernetes cluster is in a private region without an ELB, so I expose port 9000 via NodePort, and my domain names are resolved via the hosts file instead of DNS. So I suspect this is caused by the port number or DNS resolution. My Che server logs:

2018-09-13 09:44:17,570[nio-8080-exec-9]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-13 09:44:27,569[nio-8080-exec-1]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-13 09:44:37,568[nio-8080-exec-5]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-13 09:44:47,569[io-8080-exec-10]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-13 09:44:55,182[nio-8080-exec-1]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-13 09:44:55,199[nio-8080-exec-6]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-13 09:44:57,230[nio-8080-exec-3]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 330]   - Starting workspace 'admin/java-test' with id 'workspacel0iewbh6cq4g847f' by user 'admin'
2018-09-13 09:44:58,105[nio-8080-exec-9]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 330]   - Starting workspace 'admin/wss' with id 'workspace89l1ds8lvamej0ut' by user 'admin'
2018-09-13 09:45:22,265[default.svc/...]  [ERROR] [.w.i.k.n.KubernetesDeployments 469]  - Failed to parse last timestamp of the event: PodEvent{podName='workspacel0iewbh6cq4g847f.dockerimage-3790641075-bgcpc', containerName='null', reason='SuccessfulMountVolume', message='MountVolume.SetUp succeeded for volume "default-token-j290p" ', creationTimestamp='2018-09-13T09:45:22Z', lastTimestamp='2018-09-13T09:45:22Z'}
2018-09-13 09:45:22,265[default.svc/...]  [ERROR] [.w.i.k.n.KubernetesDeployments 469]  - Failed to parse last timestamp of the event: PodEvent{podName='workspacel0iewbh6cq4g847f.dockerimage-3790641075-bgcpc', containerName='null', reason='SuccessfulMountVolume', message='MountVolume.SetUp succeeded for volume "default-token-j290p" ', creationTimestamp='2018-09-13T09:45:22Z', lastTimestamp='2018-09-13T09:45:22Z'}
2018-09-13 09:45:22,603[aceSharedPool-0]  [WARN ] [.i.k.KubernetesInternalRuntime 194]  - Failed to start Kubernetes runtime of workspace workspaceje1hta9b3fkojzr5. Cause: null
2018-09-13 09:45:23,024[aceSharedPool-0]  [WARN ] [i.f.k.c.i.VersionUsageUtils 55]      - The client is using resource type 'replicasets' with unstable version 'v1beta1'
2018-09-13 09:45:23,453[aceSharedPool-0]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 391]   - Workspace 'admin:java-workspace' with id 'workspaceje1hta9b3fkojzr5' start failed
2018-09-13 09:45:23,458[aceSharedPool-0]  [ERROR] [o.e.c.a.w.s.WorkspaceRuntimes 401]   - null
org.eclipse.che.api.workspace.server.spi.InternalInfrastructureException: null
    at org.eclipse.che.workspace.infrastructure.kubernetes.StartSynchronizer.getStartFailureNow(StartSynchronizer.java:275)
    at org.eclipse.che.workspace.infrastructure.kubernetes.KubernetesInternalRuntime.internalStart(KubernetesInternalRuntime.java:189)
    at org.eclipse.che.api.workspace.server.spi.InternalRuntime.start(InternalRuntime.java:146)
    at org.eclipse.che.api.workspace.server.WorkspaceRuntimes$StartRuntimeTask.run(WorkspaceRuntimes.java:367)
    at org.eclipse.che.commons.lang.concurrent.CopyThreadLocalRunnable.run(CopyThreadLocalRunnable.java:38)
    at java.util.concurrent.CompletableFuture$AsyncRun.run(CompletableFuture.java:1626)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.TimeoutException: null
    at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
    at org.eclipse.che.workspace.infrastructure.kubernetes.KubernetesInternalRuntime.waitMachines(KubernetesInternalRuntime.java:254)
    at org.eclipse.che.workspace.infrastructure.kubernetes.KubernetesInternalRuntime.internalStart(KubernetesInternalRuntime.java:186)
    ... 7 common frames omitted
2018-09-13 09:45:24,718[default.svc/...]  [ERROR] [.w.i.k.n.KubernetesDeployments 469]  - Failed to parse last timestamp of the event: PodEvent{podName='workspacel0iewbh6cq4g847f.dockerimage-3790641075-bgcpc', containerName='container', reason='Pulling', message='pulling image "hub.zhubajie.la/eclipse/ubuntu_jdk8"', creationTimestamp='2018-09-13T09:45:24Z', lastTimestamp='2018-09-13T09:45:24Z'}
2018-09-13 09:45:24,718[default.svc/...]  [ERROR] [.w.i.k.n.KubernetesDeployments 469]  - Failed to parse last timestamp of the event: PodEvent{podName='workspacel0iewbh6cq4g847f.dockerimage-3790641075-bgcpc', containerName='container', reason='Pulling', message='pulling image "hub.zhubajie.la/eclipse/ubuntu_jdk8"', creationTimestamp='2018-09-13T09:45:24Z', lastTimestamp='2018-09-13T09:45:24Z'}
2018-09-13 09:45:25,118[nio-8080-exec-9]  [INFO ] [o.e.c.a.w.s.WorkspaceRuntimes 330]   - Starting workspace 'admin/java-workspace' with id 'workspaceje1hta9b3fkojzr5' by user 'admin'
2018-09-13 09:45:25,160[aceSharedPool-0]  [WARN ] [o.e.c.a.w.s.WorkspaceManager 433]    - Cannot set error status of the workspace workspaceje1hta9b3fkojzr5. Error is: null
2018-09-13 09:45:26,477[default.svc/...]  [ERROR] [.w.i.k.n.KubernetesDeployments 469]  - Failed to parse last timestamp of the event: PodEvent{podName='workspacel0iewbh6cq4g847f.dockerimage-3790641075-bgcpc', containerName='container', reason='Pulled', message='Successfully pulled image "hub.zhubajie.la/eclipse/ubuntu_jdk8"', creationTimestamp='2018-09-13T09:45:26Z', lastTimestamp='2018-09-13T09:45:26Z'}
2018-09-13 09:45:26,477[default.svc/...]  [ERROR] [.w.i.k.n.KubernetesDeployments 469]  - Failed to parse last timestamp of the event: PodEvent{podName='workspacel0iewbh6cq4g847f.dockerimage-3790641075-bgcpc', containerName='container', reason='Pulled', message='Successfully pulled image "hub.zhubajie.la/eclipse/ubuntu_jdk8"', creationTimestamp='2018-09-13T09:45:26Z', lastTimestamp='2018-09-13T09:45:26Z'}
2018-09-13 09:45:49,299[nio-8080-exec-8]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-13 09:45:59,662[nio-8080-exec-4]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-13 09:46:10,058[nio-8080-exec-3]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-13 09:46:19,299[nio-8080-exec-9]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-13 09:46:28,560[io-8080-exec-10]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-13 09:46:38,559[nio-8080-exec-1]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-13 09:46:48,558[nio-8080-exec-6]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-13 09:46:58,560[nio-8080-exec-2]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-13 09:47:03,298[nio-8080-exec-1]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-13 09:47:03,302[nio-8080-exec-1]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-13 09:47:03,303[nio-8080-exec-7]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 135]  - Web socket session error
2018-09-13 09:47:03,304[nio-8080-exec-7]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-13 09:47:09,304[nio-8080-exec-6]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-13 09:47:09,306[nio-8080-exec-5]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-13 09:47:09,306[io-8080-exec-10]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-13 09:47:09,306[nio-8080-exec-1]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-13 09:47:09,308[nio-8080-exec-7]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-13 09:47:09,311[nio-8080-exec-4]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 119]  - Closing unidentified session
2018-09-13 09:49:21,354[default.svc/...]  [ERROR] [.w.i.k.n.KubernetesDeployments 469]  - Failed to parse last timestamp of the event: PodEvent{podName='workspace89l1ds8lvamej0ut.dockerimage-2945920964-n9xc9', containerName='container', reason='Pulling', message='pulling image "hub.zhubajie.la/eclipse/ubuntu_jdk8"', creationTimestamp='2018-09-13T09:49:21Z', lastTimestamp='2018-09-13T09:46:01Z'}
2018-09-13 09:49:21,354[default.svc/...]  [ERROR] [.w.i.k.n.KubernetesDeployments 469]  - Failed to parse last timestamp of the event: PodEvent{podName='workspace89l1ds8lvamej0ut.dockerimage-2945920964-n9xc9', containerName='container', reason='Pulling', message='pulling image "hub.zhubajie.la/eclipse/ubuntu_jdk8"', creationTimestamp='2018-09-13T09:49:21Z', lastTimestamp='2018-09-13T09:46:01Z'}
2018-09-13 09:49:59,554[default.svc/...]  [ERROR] [.w.i.k.n.KubernetesDeployments 469]  - Failed to parse last timestamp of the event: PodEvent{podName='workspace89l1ds8lvamej0ut.dockerimage-2945920964-n9xc9', containerName='container', reason='Pulled', message='Successfully pulled image "hub.zhubajie.la/eclipse/ubuntu_jdk8"', creationTimestamp='2018-09-13T09:49:59Z', lastTimestamp='2018-09-13T09:46:39Z'}
sleshchenko commented 6 years ago

The error messages are strange. Maybe the workspace failed because of an error while parsing the timestamp, but the event itself looks fine. @ibuziuk Maybe you have an idea what is wrong with it? BTW, I've added logging of the original exception along with the event, so after my next PR is merged it will be much clearer what is wrong with the timestamp.

lf1029698952 commented 6 years ago

image

and Che logs show many session errors:

2018-09-19 07:30:33,534[nio-8080-exec-9]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-19 07:30:43,530[nio-8080-exec-1]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-19 07:30:53,537[nio-8080-exec-7]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-19 07:31:03,531[nio-8080-exec-6]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-19 07:31:13,529[nio-8080-exec-4]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-19 07:31:23,532[nio-8080-exec-8]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-19 07:31:33,531[nio-8080-exec-9]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-19 07:31:43,813[io-8080-exec-10]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session
2018-09-19 07:31:53,812[nio-8080-exec-5]  [WARN ] [a.c.w.i.BasicWebSocketEndpoint 99]   - Processing messing within unidentified session

It looks like a websocket connection error?

lf1029698952 commented 6 years ago

workspace pod:

root@workspacel0iewbh6cq4g847f:/projects# ps ax
    PID TTY      STAT   TIME COMMAND
      1 ?        Ss     0:00 /pause
      7 ?        Ss     0:00 /bin/sh -c tail -f /dev/null
     16 ?        S      0:00 /usr/sbin/sshd -D
     17 ?        S      0:00 tail -f /dev/null
    103 pts/0    Ss     0:00 bash
    126 pts/0    R+     0:00 ps ax

root@workspacel0iewbh6cq4g847f:/workspace_logs/bootstrapper# cat bootstrapper.log
/tmp/bootstrapper/bootstrapper: 1: /tmp/bootstrapper/bootstrapper: cannot open !DOCTYPE: No such file
/tmp/bootstrapper/bootstrapper: 1: /tmp/bootstrapper/bootstrapper: HTML: not found
/tmp/bootstrapper/bootstrapper: 2: /tmp/bootstrapper/bootstrapper: cannot open html: No such file
/tmp/bootstrapper/bootstrapper: 3: /tmp/bootstrapper/bootstrapper: Syntax error: redirection unexpected
ghost commented 6 years ago

@lf1029698952

What's the URL you use to access Che?

http://keycloak-eclipse.zhubajie.la or http://keycloak-eclipse.zhubajie.la:9000?

If it is the latter, update your configmap.
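
For example, a quick way to verify (a sketch, assuming the configmap is named che and lives in the eclipse namespace, as shown later in this thread):

# show the Keycloak-related entries currently in the configmap
kubectl -n eclipse get configmap che -o yaml | grep -i keycloak
# if CHE_KEYCLOAK_AUTH__SERVER__URL still contains the :9000 port, edit it and restart the che pod
kubectl -n eclipse edit configmap che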

lf1029698952 commented 6 years ago

I have added HAProxy as an ELB to proxy port 9000, so my URL is http://keycloak-eclipse.zhubajie.la

ghost commented 6 years ago

But in your configmap you have two kinds of URLs.

lf1029698952 commented 6 years ago

Sorry, I have updated my env configmap. This is the latest env configmap:

apiVersion: v1
data:
  CHE_API: http://che-eclipse.zhubajie.la/api
  CHE_DEBUG_SERVER: "true"
  CHE_HOST: che-eclipse.zhubajie.la
  CHE_INFRA_KUBERNETES_BOOTSTRAPPER_BINARY__URL: http://che-eclipse.zhubajie.la/agent-binaries/linux_amd64/bootstrapper/bootstrapper
  CHE_INFRA_KUBERNETES_INGRESS_ANNOTATIONS__JSON: '{"kubernetes.io/ingress.class":
    "nginx", "nginx.ingress.kubernetes.io/rewrite-target": "/","nginx.ingress.kubernetes.io/ssl-redirect":
    "false","nginx.ingress.kubernetes.io/proxy-connect-timeout": "3600","nginx.ingress.kubernetes.io/proxy-read-timeout":
    "3600"}'
  CHE_INFRA_KUBERNETES_INGRESS_DOMAIN: zhubajie.la
  CHE_INFRA_KUBERNETES_MACHINE__START__TIMEOUT__MIN: "5"
  CHE_INFRA_KUBERNETES_MASTER__URL: ""
  CHE_INFRA_KUBERNETES_NAMESPACE: eclipse
  CHE_INFRA_KUBERNETES_POD_SECURITY__CONTEXT_FS__GROUP: "0"
  CHE_INFRA_KUBERNETES_POD_SECURITY__CONTEXT_RUN__AS__USER: "0"
  CHE_INFRA_KUBERNETES_PVC_PRECREATE__SUBPATHS: "false"
  CHE_INFRA_KUBERNETES_PVC_QUANTITY: 10Gi
  CHE_INFRA_KUBERNETES_PVC_STRATEGY: unique
  CHE_INFRA_KUBERNETES_SERVER__STRATEGY: multi-host
  CHE_INFRA_KUBERNETES_TLS__ENABLED: "false"
  CHE_INFRA_KUBERNETES_TLS__SECRET: ""
  CHE_INFRA_KUBERNETES_TRUST__CERTS: "false"
  CHE_INFRASTRUCTURE_ACTIVE: kubernetes
  CHE_KEYCLOAK_AUTH__SERVER__URL: http://keycloak-eclipse.zhubajie.la/auth
  CHE_KEYCLOAK_CLIENT__ID: che-public
  CHE_KEYCLOAK_REALM: che
  CHE_LIMITS_WORKSPACE_IDLE_TIMEOUT: "-1"
  CHE_LOCAL_CONF_DIR: /etc/conf
  CHE_LOG_LEVEL: INFO
  CHE_LOGS_APPENDERS_IMPL: plaintext
  CHE_LOGS_DIR: /data/logs
  CHE_MULTIUSER: "true"
  CHE_OAUTH_GITHUB_CLIENTID: ""
  CHE_OAUTH_GITHUB_CLIENTSECRET: ""
  CHE_PORT: "8080"
  CHE_PREDEFINED_STACKS_RELOAD__ON__START: "false"
  CHE_WEBSOCKET_ENDPOINT: ws://che-eclipse.zhubajie.la/api/websocket
  CHE_WORKSPACE_AUTO_START: "false"
  CHE_WORKSPACE_HTTP__PROXY: ""
  CHE_WORKSPACE_HTTPS__PROXY: ""
  CHE_WORKSPACE_NO__PROXY: ""
  JAVA_OPTS: '-XX:MaxRAMFraction=2 -XX:+UseParallelGC -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20
    -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90 -XX:+UnlockExperimentalVMOptions
    -XX:+UseCGroupMemoryLimitForHeap -Dsun.zip.disableMemoryMapping=true -Xms20m '
kind: ConfigMap
metadata:
  creationTimestamp: 2018-09-11T08:38:46Z
  labels:
    app: che
  name: che
  namespace: eclipse

and che-eclipse.zhubajie.la and keycloak-eclipse.zhubajie.la are resolved to my ELB IP

ghost commented 6 years ago

Since the workspace container is created and the bootstrapper is curled and runs, can you check the bootstrapper logs once again with this new, correct configmap?
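
For example (a sketch, assuming the eclipse namespace and the bootstrapper log path shown earlier):

kubectl -n eclipse get pods
# substitute the actual workspace pod name reported by the previous command
kubectl -n eclipse exec -it <workspace-pod-name> -- cat /workspace_logs/bootstrapper/bootstrapper.log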

lf1029698952 commented 6 years ago

Thanks. I ran kubectl logs on the workspace pod and there is no output. Exec'ing into the workspace pod:

root@workspaceje1hta9b3fkojzr5:/workspace_logs/bootstrapper# cat bootstrapper.log
/tmp/bootstrapper/bootstrapper: 1: /tmp/bootstrapper/bootstrapper: cannot open !DOCTYPE: No such file
/tmp/bootstrapper/bootstrapper: 1: /tmp/bootstrapper/bootstrapper: HTML: not found
/tmp/bootstrapper/bootstrapper: 2: /tmp/bootstrapper/bootstrapper: cannot open html: No such file
/tmp/bootstrapper/bootstrapper: 3: /tmp/bootstrapper/bootstrapper: Syntax error: redirection unexpected
root@workspaceje1hta9b3fkojzr5:/workspace_logs/bootstrapper# ps -ef
UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 09:52 ?        00:00:00 /pause
root           7       0  0 09:52 ?        00:00:00 /bin/sh -c tail -f /dev/null
root          16       7  0 09:52 ?        00:00:00 /usr/sbin/sshd -D
root          17       7  0 09:52 ?        00:00:00 tail -f /dev/null
root         103       0  0 09:53 pts/0    00:00:00 bash
root         131     103  0 09:54 pts/0    00:00:00 ps -ef

kubectl get event -n eclipse:

  deployment-controller      Scaled down replica set workspace7uho6zrlg6rzxf5j.dockerimage-1236242367 to 0
8m          8m           1         workspaceje1hta9b3fkojzr5.dockerimage-657533721-7d6bx.1555c45c28e5cc27    Pod                                       Normal    Scheduled               default-scheduler          Successfully assigned workspaceje1hta9b3fkojzr5.dockerimage-657533721-7d6bx to 172.20.40.103
8m          8m           1         workspaceje1hta9b3fkojzr5.dockerimage-657533721-7d6bx.1555c45dab442fbd    Pod                                       Normal    SuccessfulMountVolume   kubelet, 172.20.40.103     MountVolume.SetUp succeeded for volume "default-token-j290p" 
8m          8m           1         workspaceje1hta9b3fkojzr5.dockerimage-657533721-7d6bx.1555c45e925f30ab    Pod                                       Normal    SuccessfulMountVolume   kubelet, 172.20.40.103     MountVolume.SetUp succeeded for volume "pvc-851925b6-b738-11e8-a3bc-fa163ea31113" 
8m          8m           2         workspaceje1hta9b3fkojzr5.dockerimage-657533721-7d6bx.1555c45f24fb3c90    Pod                                       Normal    SuccessfulMountVolume   kubelet, 172.20.40.103     MountVolume.SetUp succeeded for volume "pvc-8516b5ad-b738-11e8-a3bc-fa163ea31113" 
8m          8m           1         workspaceje1hta9b3fkojzr5.dockerimage-657533721-7d6bx.1555c45fe606859a    Pod          spec.containers{container}   Normal    Pulling                 kubelet, 172.20.40.103     pulling image "hub.zhubajie.la/eclipse/ubuntu_jdk8"
8m          8m           1         workspaceje1hta9b3fkojzr5.dockerimage-657533721-7d6bx.1555c4604a49158c    Pod          spec.containers{container}   Normal    Pulled                  kubelet, 172.20.40.103     Successfully pulled image "hub.zhubajie.la/eclipse/ubuntu_jdk8"
8m          8m           1         workspaceje1hta9b3fkojzr5.dockerimage-657533721-7d6bx.1555c460548c8ea9    Pod          spec.containers{container}   Normal    Created                 kubelet, 172.20.40.103     Created container
8m          8m           1         workspaceje1hta9b3fkojzr5.dockerimage-657533721-7d6bx.1555c46075a03c0a    Pod          spec.containers{container}   Normal    Started                 kubelet, 172.20.40.103     Started container
8m          8m           1         workspaceje1hta9b3fkojzr5.dockerimage-657533721.1555c45c28a332e8          ReplicaSet                                Normal    SuccessfulCreate        replicaset-controller      Created pod: workspaceje1hta9b3fkojzr5.dockerimage-657533721-7d6bx
48s         48s          1         workspaceje1hta9b3fkojzr5.dockerimage-657533721.1555c4cc06267363          ReplicaSet                                Normal    SuccessfulDelete        replicaset-controller      Deleted pod: workspaceje1hta9b3fkojzr5.dockerimage-657533721-7d6bx

and the Chrome console output:

java-test:1 Uncaught (in promise) undefined
Promise.then (async)
e.connect @ che-json-rpc-master-api.ts:122
(anonymous) @ che-json-rpc-master-api.ts:51
(anonymous) @ websocket-client.ts:107
e.callHandlers @ websocket-client.ts:107
(anonymous) @ websocket-client.ts:51
index.js:123 WebSocket connection to 'ws://che-eclipse.zhubajie.la/api/websocket?token=eyJhbGciOiJSUzI1NiIsInR5cCIgOiAiSldUIiwia2lkIiA6ICJlMjNGc3kzRlI5dnRUZms3TGlkX1lQOGU0cDNoY0psM20wQTRnckIzNnJJIn0.eyJqdGkiOiJkM2VhYTU5Ni1lNGQ2LTRmNWMtOGVlNS0wMThhMmYwOTMwNTEiLCJleHAiOjE1MzczNTExNzEsIm5iZiI6MCwiaWF0IjoxNTM3MzUwODcxLCJpc3MiOiJodHRwOi8va2V5Y2xvYWstZWNsaXBzZS56aHViYWppZS5sYS9hdXRoL3JlYWxtcy9jaGUiLCJhdWQiOiJjaGUtcHVibGljIiwic3ViIjoiYjA3ZTNhNTgtZWQ1MC00YTZlLWJlMTctZmNmNDlmZjhiMjQyIiwidHlwIjoiQmVhcmVyIiwiYXpwIjoiY2hlLXB1YmxpYyIsIm5vbmNlIjoiZmUzMTUzM2...FkbWluIEFkbWluIiwicHJlZmVycmVkX3VzZXJuYW1lIjoiYWRtaW4iLCJnaXZlbl9uYW1lIjoiQWRtaW4iLCJmYW1pbHlfbmFtZSI6IkFkbWluIiwiZW1haWwiOiJhZG1pbkBhZG1pbi5jb20ifQ.Bsmev9ZbIDorlJtYVOsAtQvLNZXnd_rj2cb1cl5xVJrR9CUH1k7jgfysOr4pGo1Anyu2xP_wvOMZuF7SjVdzB7B76kP-A3n1kULF0BlWiIK8PeHCMtVp43HbM4Ta-_x56_tHiHPSVYVWIaDOch1JK3LUooZGRffAEA0gnep1cd-H_tDaOIyxO1y8q9qnL2f4nvEQAYNk-0frLrFs6Kwhv6tWOIOUa8Y8psT-YsOhBxVo0f9Yxjju4geSe5vLfYiE0mwKOyGyX2gCmu_U7J3ovbm7kjmL-BXrYZkCiVF-UOtszDzbsJySRAOxTtgmagjcx0smNrZS_ABD3p5MJIFQOA&clientId=871608858' failed: HTTP Authentication failed; no valid credentials available

My Keycloak che-public client settings: image

ghost commented 6 years ago

@lf1029698952 the bootstrapper definitely fails to start, but I have no idea why; the log is pretty weird.
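
The "cannot open !DOCTYPE" / "HTML: not found" lines look as if the file written to /tmp/bootstrapper/bootstrapper might be an HTML page rather than the bootstrapper binary. A quick check from inside the workspace pod (a sketch, assuming the bootstrapper URL configured in the configmap above):

# inspect the first bytes of the downloaded file
head -c 200 /tmp/bootstrapper/bootstrapper
# check what the configured bootstrapper URL actually returns
curl -sI http://che-eclipse.zhubajie.la/agent-binaries/linux_amd64/bootstrapper/bootstrapper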

lf1029698952 commented 6 years ago

Thanks for your answer. I will switch to another cluster and reinstall.

ramene commented 6 years ago

@lf1029698952 I've been fighting a similar issue deploying against AWS EKS. I ran into the same websocket problems and learned they tie back to the ELB. It's interesting that you're still hitting the issue after putting HAProxy in front, as I had the same idea but had to walk away...

With respect to Web Origins in Keycloak, I found that adding a single wildcard asterisk * got me past the "Authorization token is missed" error

screen shot 2018-09-20 at 3 09 00 pm

A good thread to follow with respect to the websockets is: https://github.com/eclipse/che/issues/10300 and make note of this comment: https://github.com/eclipse/che/issues/10300#issuecomment-403640270

At the end of the day, with the websocket issue still persisting, I was able to get the workspace(s) started, albeit with all the functionality lost, IF I removed the embedded terminal before starting the workspace. All the other extensions seemingly work, but they are all contingent on the embedded terminal and communicate via ws://

screen shot 2018-09-11 at 3 58 52 am

AFTER starting the workspace without the embedded terminal and then going back in and re-enabling the embedded terminal ... the workspace restarts successfully, but then, interestingly enough, I see this after the IDE opens:

screen shot 2018-09-11 at 2 19 31 am

I'll circle back after I get past this last issue, although there is more work to follow with respect to Compose if you want to use the multi-machine stacks in Che.

@lf1029698952 - It might be worth joining the mattermost channel too: https://mattermost.eclipse.org/eclipse/channels/eclipse-che

lf1029698952 commented 6 years ago

@ramene Thanks for your reply, image

java-test:1 Uncaught (in promise) undefined
Promise.then (async)
e.connect @ che-json-rpc-master-api.ts:122
(anonymous) @ che-json-rpc-master-api.ts:51
(anonymous) @ websocket-client.ts:107
e.callHandlers @ websocket-client.ts:107
(anonymous) @ websocket-client.ts:51
index.js:123 WebSocket connection to 'ws://che-eclipse.zhubajie.la/api/websocket?token=.....Bsmev9ZbIDorlJtYVOsAtQvLNZXnd_rj2cb1cl5xVJrR9CUH1k7jgfysOr4pGo1Anyu2xP_wvOMZuF7SjVdzB7B76kP-A&clientId=871608858' failed: HTTP Authentication failed; no valid credentials available

It looks like a websocket authentication issue. I changed the Web Origins URL to "*", but it did not solve the problem.

lf1029698952 commented 6 years ago

@eivantsov Hello, I see that the che-server access log has many websocket requests returning 401. How should Keycloak be configured in multi-user mode? Thanks very much. image

lf1029698952 commented 6 years ago

When I deploy Che 6.13-SNAPSHOT on Kubernetes v1.9.7, I get:

Error: Failed to run the workspace: "Server 'exec-agent/http' in machine 'dev-machine' not available."

image

sleshchenko commented 6 years ago

@lf1029698952 This error means that the ws-agent Tomcat failed to start or is unavailable to the Che Server. Could you check the workspace agent logs in /workspace_logs/ws-agent/logs/catalina.log?
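
For example (a sketch, assuming the eclipse namespace; substitute the actual workspace pod name):

kubectl -n eclipse exec -it <workspace-pod-name> -- cat /workspace_logs/ws-agent/logs/catalina.log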

ghost commented 6 years ago

@sleshchenko that's the exec agent, but it does not matter.

@lf1029698952 do you see ingresses created when a workspace starts? There's some sort of connectivity issue, I think. If all processes start OK in the workspace (you may exec into the ws pod and run ps ax there), then it's the Che master that cannot verify the agent is up. Make sure the ingresses can be reached from within the pods.
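
For example (a sketch, assuming the eclipse namespace; <ingress-host> is whichever host kubectl lists for the workspace servers):

# list the ingresses created for the workspace
kubectl -n eclipse get ingress
# from inside the workspace pod, check that a server ingress host resolves and responds
kubectl -n eclipse exec -it <workspace-pod-name> -- curl -v http://<ingress-host>/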

lf1029698952 commented 6 years ago

Thanks for your reply. It was indeed a problem with the websocket connection to the Che server. In my multi-host setup a workspace creates five ingresses, and I had to resolve those ingress hosts to my ELB; after that my workspace runs successfully. image But I encountered the same problem as @ramene. Have you solved it? image image

lf1029698952 commented 6 years ago

Thanks very much, I have solved this problem. When I changed multi-host to single-host, the ingress uses only one domain; I resolved it to my ELB and the websocket connection succeeded. I think that with multi-host's many ingress domains, the problem lies in how the nginx ingress dynamically adds the domain configuration and forwards the traffic.
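
For reference, a minimal sketch of the change described above, assuming the che configmap in the eclipse namespace and that the Che server pod carries the app=che label; the server needs a restart to pick up the new value:

# switch the server strategy from multi-host to single-host
kubectl -n eclipse patch configmap che --type merge -p '{"data":{"CHE_INFRA_KUBERNETES_SERVER__STRATEGY":"single-host"}}'
# restart the Che server pod so it reloads the configmap (assumes the app=che label)
kubectl -n eclipse delete pod -l app=che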