eclipse-che / che

Kubernetes based Cloud Development Environments for Enterprise Teams
http://eclipse.org/che
Eclipse Public License 2.0

Error: Failed to start the workspace: "Unknown error" #13197

Closed lysannef closed 5 years ago

lysannef commented 5 years ago

Description

Unable to create a workspace from the Che dashboard. The following error is shown: Error: Failed to start the workspace: "Unknown error"

Reproduction Steps

OS and version:
arch: ppc64le
os: ubuntu:16.04
image build environment: docker container

Diagnostics/Logs:

2019-04-22 09:31:36,154[nio-8080-exec-9]  [ERROR] [o.a.c.c.C.[.[.[/api].[default] 236]  - Servlet.service() for servlet [default] in context with path [/api] threw exception [java.lang.NoClassDefFoundError: Could not initialize class org.eclipse.che.infrastructure.docker.client.CLibraryFactory] with root cause
java.lang.NoClassDefFoundError: Could not initialize class org.eclipse.che.infrastructure.docker.client.CLibraryFactory
        at org.eclipse.che.infrastructure.docker.client.connection.UnixSocketConnection.connect(UnixSocketConnection.java:66)
        at org.eclipse.che.infrastructure.docker.client.connection.UnixSocketConnection.request(UnixSocketConnection.java:49)
        at org.eclipse.che.infrastructure.docker.client.connection.DockerConnection.request(DockerConnection.java:92)
        at org.eclipse.che.infrastructure.docker.client.DockerConnector.listContainers(DockerConnector.java:284)
        at org.eclipse.che.workspace.infrastructure.docker.container.DockerContainers.listNonStoppedContainers(DockerContainers.java:74)
        at org.eclipse.che.workspace.infrastructure.docker.container.DockerContainers.find(DockerContainers.java:47)
        at org.eclipse.che.workspace.infrastructure.docker.DockerRuntimeContext.getRuntime(DockerRuntimeContext.java:84)
        at org.eclipse.che.workspace.infrastructure.docker.DockerRuntimeContext.getRuntime(DockerRuntimeContext.java:38)
        at org.eclipse.che.api.workspace.server.WorkspaceRuntimes.startAsync(WorkspaceRuntimes.java:370)
        at org.eclipse.che.api.workspace.server.WorkspaceManager.startAsync(WorkspaceManager.java:377)
        at org.eclipse.che.api.workspace.server.WorkspaceManager.startWorkspace(WorkspaceManager.java:307)
        at org.eclipse.che.api.workspace.server.WorkspaceService.startById(WorkspaceService.java:349)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.everrest.core.impl.method.DefaultMethodInvoker.invokeMethod(DefaultMethodInvoker.java:140)
        at org.everrest.core.impl.method.DefaultMethodInvoker.invokeMethod(DefaultMethodInvoker.java:60)
        at org.everrest.core.impl.RequestDispatcher.doInvokeResource(RequestDispatcher.java:306)
        at org.everrest.core.impl.RequestDispatcher.invokeSubResourceMethod(RequestDispatcher.java:297)
        at org.everrest.core.impl.RequestDispatcher.dispatch(RequestDispatcher.java:233)
        at org.everrest.core.impl.RequestDispatcher.dispatch(RequestDispatcher.java:128)
        at org.everrest.core.impl.RequestHandlerImpl.handleRequest(RequestHandlerImpl.java:62)
        at org.everrest.core.impl.EverrestProcessor.process(EverrestProcessor.java:120)
        at org.everrest.core.servlet.EverrestServlet.service(EverrestServlet.java:61)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:742)
        at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:290)
        at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:280)
        at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:184)
        at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:89)
        at org.eclipse.che.api.local.filters.EnvironmentInitializationFilter.doFilter(EnvironmentInitializationFilter.java:64)
        at org.eclipse.che.commons.logback.filter.RequestIdLoggerFilter.doFilter(RequestIdLoggerFilter.java:50)
        at org.apache.catalina.filters.CorsFilter.handleNonCORS(CorsFilter.java:364)
        at org.apache.catalina.filters.CorsFilter.doFilter(CorsFilter.java:170)
        at org.eclipse.che.api.core.cors.CheCorsFilter.doFilter(CheCorsFilter.java:58)
        at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:121)
        at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:133)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:193)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:166)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:198)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:96)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:493)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:140)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:81)
        at ch.qos.logback.access.tomcat.LogbackValve.invoke(LogbackValve.java:256)
        at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:685)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:87)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:342)
        at org.apache.coyote.http11.Http11Processor.service(Http11Processor.java:800)
        at org.apache.coyote.AbstractProcessorLight.process(AbstractProcessorLight.java:66)
        at org.apache.coyote.AbstractProtocol$ConnectionHandler.process(AbstractProtocol.java:806)
        at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1498)
        at org.apache.tomcat.util.net.SocketProcessorBase.run(SocketProcessorBase.java:49)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:748)

ghatwala commented 5 years ago

Please let us know if you have any pointers to share on this: @benoitf, @skabashnyuk, @TylerJewell?

skabashnyuk commented 5 years ago

Hello @ghatwala, thank you for your report. From what I can see, there are a couple of reasons that may cause this failure. First, I'm not sure we ever tested Che on arch: ppc64le. Second, Docker has become a legacy infrastructure in master at this point. Can you re-evaluate it on Kubernetes running on x86_64?

lysannef commented 5 years ago

@skabashnyuk We completed the validation on x86 using the same docker run command, and the same functionality (i.e. creating a workspace) works from the dashboard. However, it fails on a ppc64le machine, and our intention is to get eclipse/che fully functional on ppc64le via the Docker deployment method.

The following are the changes made to the Dockerfile to be built and used on power:

-    CHE_IN_CONTAINER=true
+    CHE_IN_CONTAINER=true \
+    ARCH="`arch`"

 RUN echo "http://dl-4.alpinelinux.org/alpine/edge/community" >> /etc/apk/repositories && \
     apk add --update curl openssl sudo bash && \
-    curl -sSL "https://${DOCKER_BUCKET}/builds/Linux/x86_64/docker-${DOCKER_VERSION}" -o /usr/bin/docker && \
+    curl -sSL "https://${DOCKER_BUCKET}/builds/Linux/$ARCH/docker-${DOCKER_VERSION}" -o /usr/bin/docker && \
     chmod +x /usr/bin/docker && \
     echo "%root ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers && \
     rm -rf /tmp/* /var/cache/apk/*

Is there something missing from our changed Dockerfile that, when added, might resolve this issue? We would appreciate some pointers.
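One thing worth checking in the fragment above, offered as a guess rather than a diagnosis: if the ARCH line belongs to an ENV instruction (the diff does not show the instruction keyword), Docker stores the value literally and does not run the backticks as a command. The later $ARCH in the curl URL would then expand to the text `arch`, backticks included, instead of ppc64le. The Python sketch below only mirrors how that URL is assembled so both outcomes can be compared. The function name, the bucket, and the version are placeholders, and nothing in this thread confirms that a ppc64le build is published at that path.

```python
import platform


def docker_binary_url(bucket, version, machine=None):
    """Assemble the download URL the same way the curl line does.

    `machine` defaults to what `arch` / `uname -m` would report on this host.
    """
    if machine is None:
        machine = platform.machine()
    return f"https://{bucket}/builds/Linux/{machine}/docker-{version}"


# Placeholder bucket and version, for illustration only.
evaluated = docker_binary_url("bucket.example", "1.2.3", "ppc64le")
literal = docker_binary_url("bucket.example", "1.2.3", "`arch`")

print(evaluated)  # the path segment is the architecture name
print(literal)    # the path segment still contains the backticks
```

Computing the architecture inside the RUN step itself, for example with $(uname -m), would avoid relying on ENV to evaluate it.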

skabashnyuk commented 5 years ago

@mmorhun AFAIK you have some experience porting Che to architectures other than x86_64; maybe you can help in this case?

mmorhun commented 5 years ago

I ported Che 6 to ARM (RPi 3b) last summer, but I haven't looked deeply at Che 7 from that perspective.

lysannef commented 5 years ago

@mmorhun We have tried it with Che 6 as well, on a ppc64le machine, and are facing the same issue. Please find attached a few screenshots; we may have missed something you could catch:

We are creating a default blank stack, which comes with an eclipse/ubuntu_jdk machine by default. When run, the workspace fails with: Failed to start the workspace: "Unknown error"

[Screenshots: che-1, che-2, che-3]

After trying a few things, we thought the machine recipe might be causing the problem, so we changed it to a Power-specific image, 'ppc64le/openjdk:8'. However, we are faced with the same error.

[Screenshots: che-4, che-6]

Also attaching the config for the workspace created:

{
  "defaultEnv": "default",
  "environments": {
    "default": {
      "machines": {
        "dev-machine": {
          "attributes": {
            "memoryLimitBytes": "2147483648"
          },
          "servers": {
            "tomcat8-debug": {
              "attributes": {},
              "port": "8000",
              "protocol": "http"
            },
            "tomcat8": {
              "attributes": {},
              "port": "8080",
              "protocol": "http"
            }
          },
          "volumes": {},
          "installers": [
            "org.eclipse.che.exec",
            "org.eclipse.che.terminal",
            "org.eclipse.che.ws-agent"
          ],
          "env": {}
        }
      },
      "recipe": {
        "type": "dockerimage",
        "content": "ppc64le/openjdk:8"
      }
    }
  },
  "projects": [],
  "name": "wksp-2d3e",
  "attributes": {},
  "commands": [],
  "links": []
}

Kindly let us know if there's anything we are missing.

mmorhun commented 5 years ago

@lysannef Che 6 has some architecture-specific binaries: the exec agent and the terminal agent at least. They are written in Go, so you need to recompile them for your architecture and include them in the Che master (replacing the x86_64 binaries), so that workspaces use them and not the x86_64 versions. Also, all images that are used should have binaries for your architecture.
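For the Go agents, cross-compilation is normally driven by the GOOS and GOARCH environment variables. A minimal sketch, assuming the agents build with a plain go build; the helper name and the mapping table are illustrative and not part of Che:

```python
import platform

# Names reported by `uname -m`, mapped to the GOARCH value Go uses for them.
UNAME_TO_GOARCH = {
    "x86_64": "amd64",
    "aarch64": "arm64",
    "ppc64le": "ppc64le",
    "s390x": "s390x",
}


def go_build_env(machine=None):
    """Return the environment variables that select a Linux build target."""
    if machine is None:
        machine = platform.machine()
    try:
        goarch = UNAME_TO_GOARCH[machine]
    except KeyError:
        raise ValueError(f"no GOARCH mapping for {machine!r}") from None
    return {"GOOS": "linux", "GOARCH": goarch}


print(go_build_env("ppc64le"))
```

The returned dictionary can be merged into os.environ and passed as env to subprocess.run(["go", "build"], ...) from the agent's source directory.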

But I would recommend you to look at Che 7.

lysannef commented 5 years ago

@mmorhun We switched to Che 7 and built ppc64le binaries for the terminal agent and the exec agent. We also compiled the ws-agent code. After replacing the existing binaries with the newly built ones and rebuilding the che-server image, the same error is still thrown.

Also all images which are used should have binaries for your architecture.

Regarding the image binaries you mentioned above, could you clarify a little more? Do you mean the binaries used while building the image?

Is it possible to share how you worked out which agents are invoked, and where and how they are used in the image?

mmorhun commented 5 years ago

We shifted to the Che 7 version and have generated the binaries for ppc64le architecture for the terminal agent and the exec-agent.

Che 7 doesn't have agents at all. If you have them, you are using Che 6 (Che 7 is still backward compatible, but with warnings that such support will be dropped soon).

Replacing this newly generated binaries with the existing code and rebuilding the che-server image still throws the same error.

Sorry, cannot understand what's happening there. Could you elaborate more please?

Also about the image binaries you specified above, could we please get a little more clarification on the same. Is it binaries used while building the image?

I meant that all Docker images should be rebuilt for the target architecture, and the rebuilt images should be used. Be careful with what you add or copy during the build phase, so as not to mix up architectures.

Is it possible to share with us how you got the flow of what agents are being invoked, where and how they are being used in the image.

Unfortunately I don't remember the full flow; it was almost a year ago. My advice is to try what you have and look at the errors in the logs. As far as I understand, the problem is not in the dashboard; it is deeper. The Che Dashboard is only a UI for the Che API. Take a look at the Che server logs and the workspace logs, if any. Then try to figure out what's wrong and why. At least that's how I did it.

mmorhun commented 5 years ago

As for the list of agents, you have them in the installers section of your workspace config.
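To see which agents a workspace will request, the installers arrays can be read straight from the exported config. A small sketch, assuming the config has the same shape as the JSON posted earlier in this thread; the function name is made up for illustration:

```python
import json


def list_installers(config):
    """Map 'environment/machine' to the installers declared for that machine."""
    found = {}
    for env_name, env in config.get("environments", {}).items():
        for machine_name, machine in env.get("machines", {}).items():
            found[f"{env_name}/{machine_name}"] = list(machine.get("installers", []))
    return found


# Trimmed copy of the workspace config shown earlier in the thread.
SAMPLE = json.loads("""
{
  "defaultEnv": "default",
  "environments": {
    "default": {
      "machines": {
        "dev-machine": {
          "installers": [
            "org.eclipse.che.exec",
            "org.eclipse.che.terminal",
            "org.eclipse.che.ws-agent"
          ]
        }
      },
      "recipe": {"type": "dockerimage", "content": "ppc64le/openjdk:8"}
    }
  }
}
""")

for machine, installers in list_installers(SAMPLE).items():
    print(machine, installers)
```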

l0rd commented 5 years ago

As @mmorhun and @skabashnyuk mentioned, the work needed to port Che workspaces to a different architecture depends on the Che version you are targeting (v6 or v7). My recommendation is to target Che 7 because this is the future. Che 6 workspaces will be deprecated (not usable anymore) in the next Che release.

Anyway, I strongly support the work you are doing. This is something that will be useful for anyone who wants to port Che to a new architecture. Hence I would like to propose working on a How-To document that will eventually be part of the Che documentation. If you are OK with that, please reach out on Mattermost (you can ping me directly as @mario there) so that we can discuss how to proceed.

lysannef commented 5 years ago

As for agents list, you have them in installers section of your workspace config.

So this won't be applicable if we are using Che 7? I won't have to generate binaries for each of the agents?

Sorry, cannot understand what's happening there. Could you elaborate more please?

The eclipse-che folder loaded into the che-server image is one generated by an mvn clean install of the assembly code (che/assembly/assembly-main).

Dockerfile for generating che-server image:

ADD eclipse-che /home/user/eclipse-che
RUN find /home/user -type d -exec chmod 777 {} \;

This folder contains the x86 agent binaries. I recompiled the binaries for the ppc64le architecture and replaced the packaged exec and terminal agent binaries with them. Executing these binaries on Power worked well. Workspace creation, however, still threw an error. Also, if agents are not needed for Che 7, wouldn't the folder loaded into the Docker image still contain x86 binaries?
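One way to check which architecture the binaries in the assembly folder were built for is to read the ELF header of each file. A rough sketch under stated assumptions: the function names are illustrative, only a handful of machine codes are mapped, the folder name is the one used in the Dockerfile above, and binaries packed inside archives are not inspected.

```python
import pathlib
import struct

# e_machine values from the ELF specification (subset).
ELF_MACHINES = {3: "x86", 21: "ppc64", 40: "arm", 62: "x86_64", 183: "aarch64"}


def elf_arch(header):
    """Return an architecture name for the first 20 bytes of a file, or None if not ELF."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    little_endian = header[5] == 1  # EI_DATA: 1 = little endian, 2 = big endian
    (machine,) = struct.unpack("<H" if little_endian else ">H", header[18:20])
    name = ELF_MACHINES.get(machine, f"unknown({machine})")
    if name == "ppc64" and little_endian:
        name = "ppc64le"
    return name


def scan(root):
    """Yield (path, architecture) for every ELF file below root."""
    for path in sorted(pathlib.Path(root).rglob("*")):
        if path.is_file() and not path.is_symlink():
            with path.open("rb") as handle:
                arch = elf_arch(handle.read(20))
            if arch is not None:
                yield path, arch


if __name__ == "__main__":
    for path, arch in scan("eclipse-che"):
        print(f"{arch:10} {path}")
```

Any line reporting x86_64 on a ppc64le host points at a binary that was not rebuilt.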

Take a look at Che server logs, Workspace logs if any. Then try to figure out what's wrong and why. At least that how I did it.

@mmorhun Since we are running this as Docker images, the only log I have found is the docker logs output of the container, which is attached to this issue. Is there anywhere else I can obtain the che-server logs? Is there a command to make them verbose? I will dig deeper into the logs I have, but there does not seem to be much in them.

lysannef commented 5 years ago

Hence I would like to propose to work on a How-To document that will eventually be part of Che documentation. If you are ok with that please reach out on mattermost (you can ping me directly as @mario there so that we can discuss about how to proceed).

@l0rd I would be really interested in creating such a document; however, my primary concern at the moment is resolving the issues we are facing with Che functionality. Porting Che to ppc64le has become a very high priority task. Once it is complete, I can work with you on creating the document :-)

skabashnyuk commented 5 years ago

@lysannef

2019-04-22 09:31:36,154[nio-8080-exec-9]  [ERROR] [o.a.c.c.C.[.[.[/api].[default] 236]

Are those all the logs you have from Che server startup? Can you share more? Can you try with the Kubernetes infrastructure instead of Docker?

mmorhun commented 5 years ago

@lysannef

This won't be applicable if we using Che 7? I wont have to generate binaries for each of the agents.

Yes, exactly. You don't need to build the exec binaries; instead you'll need to build the exec container that is used by the corresponding plugin.

The eclipse-che folder loaded into the che-server image is one generated by an mvn clean install of the assembly code (che/assembly/assembly-main).

Yes, you are doing it right. For Che 6, you need to make sure that the agent binaries you built for the target architecture are the ones that get packaged, so that the workspace downloads the proper binaries on start. For Che 7, you need to investigate whether any component has to be treated the same way. For now I cannot see one, but that is only at first glance.

Also if agents are not needed for Che 7 , wouldn't the loaded folder in the docker image still contain x86 binaries

I think they are packed for backward compatibility, but I am not sure. @skabashnyuk should know more. But if you really target Che 7, I think you don't have to care.

About logs: I used docker logs -f for both the Che server container and the starting workspace container.
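Once the docker logs output has been saved to a file, it can help to search it for the first failure rather than the repeated one. In Java, a NoClassDefFoundError that says "Could not initialize class" is raised when an earlier attempt to initialize that class already failed, so the original cause is usually logged earlier, often as an ExceptionInInitializerError. The sketch below is only a text filter. The function name is illustrative, and UnsatisfiedLinkError is included as a guess because the failing class name suggests native library loading.

```python
import re

FAILED_CLASS = re.compile(r"Could not initialize class ([\w.$]+)")
FIRST_FAILURE_MARKERS = ("ExceptionInInitializerError", "UnsatisfiedLinkError")


def summarize(log_text):
    """Return (classes that failed to initialize, lines that may hold the first failure)."""
    failed = set()
    candidates = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        failed.update(FAILED_CLASS.findall(line))
        if any(marker in line for marker in FIRST_FAILURE_MARKERS):
            candidates.append((number, line.strip()))
    return sorted(failed), candidates


# Two lines in the shape of a server log; the first one is invented for the demo.
SAMPLE_LOG = """\
java.lang.ExceptionInInitializerError
java.lang.NoClassDefFoundError: Could not initialize class org.eclipse.che.infrastructure.docker.client.CLibraryFactory
"""

failed, candidates = summarize(SAMPLE_LOG)
print(failed)
print(candidates)
```

If the candidate list comes back empty, the first failure may have been logged before the captured window, so a capture from server startup is worth collecting.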

l0rd commented 5 years ago

@lysannef I have started this doc. Of course I haven't tested it and I may have missed one image so please let me know how it goes

lysannef commented 5 years ago

@lysannef I have started this doc. Of course I haven't tested it and I may have missed one image so please let me know how it goes

@l0rd I'm working on building the additional images mentioned there and will test them. I will keep you updated on how it goes. :-)

lysannef commented 5 years ago

Hi @l0rd, following the doc you provided, we were able to resolve this issue and validate that eclipse/che works on ppc64le.

Thanks; closing this issue.