gitpod-io / gitpod

The developer platform for on-demand cloud development environments to create software faster and more securely.
https://www.gitpod.io
GNU Affero General Public License v3.0

Request createWorkspace failed with message: 14 UNAVAILABLE #2004

Closed jgallucci32 closed 3 years ago

jgallucci32 commented 4 years ago

Describe the bug

When creating a new workspace, the Web UI returns the following error:

Error: Request createWorkspace failed with message: 14 UNAVAILABLE: failed to connect to all addresses

Steps to reproduce

  1. Deploy Gitpod self-hosted (clean install)
  2. Connect to a GitLab repository

Expected behavior

Workspace launches

Additional information

Red Hat 7.8 (Docker CE 19.03.13), self-hosted installation (0.4.0) integrated with GitLab (on-premise)

URL: https://gitpod.domain.local/#https://gitlab.domain.local/joesmith/myrepo/-/tree/master/

Error from server

{"@type":"type.googleapis.com/google.devtools.clouderrorreporting.v1beta1.ReportedErrorEvent","serviceContext":{"service":"server","version":"v0.4.0"},"stack_trace":"Error: 14 UNAVAILABLE: failed to connect to all addresses
    at Object.exports.createStatusError (/app/node_modules/grpc/src/common.js:91:15)
    at Object.onReceiveStatus (/app/node_modules/grpc/src/client_interceptors.js:1204:28)
    at InterceptingListener._callNext (/app/node_modules/grpc/src/client_interceptors.js:568:42)
    at InterceptingListener.onReceiveStatus (/app/node_modules/grpc/src/client_interceptors.js:618:8)
    at callback (/app/node_modules/grpc/src/client_interceptors.js:845:24)","component":"server","severity":"ERROR","time":"2020-10-15T21:32:37.208Z","environment":"production","region":"local","message":"Request createWorkspace failed with internal server error","error":"Error: 14 UNAVAILABLE: failed to connect to all addresses
    at Object.exports.createStatusError (/app/node_modules/grpc/src/common.js:91:15)
    at Object.onReceiveStatus (/app/node_modules/grpc/src/client_interceptors.js:1204:28)
    at InterceptingListener._callNext (/app/node_modules/grpc/src/client_interceptors.js:568:42)
    at InterceptingListener.onReceiveStatus (/app/node_modules/grpc/src/client_interceptors.js:618:8)
    at callback (/app/node_modules/grpc/src/client_interceptors.js:845:24)","payload":{"method":"createWorkspace","args":[{"contextUrl":"https://gitlab.devlnk.net/john.gallucci/bisf-cli/-/tree/master/","mode":"select-if-running"},{"_isCancelled":false}]}}

Example repository

n/a

philjak commented 4 years ago

I can confirm I have the same issue with my docker-compose self-hosted Gitpod. After fixing issue #1906 this is the current error that pops up.

corneliusludmann commented 4 years ago

Self-hosted installation (0.40) integrated with GitLab (on-premise)

I also had this error with 0.4.0 occasionally. Usually, a re-deploy fixed the problem.

Since 0.5.0 this has never happened to me again. Have you tried to upgrade to version 0.5.0?

See also: https://community.gitpod.io/t/clean-install-and-unable-to-launch-workspace/1547

philjak commented 4 years ago

So I'm using the current latest tag from eu.gcr.io/gitpod-core-dev/build/gitpod-k3s

This uses eu.gcr.io/gitpod-io/self-hosted/theia-server:0.5.0

jgallucci32 commented 4 years ago

I ran some Wireshark captures on the pods. What I find strange is that the server pod is making a tcp/8080 connection to the image-builder and getting connection resets. This is surprising because the image-builder is not listening on tcp/8080, so why would the server pod attempt this connection in the first place?

@corneliusludmann I see only 0.5.0 available for gitpod chart and not gitpod-selfhosted which is only available up to 0.4.0. Will the regular gitpod chart work the same on-premise?

jgallucci32 commented 4 years ago

Well, the server is definitely configured to communicate with the image-builder over tcp/8080. Here is the list of environment variables from within the server pod:

unode@server-85d4499574-p8cf5:/app/node_modules/@typefox/server$ env | grep IMAGE_BUILDER
IMAGE_BUILDER_SERVICE_HOST=10.43.101.132
IMAGE_BUILDER_PORT_8080_TCP_PROTO=tcp
IMAGE_BUILDER_PORT=tcp://10.43.101.132:8080
IMAGE_BUILDER_PORT_8080_TCP_ADDR=10.43.101.132
IMAGE_BUILDER_SERVICE_PORT_RPC=8080
IMAGE_BUILDER_PORT_8080_TCP_PORT=8080
IMAGE_BUILDER_PORT_8080_TCP=tcp://10.43.101.132:8080
IMAGE_BUILDER_SERVICE_PORT=8080

However, I can verify that the image-builder is in fact NOT listening on tcp/8080, so perhaps it has not been deployed correctly (even though it is marked as active).
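
One way to check from inside the server pod whether anything accepts connections on that address is a plain TCP probe. The following is a minimal sketch, assuming bash and coreutils `timeout` are available in the container; the function name `probe_tcp` is made up for illustration, and the fallback values are placeholders used only when the Kubernetes-injected variables are not set.

```shell
# probe_tcp HOST PORT: succeed if a TCP connection can be opened within 3 seconds.
# Relies on bash's built-in /dev/tcp redirection; no data is sent.
probe_tcp() {
  local host="$1" port="$2"
  timeout 3 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# Use the variables Kubernetes injects for the image-builder service
# (the defaults after :- are placeholders, not values from this thread).
if probe_tcp "${IMAGE_BUILDER_SERVICE_HOST:-127.0.0.1}" "${IMAGE_BUILDER_SERVICE_PORT:-8080}"; then
  echo "image-builder port is reachable"
else
  echo "image-builder port is NOT reachable"
fi
```

A failed probe only shows that no connection could be opened; it does not say whether the cause is a missing listener, a reset, or a timeout.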

corneliusludmann commented 4 years ago

Thanks for your analysis, @jgallucci32.

I see only 0.5.0 available for gitpod chart and not gitpod-selfhosted which is only available up to 0.4.0. Will the regular gitpod chart work the same on-premise?

The gitpod-selfhosted repo is deprecated. I just added a note. The Gitpod 0.5.0 helm charts work more or less the same. You'll find sample values.yaml files at https://github.com/gitpod-io/gitpod/tree/master/chart and https://github.com/gitpod-io/gitpod/tree/master/install/helm.

It would be great if you could check if you experience the same with Gitpod 0.5.0. Probably @csweichel could have a look at your findings.

philjak commented 4 years ago

@corneliusludmann I'm not sure if it's helpful, but I'm experiencing the same issue when using the provided docker-compose.yaml. I'm happy to provide some logs if you tell me what you need.

Thanks, philjak

jgallucci32 commented 4 years ago

@corneliusludmann @philjak I was able to get past this issue with a workaround. It came down to the MTU setting of the Docker-in-Docker image being set to 1500 when Kubernetes (which uses Calico networking) has an overlay with MTU 1450 on the base container/pod.

In order to fix this I had to add the flag --mtu=1450 to the entrypoint for the image-builder pod. Here is a snippet of the manifest for it:

      - args:
        - dockerd
        - --userns-remap=default
        - -H tcp://127.0.0.1:2375
        - --mtu=1450

Apparently this is a known issue with K8s + DinD + Alpine when running an apk fetch, as noted in this GitHub issue: https://github.com/gliderlabs/docker-alpine/issues/307
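
The mismatch described above can be checked by comparing the MTU of the pod's network interface with the MTU of the Docker bridge nested inside it. This is a rough sketch, assuming Linux sysfs is readable in the container; the helper names `mtu_of` and `mtu_fits` are hypothetical, and interface names such as `eth0` or `docker0` vary per setup.

```shell
# mtu_of IFACE: print the MTU of a network interface as reported by Linux sysfs.
mtu_of() {
  cat "/sys/class/net/$1/mtu"
}

# mtu_fits INNER OUTER: succeed if the inner (nested Docker) MTU does not
# exceed the outer (pod/overlay) MTU.
mtu_fits() {
  [ "$1" -le "$2" ]
}

# Values reported in this thread: DinD at 1500, overlay at 1450.
if mtu_fits 1500 1450; then
  echo "ok"
else
  echo "inner MTU 1500 exceeds outer MTU 1450: large packets may be dropped"
fi
```

On a live system the literals could be replaced with something like `mtu_fits "$(mtu_of docker0)" "$(mtu_of eth0)"`, with the interface names adjusted to the environment.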

philjak commented 4 years ago

Awesome, @jgallucci32! I can confirm. I edited the image-builder deployment and it's working!

BenjaminBeichler commented 4 years ago

@philjak Could you tell me what exactly needs to be changed in the docker-compose.yaml?

BenjaminBeichler commented 4 years ago

Okay, my current fix is to create a new volume that maps /chart/templates to a local folder and to put a modified image-builder-deployment.yaml there.

But this problem should be fixed in the Docker image. I think the MTU could be changed statically without much impact, even on non-DinD environments.

philjak commented 4 years ago

Thanks, @BenjaminBeichler, for sharing. I really just edited the running deployment, so this was not a persistent change. But until the issue has been fixed, I guess I'll also try using a temporary volume.

corneliusludmann commented 4 years ago

@jgallucci32 Thank you very much for investigating this issue. :+1:

To make sure that this fixes the issue I ran some tests with the docker-compose.yaml setting: Without your fix, in 4 of 10 cases, I get the “failed to connect to all addresses” error after deployment. With your fix, I successfully deployed Gitpod 10 times in a row without this error. It's still a rather small sample, but I am convinced that this fixes the bug. :smile:

As already described, a temporary fix would be to mount a patched image-builder-deployment.yaml into the chart folder. Add this to your docker-compose.yaml volumes section:

- ./image-builder-deployment.yaml:/chart/templates/image-builder-deployment.yaml

You can get this patched file e.g. by running:

$ docker-compose exec gitpod sed 's/"dockerd"/"dockerd", "--mtu=1450"/' /chart/templates/image-builder-deployment.yaml > ../image-builder-deployment.yaml
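
To see what the substitution does before touching the real file, the same sed expression can be tried on a sample line. The `args:` line below is a made-up stand-in and is not copied from the actual chart template, which may be laid out differently; `patch_dockerd_args` is a hypothetical helper name.

```shell
# patch_dockerd_args: read text on stdin and insert "--mtu=1450" right after
# the first "dockerd" entry on each line (same sed expression as above).
patch_dockerd_args() {
  sed 's/"dockerd"/"dockerd", "--mtu=1450"/'
}

# Hypothetical sample line, for illustration only.
echo 'args: ["dockerd", "--userns-remap=default"]' | patch_dockerd_args
# prints: args: ["dockerd", "--mtu=1450", "--userns-remap=default"]
```

If the output of the real command does not contain `--mtu=1450`, the template probably does not quote `dockerd` the way the expression expects.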

I'm going to create a PR to fix this soon.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

braghettos commented 3 years ago

I'm getting the same error on a self-hosted Gitpod setup in a Kubernetes environment. I get an HTTP 101 status code from the API at wss://kerberus-gitpod.ddns.net/api/gitpod. Where can I find the root cause?

braghettos commented 3 years ago

This is the log of the call:

{"upstreamAddr":"10.43.126.88:3000","requestScheme":"https","requestHost":"kerberus-gitpod.ddns.net","requestTime":"40.006","remotePort":"6033","serverName":"kerberus-gitpod.ddns.net","httpRequest":{"requestSize":"776","responseSize":"408","userAgent":"Mozilla\/5.0 (Linux; Android 6.0; Nexus 5 Build\/MRA58N) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/90.0.4430.93 Mobile Safari\/537.36","remoteIp":"10.42.3.0","serverIp":"10.42.1.12","latency":"40.006","protocol":"HTTP\/1.1","requestMethod":"GET","status":"101","requestUrl":"\/api\/gitpod"},"connection":"2235","httpConnection":"Upgrade","cookies":{"_kerberus_gitpod_ddnsnet":"s%3Adf177a24-48ac-41d1-a0b6-ab9a7c222ac8.UGFFnw2CnLs6Niarm%2BsqVoU%2F2h4Y%2BHeTRRJVYMOPXq0","gitpod-user":"loggedIn","user-platform":"6c7f835a-6560-44e1-99d4-973ed6fbdb06"},"httpUpgrade":"websocket","proxyHost":"ws-apiserver","serverPort":"443","proxyPort":"80","severity":200,"requestHeaders":{"pragma":"no-cache","origin":"https:\/\/kerberus-gitpod.ddns.net","sec-websocket-version":"13","host":"kerberus-gitpod.ddns.net","cache-control":"no-cache","accept-language":"it-IT,it;q=0.9,en-US;q=0.8,en;q=0.7,es;q=0.6","user-agent":"Mozilla\/5.0 (Linux; Android 6.0; Nexus 5 Build\/MRA58N) AppleWebKit\/537.36 (KHTML, like Gecko) Chrome\/90.0.4430.93 Mobile Safari\/537.36","sec-websocket-extensions":"permessage-deflate; client_max_window_bits","sec-websocket-key":"Cw8b4OhPqB43Iy3RdKmkBw==","accept-encoding":"gzip, deflate, br","connection":"Upgrade","upgrade":"websocket"},"responseHeaders":{"upgrade":"websocket","access-control-allow-origin":"https:\/\/kerberus-gitpod.ddns.net","sec-websocket-extensions":"permessage-deflate","access-control-expose-headers":"Authorization","sec-websocket-accept":"Ssra3rclUOaDK0KWtPzOfzOazRk=","access-control-allow-credentials":"true","connection":"upgrade","x-gitpod-region":"production.gitpod.local.00"},"context":{"sessionId":"df177a24-48ac-41d1-a0b6-ab9a7c222ac8"}}

corneliusludmann commented 3 years ago

@braghettos: Would you mind opening a thread in our community forum (feel free to link this issue in your post)? Please add additional information:

braghettos commented 3 years ago

Hi @corneliusludmann, I created a new thread: https://community.gitpod.io/t/request-createworkspace-failed-with-message-14-unavailable/3472.