eclipse-che / che

Kubernetes based Cloud Development Environments for Enterprise Teams
http://eclipse.org/che
Eclipse Public License 2.0

Issues with Websockets for Che with Nginx-Ingress Controller. #23049

Open Wosin opened 3 months ago

Wosin commented 3 months ago

Summary

Hey! I am facing an issue with an Eclipse Che deployment on a vanilla k8s cluster, running with Keycloak as the OIDC provider and the nginx-ingress controller. We have managed to configure almost everything correctly; the only remaining issue is the websocket connection.

We are using the below patch to deploy che:

apiVersion: org.eclipse.che/v2
spec:
  devEnvironments:
    serviceAccount: default
    defaultNamespace:
      autoProvision: false
  networking:
    ingressClassName: nginx
    annotations:
      acme.cert-manager.io/http01-edit-in-place: "true"
      cert-manager.io/cluster-issuer: le-wildcard-issuer
      nginx.ingress.kubernetes.io/ssl-redirect: "true"
    auth:
      gateway:
        oAuthProxy:
          cookieExpireSeconds: 300
        deployment:
          containers:
          - name: "oauth-proxy"
            env:
            - name: OAUTH2_PROXY_COOKIE_CSRF_PER_REQUEST
              value: "true"
            - name: OAUTH2_PROXY_PASS_AUTHORIZATION_HEADER
              value: "true"
            - name: OAUTH2_PROXY_WHITELIST_DOMAINS
              value: "[keycloak-url]"
            - name: OAUTH2_PROXY_COOKIE_REFRESH
              value: "200s"
      identityProviderURL: [keycloak-url]
      oAuthClientName: kubernetes-client
      oAuthSecret: XXXXXXXX

As I mentioned, everything generally works, but the websocket connections to /dashboard/api/websocket only work for as long as the initial oauth cookie is valid. After that they fail with "No valid authentication in request. Initiating login." and the dashboard shows the error "WebSocket connections are failing. Refer to "Network Troubleshooting" in the user guide." After a manual refresh everything is back to normal, again for the validity period of the cookie set in the configuration.

Is there any documentation about setting up Che with Nginx to make sure the websocket connections work correctly?

Relevant information

No response

ibuziuk commented 3 months ago

@tolusha ptal

ibuziuk commented 3 months ago

@Wosin hello, could you please clarify whether you followed https://eclipse.dev/che/docs/stable/administration-guide/installing-che-on-the-virtual-kubernetes-cluster/ ? If something is not working as expected, a PR to the docs should be provided to improve the installation SOP on vanilla k8s.

tolusha commented 3 months ago

Setting spec.networking.annotations overrides the default ingress annotations, which are:

"nginx.ingress.kubernetes.io/proxy-read-timeout":    "3600",
"nginx.ingress.kubernetes.io/proxy-connect-timeout": "3600",
"nginx.ingress.kubernetes.io/ssl-redirect":          "true",
"nginx.ingress.kubernetes.io/proxy-buffer-size":     "16k",
"nginx.org/websocket-services":                      "che-gateway",

Could you add them as well?
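
As a rough sketch of what that suggestion could look like, the fragment below combines the two cert-manager annotations from the original patch with the defaults listed above. It only restates values already quoted in this thread; nothing here confirms that it resolves the websocket failures, and the rest of the patch (auth, devEnvironments) is left out for brevity.

```yaml
apiVersion: org.eclipse.che/v2
spec:
  networking:
    ingressClassName: nginx
    annotations:
      # Custom annotations from the original patch
      acme.cert-manager.io/http01-edit-in-place: "true"
      cert-manager.io/cluster-issuer: le-wildcard-issuer
      # Default annotations, restated explicitly because setting
      # spec.networking.annotations overrides them
      nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
      nginx.ingress.kubernetes.io/proxy-connect-timeout: "3600"
      nginx.ingress.kubernetes.io/ssl-redirect: "true"
      nginx.ingress.kubernetes.io/proxy-buffer-size: "16k"
      nginx.org/websocket-services: "che-gateway"
```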

brunnels commented 3 months ago

I was able to resolve this, I think. I needed to set the oauth cookie expiration, and then I had to add a server-snippet to get the websockets working. I'm using Authelia for OIDC.

apiVersion: org.eclipse.che/v2
kind: CheCluster
metadata:
  name: eclipse-che
  namespace: eclipse-che
spec:
  components:
    cheServer:
      extraProperties:
        CHE_OIDC_USERNAME__CLAIM: email
  networking:
    annotations:
      external-dns.alpha.kubernetes.io/target: che.mydomain.dev
      kubernetes.io/ingress.class: internal
      nginx.ingress.kubernetes.io/proxy-buffer-size: 16k
      nginx.ingress.kubernetes.io/proxy-connect-timeout: "3600"
      nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
      nginx.ingress.kubernetes.io/rewrite-target: /
      nginx.ingress.kubernetes.io/secure-backends: "true"
      nginx.ingress.kubernetes.io/ssl-redirect: "true"
      nginx.org/websocket-services: che-gateway
      nginx.ingress.kubernetes.io/server-snippets: |
        location / {
          proxy_set_header Upgrade $http_upgrade;
          proxy_http_version 1.1;
          proxy_set_header X-Forwarded-Host $http_host;
          proxy_set_header X-Forwarded-Proto $scheme;
          proxy_set_header X-Forwarded-For $remote_addr;
          proxy_set_header Host $host;
          proxy_set_header Connection "upgrade";
          proxy_cache_bypass $http_upgrade;
        }
    auth:
      gateway:
        oAuthProxy:
          cookieExpireSeconds: 300 # needs to be shorter than the oidc token lifespan
      identityProviderURL: https://auth.mydomain.dev
      oAuthClientName: oauth2-proxy
      oAuthSecret: SUPERSECRETOIDC
    domain: che.mydomain.dev

tolusha commented 3 months ago

@brunnels Good to know. Are you interested in contributing some documentation [1]? That would be really cool.

[1] https://eclipse.dev/che/docs/stable/administration-guide/installing-che/

brunnels commented 3 months ago

@tolusha I'm working on getting this all installable and working via flux2 kustomize, so that people using k8s can use it as an example. I'll have a README in there explaining how to set up the OIDC definition in Authelia and add the ClusterRole for each Che user. I'll reply here with a link once it's done.

brunnels commented 3 months ago

@tolusha It's almost there, but I'm seeing some inconsistencies between the CheCluster v2 CRD and what's actually happening.

For example, oAuthSecret is supposed to accept either the actual secret value or the name of a secret in the namespace, but I'm not seeing it pull the value from the secret.
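
For reference, here is a minimal sketch of the secret-name variant as I understand the CheCluster CRD field description: the secret lives in the same namespace as the CheCluster, holds the value under an `oAuthSecret` key, and carries the `app.kubernetes.io/part-of: che.eclipse.org` label. The secret name `che-oidc-secret` is hypothetical, and the label and key requirements are worth verifying against the CRD description for the operator version in use; a missing label would be one possible reason for the value not being picked up.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: che-oidc-secret          # hypothetical name, chosen for illustration
  namespace: eclipse-che         # same namespace as the CheCluster
  labels:
    # Assumption: label required by the operator, per the CRD field description
    app.kubernetes.io/part-of: che.eclipse.org
type: Opaque
stringData:
  # Assumption: the operator reads the value from this key
  oAuthSecret: SUPERSECRETOIDC
---
apiVersion: org.eclipse.che/v2
kind: CheCluster
metadata:
  name: eclipse-che
  namespace: eclipse-che
spec:
  networking:
    auth:
      # Refers to the Secret above by name instead of holding the plain value
      oAuthSecret: che-oidc-secret
```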

It's also not clear which ClusterRoles my users need. The docs make it seem like just adding them to the advancedAuthorization settings should work, but this doesn't do anything. I need to add a ClusterRoleBinding to cluster-admin for a user before things start to work, and I'm sure that's not right.

In any case, here's current progress. https://github.com/brunnels/talos-cluster/tree/main/kubernetes/apps/eclipse-che

Is there a Discord or similar where we could discuss more?

brunnels commented 3 months ago

@tolusha it turns out I can't deploy without chectl right now. The che-operator Helm chart doesn't provide everything that's needed, so it's a dead end on vanilla k8s. https://github.com/eclipse-che/che-operator/issues/1655

tolusha commented 3 months ago

@brunnels DWO (the DevWorkspace Operator) is a prerequisite. chectl doesn't do anything fancy, it just applies the resources [1]. So you can follow the same approach: kubectl apply -f https://raw.githubusercontent.com/devfile/devworkspace-operator/main/deploy/deployment/kubernetes/combined.yaml

[1] https://github.com/devfile/devworkspace-operator/blob/main/deploy/deployment/kubernetes/combined.yaml