eclipse-theia / theia-cloud-helm

Eclipse Public License 2.0

Instance certificate faulty #71

Closed iyannsch closed 2 months ago

iyannsch commented 3 months ago

Hey folks, I am experiencing a certificate problem with all theia-cloud-session pages/instances. When accessing the instance URL, the certificate is flagged as invalid, and its issuer is shown as "Kubernetes Ingress Controller Fake Certificate".

In the helm charts, I'm using the letsencrypt-prod ClusterIssuer under ingress. The issuer is working correctly across the cluster and was issuing the certs before the update too.

k get clusterissuer
NAME                            READY   AGE
letsencrypt-prod                True    124d

Taking a closer look at the certificates, the problem becomes visible:

k get certificate
NAME                       READY   SECRET                     AGE
landing-page-cert-secret   True    landing-page-cert-secret   6d16h
service-cert-secret        True    service-cert-secret        4d21h
ws-cert-secret             False   ws-cert-secret             4d21h

Diving even deeper into the reasons for that, I found out that the secret used in the certificate is not properly formatted/stored as a kubernetes.io/tls type (containing tls.key and tls.crt) but as Opaque (only containing tls.key).

k get secret
NAME                                TYPE                 DATA   AGE
landing-page-cert-secret            kubernetes.io/tls    2      6d16h
service-cert-secret                 kubernetes.io/tls    2      6d16h
sh.helm.release.v1.theia-cloud.v1   helm.sh/release.v1   1      6d16h
ws-cert-secret-ls9ft                Opaque               1      4d22h
ws-cert-secret-n8xg2                Opaque               1      4d21h
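The check described above (type kubernetes.io/tls with both tls.crt and tls.key present) can be sketched in a few lines of Python. This is only an illustration, assuming the Secret objects have been parsed from something like `kubectl get secret -o json`; the helper name and the sample dictionaries are hypothetical and merely mirror the listing above.

```python
REQUIRED_TLS_KEYS = {"tls.crt", "tls.key"}


def is_complete_tls_secret(secret: dict) -> bool:
    """Return True if the Secret is of type kubernetes.io/tls and holds both TLS keys."""
    data_keys = set((secret.get("data") or {}).keys())
    return secret.get("type") == "kubernetes.io/tls" and REQUIRED_TLS_KEYS <= data_keys


# Hypothetical objects shaped like items from `kubectl get secret -o json`.
issued = {
    "metadata": {"name": "service-cert-secret"},
    "type": "kubernetes.io/tls",
    "data": {"tls.crt": "<base64>", "tls.key": "<base64>"},
}
pending = {
    "metadata": {"name": "ws-cert-secret-ls9ft"},
    "type": "Opaque",
    "data": {"tls.key": "<base64>"},
}

for secret in (issued, pending):
    print(secret["metadata"]["name"], is_complete_tls_secret(secret))
```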

Looking at cert-manager's orders, the entries for the session instances are pending instead of valid. For the requested hostnames instance... and *.webview.instance..., the order reports: "Failed to determine a valid solver configuration for the set of domains on the Order: no configured challenge solvers can be used for this challenge".

I think this is due to the newly added wildcard prefix *.webview, which Let's Encrypt will not validate with an HTTP-01 challenge but only with a DNS-01 challenge, as per their policies. On our cluster, DNS challenges are not configured and cannot be completed automatically.
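To make the reasoning concrete, here is a minimal Python sketch of the rule, assuming only what Let's Encrypt documents: wildcard names can be validated via DNS-01 only, while regular names may also use HTTP-01 or TLS-ALPN-01. It is not cert-manager's actual solver-selection code; the function names and the example.com hostnames are hypothetical.

```python
from typing import Iterable, List, Set


def allowed_challenges(dns_name: str) -> Set[str]:
    """Challenge types Let's Encrypt accepts for a given DNS name."""
    if dns_name.startswith("*."):
        # Wildcard names can only be validated through DNS-01.
        return {"dns-01"}
    return {"http-01", "dns-01", "tls-alpn-01"}


def unsolvable_names(dns_names: Iterable[str], configured: Iterable[str]) -> List[str]:
    """Names for which none of the configured solver types is acceptable."""
    configured_set = set(configured)
    return [name for name in dns_names if not (allowed_challenges(name) & configured_set)]


# Hypothetical hostnames standing in for the elided instance names above.
names = ["instance.example.com", "*.webview.instance.example.com"]

# A cluster that only has an HTTP-01 solver configured, as described above.
print(unsolvable_names(names, ["http-01"]))
```

With only an HTTP-01 solver, the wildcard name has no usable solver, which matches the "no configured challenge solvers can be used" message on the order.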

As far as I can tell, our situation is pretty standard and most servers and clusters cannot complete DNS challenges. Do you agree? Is the problem rooted somewhere else?

lucas-koehler commented 3 months ago

Hi @iyannsch, thanks for the detailed report! The generation of certificates for wildcard subdomains via Let's Encrypt indeed does not work out of the box. The following workarounds can be used:

  1. Disable the *.webview domain via the values. With this, webviews in VS Code extensions cannot work (as in older Theia Cloud versions). Set this in the values:
    hosts:
      allWildcardInstances: []
  2. Add your own certificate issuer that can solve DNS-01 challenges for wildcard subdomains: https://cert-manager.io/docs/configuration/acme/dns01/
  3. Create or obtain the certificates yourself without cert-manager and make them available to Theia Cloud like so: https://theia-cloud.io/documentation/moredocumentation/#custom-certificates

iyannsch commented 3 months ago

Hi @lucas-koehler, thanks for the reply!

The first option is not feasible for our use case, as many libraries for VS Code plugins utilize webviews, so we'll have to find a solution :) At TUM, we unfortunately only have very limited options to get a wildcard certificate, which makes the other two options quite hard to follow. Most larger corporations probably have similar guidelines, right?

Could you elaborate on the reasons for making the webviews accessible via a separate (sub)domain? Wouldn't it also work to expose the view as instance.domain.com/<uuid-of-instance>/wv/<uuid-of-webview>? Maybe @jfaltermeier also has ideas on this topic?

Looking forward to your response 😊

jfaltermeier commented 3 months ago

The {{uuid}}.webview.{{hostname}} is the default pattern of Theia: https://github.com/eclipse-theia/theia/blob/acaa3dfb6733665af5ab6ab9025fdf37ea758668/packages/plugin-ext/src/main/common/webview-protocol.ts#L18 (This also mentions that it can be changed)

Here you can find the PR that implemented this: https://github.com/eclipse-theia/theia/pull/6465 The PR links to https://blog.mattbierner.com/vscode-webview-web-learnings/ which has a lot of details.

As far as I know a pattern like instance.domain.com/<uuid-of-instance>/wv/<uuid-of-webview> would e.g. allow a malicious webview to access cookies of the main IDE and also of other webviews, since they all run on the same origin.
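A small Python sketch of the same-origin argument, assuming the usual browser definition of an origin as the (scheme, host, port) triple. The hostnames and ids are hypothetical placeholders, and the comparison is a simplification: browsers apply additional, separate rules when scoping cookies.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that defines a web origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


# Hypothetical URLs for the main IDE and the two webview URL patterns.
ide = "https://instance.example.com/"
path_webview = "https://instance.example.com/1234/wv/abcd"
subdomain_webview = "https://abcd.webview.instance.example.com/"

# Path-based webview: same origin as the IDE, so no isolation between them.
print(origin(path_webview) == origin(ide))

# Subdomain-based webview: a different host, hence a different origin.
print(origin(subdomain_webview) == origin(ide))
```

Because each webview gets its own uuid-based subdomain, every webview also ends up on an origin distinct from the other webviews, not just from the IDE.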

iyannsch commented 2 months ago

Thanks for your answer! We are working on getting a wildcard cert for this use case and should be fine with that. Your concerns regarding cookie access seem valid to me and make the wildcard solution even more appealing :)