typokign / matrix-chart

Helm chart for deploying a Matrix homeserver stack
MIT License
88 stars 47 forks source link

Federation Port 8448 is not encrypted with TLS #27

Closed Routhinator closed 4 years ago

Routhinator commented 4 years ago

This might be a misunderstanding on my part so I'll run down my thoughts here:

typokign commented 4 years ago

Good questions. Yes, you will unfortunately need to put the federation service behind a TLS proxy.

Since this is a non-standard port, it cannot be behind Ingress-nginx

Not just the NGINX Ingress Controller, but the k8s spec for Ingresses doesn't allow you to customize the port at all :frowning_face:

Can I tell the federation servers to use 443 somehow perhaps?

AFAIK, no, you can customize the federation port but the C-S and S-S APIs must be on different ports. Since k8s only allows Ingresses on ports 80/443, changing the port doesn't really help.

If it cannot be proxied behind a TLS proxy, it needs to encrypt it's own connection with TLS certs as it needs to be connected to a loadbalancer directly.

Definitely something I would like to consider, but haven't implemented yet for the same reasons as Coturn (#5)

Currently this cannot be federated on a standard kube deployment as servers expect to talk TLS to the 8448 port.

"Standard" and "Kubernetes" are pretty antithetical :stuck_out_tongue: If you're running in AWS or GCP, you can configure the associated load balancers to terminate TLS traffic. Home-rolled kubeadm deployment on a raspberry pi in your basement? Yeah, getting this to federate will take some more work on your part.

Again, I'm hoping to put a lot more time into this chart in the next couple of months, and in-cluster TLS for federation and Coturn are pretty high up on my priority list. PRs are always welcome too! :)

Routhinator commented 4 years ago

I could move TLS to the loadbalancer on DO but not sure I want to if it will break my cluster's automatic domain provisioning and I have to start adding dns records for new deployments... Other option would be to pay for another loadbalancer but that would put my bill from tolerable to uncomfortably high.

I guess the tertiary option here would be an nginx container with cert manager certs that proxies the 8448 port.

Routhinator commented 4 years ago

Thinking on this more overnight I'm going to work on a POC that uses a custom NGINX container and cert-manager to proxy both Coturn and the Federation port.

I really don't want to move TLS to my load balancer as that leaves traffic from the load balancer to the kube nodes in the clear.

Routhinator commented 4 years ago

I have an MR coming for this which uses the existing ingress TLS cert for the federation TLS. It's tested and working. I'll implement the same change for coturn once merged.

typokign commented 4 years ago

o_O You can do that automatically? I'll have to test for compatibility but that sounds like a perfect solution. Thanks a bunch!

Routhinator commented 4 years ago

... So was reading through this doc https://www.natrius.eu/dokuwiki/doku.php?id=digital:server:matrixsynapse

And it struck me.. The federation needs to be on a seperate port, however that can still actually be 443 - if its on a seperate domain.

So

matrix.example.com: 443 --> 8008 client/etc synapse.example.com: 443 --> 8448 federation

with the appropriate SRV records and .well-known setup this will work

typokign commented 4 years ago

That's a great idea, thanks! Just asked around and seems like that's perfectly valid, both spec-wise and there shouldn't be any compatibility issues with Riot or other federating homeservers.

I'll get started on implementing this right away, looking forward to shutting down an unnecessary load balancer :)

typokign commented 4 years ago

Fixed in https://github.com/dacruz21/matrix-chart/commit/8ff92ab3f8b2c49b7f29ebe6338edefc9c6690ae

Routhinator commented 4 years ago

@dacruz21 Looks like there is a fatal flaw in the ingress implementation - federation connections resolve the IP of the load balancer and connect to https://x.x.x.x/_matrix

This doesn't work because ingress-nginx 404s on those requests.. whereas it was working for my implementaion on the seperate port.

I think we'll need one more ingress that has a wildcard host, and specifies the /_matrix/ path to catch these.

It should also listen on /.well-known/matrix/ to support the well-known stanzas as these come into routh.io/.well-known/matrix/** instead of matrix.routh.io or synapse.routh.io

typokign commented 4 years ago

https://matrix.org/docs/spec/server_server/r0.1.4#resolving-server-names

If is not an IP literal and no is present, an SRV record is looked up for _matrix._tcp.. This may result in another hostname (to be resolved using AAAA or A records) and port. Requests should be made to the resolved IP address and port with a Host header containing the . The target server must present a valid certificate for .

Routhinator commented 4 years ago

I have that DNS record present. Federation still operates on the IP.

https://federationtester.matrix.org/#routh.io

You can see it gets the proper domain, but it gets a 404 because it actually makes the request to the https://104.248.105.145/_matrix/key/v2/server. And ingress-nginx cannot answer that without an ingress listing for that IP as a server directive.

When I go to the url https://synapse.routh.io/_matrix/key/v2/server which is what it should go to it works.

My federation is also broken after applying this change.

typokign commented 4 years ago

The federation tester frontend kinda sucks, I usually look at https://federationtester.matrix.org/api/report?server_name=routh.io

From there you can see the real issue, "Get https://routh.io/.well-known/matrix/server: x509: certificate is valid for ingress.local, not routh.io"

I would try to fix that issue rather than use SRV records. It's been a while since I've asked but last I heard SRV records had a number of issues and it was highly recommended to just use .well-known for federation resolution.

Routhinator commented 4 years ago

So that still leaves me with needing an ingress that listens on routh.io/.well-known/matrix/server while my primary ingress lives on matrix.routh.io or synapse.routh.io

Theres a way to listing on all hosts with a snippet which may be desirable for the .well-known stuff..

https://github.com/nginxinc/kubernetes-ingress/issues/209#issuecomment-581691384 - however it would need to live on a seperate ingress definition as you would not want to bind the / path with that.

Alternatively there could be an ingress that listens on very specific paths on the root domain.

typokign commented 4 years ago

Yeah, it was a conscious decision to not include a .well-known/matrix/server handler in the chart, mainly because I assume most people already have a static site at the root of their domain which may or may not be hosted from the same cluster. Seems totally overkill to me to host an entire nginx pod just to serve a little JSON object at one URL.

Routhinator commented 4 years ago

I agree there. But don't need a pod.. just a nginx.ingress.kubernetes.io/configuration-snippet

Something along this line

annotations:
  nginx.ingress.kubernetes.io/configuration-snippet: |
    location /.well-known/matrix/server {
      return 200 '{"m.server": "example.com:443"}';
      add_header Content-Type application/json;
    }
    location /.well-known/matrix/client {
      return 200 '{"m.homeserver": {"base_url": "https://example.com"},"m.identity_server": {"base_url": "https://vector.im"}}';
      add_header Content-Type application/json;
      add_header "Access-Control-Allow-Origin" *;
    }
Routhinator commented 4 years ago

Sorry - would need to be a server-snippet:

https://kubernetes.github.io/ingress-nginx/user-guide/nginx-configuration/annotations/#server-snippet

typokign commented 4 years ago

Cool, didn't know the nginx ingress provider actually supported that. But still, that's not a standard k8s API, so I'm super hesitant to add anything that makes this chart less portable, and needs to be disabled if you're using a different ingress provider, and requires you to update your DNS if you were hosting your root domain outside of the cluster.

Looks like you already have a static site at your root domain, why not just add the .well-known to your files there? That's what I do: https://github.com/dacruz21/website/blob/master/.well-known/matrix/server

Or if you want to keep this in an Ingress, and you're using an umbrella chart (which I recommend over cloning this repo), you can add a barebones Ingress to your templates/ dir and just put the snippet there. It won't deploy any extra proxies or cost you anything.

Routhinator commented 4 years ago

I get where you're coming from. I don't actually have a static site, just a default-backend container with 404 docs.. I could shove this in there however I would prefer to have something like this (which is very app specific) configured with the app rather than in a main root site statically.

So I took the template suggestion. Maybe you might consider option features that do not lock a user into ingress-nginx but simply support it if its being used since it's the most widely used ingress controller.

This was my solution.. (not sure what to bind the indentity server to yet though)

---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: {{ include "matrix.fullname" . }}-wellknown
  {{- with .Values.matrix.ingress.annotations }}
  annotations:
    {{- toYaml . | nindent 4 }}
  {{- end }}
    nginx.ingress.kubernetes.io/configuration-snippet: |
      location /.well-known/matrix/server {
        return 200 '{"m.server": "{{ .Values.matrix.ingress.hosts.federation }}:443"}';
        add_header Content-Type application/json;
      }
      location /.well-known/matrix/client {
        return 200 '{"m.homeserver": {"base_url": "https://{{ .Values.matrix.ingress.hosts.synapse }}"},"m.identity_server": {"base_url": "https://vector.im"}}';
        add_header Content-Type application/json;
        add_header "Access-Control-Allow-Origin" *;
      }
spec:
{{- if .Values.matrix.ingress.tls }}
  tls:
    {{- range .Values.matrix.ingress.tls }}
    - hosts:
      {{- range .hosts }}
      - {{ . | quote }}
      {{- end }}
      secretName: {{ .secretName }}
    {{- end }}
{{- end }}
  rules:
    - host: {{ .Values.matrix.ingress.hosts.root }}
      http:
        paths:
          - path: "/.well-known/matrix/"
            backend:
              serviceName: "{{ include "matrix.fullname" . }}-synapse"
              servicePort: {{ .Values.matrix.synapse.service.port }}
typokign commented 4 years ago

Your ingress definition looks good. Yeah, I'll definitely consider it, but it's low on my priority list and I would want to leave it disabled by default (so that it's not a breaking change for anyone running this chart without the nginx provider). I just fear it will get lost in the sea of values, so I'm going to wait until I get around to writing some docs for this thing so it doesn't get forgotten.

Identity server doesn't really matter all that much. It just allows you to attach an email address to your matrix username so that others can look you up by email. You could remove it entirely or leave it as vector.im (vector.im's identity server is public).

I'll be adding the ma1sd identity server to this chart eventually, but it will probably be disabled by default. (#18)

Looks like you're passing the federation tester :tada: so I'll close this issue now. Let me know if you have any more questions.