jitsi-contrib / jitsi-helm

A Helm chart to deploy Jitsi to Kubernetes
MIT License

Possible Breakage in Jitsi Meet stable-7830 #59

Closed. spijet closed this issue 1 year ago.

spijet commented 2 years ago

The new Jitsi Meet Docker images (tagged as stable-7830) introduce a breaking change in the JVB config and its template: the JVB image now expects a comma-separated list of IP addresses to advertise in the new `JVB_ADVERTISE_IPS` environment variable. Before that, we could specify a single IP address in the `DOCKER_HOST_ADDRESS` variable.
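
For illustration, the change looks roughly like this (the addresses are made up):

```bash
# Before stable-7830: a single IP address.
DOCKER_HOST_ADDRESS=203.0.113.10
# stable-7830 and later: a comma-separated list of IPs to advertise.
JVB_ADVERTISE_IPS=203.0.113.10,10.0.0.5
```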

I'm going to update the JVB templates and add a way to specify this new variable, test it out locally and make a PR. I'll also leave the old variable intact for the time being, in case someone still uses older versions.

spijet commented 2 years ago

It seems that there's another issue caused by the way nginx now works with DNS:

```
23:10:58.512037 lxc6932168cbe00 In  IP (tos 0x0, ttl 64, id 46467, offset 0, flags [DF], proto UDP (17), length 74)
    10.42.0.89.53511 > 10.42.0.182.53: 47745+ A? jitsi-meet-prosody.jitsi.svc. (46)
23:10:58.525130 lxcef9865523545 In  IP (tos 0x0, ttl 64, id 64787, offset 0, flags [DF], proto UDP (17), length 149)
    10.42.0.182.53 > 10.42.0.89.53511: 47745 NXDomain* 0/1/0 (121)
```

As you can see, it tries to resolve `jitsi-meet-prosody.jitsi.svc.` with a trailing dot, which in the DNS world marks the name as absolute: the resolver looks up `jitsi-meet-prosody.jitsi` under the `svc` TLD instead of appending the search domains from `/etc/resolv.conf`. This isn't right; there should be no trailing dot when resolving short Kubernetes service names. I'll have to poke around some more.
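
To illustrate the difference (a sketch assuming a typical pod `/etc/resolv.conf` with `search jitsi.svc.cluster.local svc.cluster.local cluster.local`):

```bash
# A trailing dot marks the name as absolute, so search domains never apply:
dig +search jitsi-meet-prosody.jitsi.svc.   # NXDOMAIN: "svc" is treated as a TLD
# Without the dot, the resolver appends the search domains from resolv.conf:
dig +search jitsi-meet-prosody.jitsi.svc    # expands to ...svc.cluster.local and resolves
```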

spijet commented 2 years ago

I think I managed to work around this issue by adding an optional Helm value, `.Values.global.clusterDomain`, which allows us to specify the Kubernetes DNS domain used in the cluster. With this value set, all service references in ConfigMaps and Secrets will use the FQDNs of Kubernetes services (e.g. `<service>.<namespace>.svc.cluster.local`) instead of the short `<service>.<namespace>.svc` form. This behaviour fits nginx's assumption that all hostnames supplied to it are FQDNs.
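
A minimal sketch of how that would look in `values.yaml` (value name as described above; `cluster.local` is the usual default domain):

```yaml
global:
  # The cluster's DNS domain; with this set, service references become
  # <service>.<namespace>.svc.cluster.local instead of the short form.
  clusterDomain: cluster.local
```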

Still gotta test it properly though. Will probably make a PR when the new work week starts.

devium commented 2 years ago

Not sure if this is the same issue, but I ran into something similar when upgrading from 7277-2 to 7882.

Apparently this change introduces variables into the NGINX config's proxy_pass directives. As described here, using variables in proxy_pass requires you to manually specify a resolver.
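
A minimal sketch of what that means in nginx terms (hostnames are the ones from this thread; the location, variable name, and port are illustrative, not the actual meet.conf contents):

```nginx
# With a literal hostname, nginx resolves it once, when the config is loaded:
#   proxy_pass http://jitsi-meet-prosody.jitsi.svc.cluster.local:5280;
# With a variable, nginx resolves the name at request time, which requires
# an explicit resolver directive:
resolver rke2-coredns-rke2-coredns.kube-system.svc.cluster.local valid=30s;
set $prosody jitsi-meet-prosody.jitsi.svc.cluster.local;
location /http-bind {
    proxy_pass http://$prosody:5280;
}
```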

So what I did was add the following line to meet.conf using an additional init script (see below): `resolver rke2-coredns-rke2-coredns.kube-system.svc.cluster.local valid=30s;` (This is for RKE2 using CoreDNS; if you're using kube-dns, use `kube-dns.kube-system.svc.cluster.local` as the resolver instead.)

The init script mounted into `/etc/cont-init.d/20-config-manual`:

```bash
#!/usr/bin/env bash
cat <<CONF >> /config/nginx/meet.conf
resolver rke2-coredns-rke2-coredns.kube-system.svc.cluster.local valid=30s;
CONF
```

(That's also how I modify config.js values that aren't exposed via the Docker image.)
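
For anyone wondering how to get such a script into the container: one generic way (resource names here are hypothetical, not chart values) is a ConfigMap mounted into the web container at `/etc/cont-init.d/20-config-manual` via `subPath`, with `defaultMode: 0755` so the init system can execute it:

```yaml
# Hypothetical ConfigMap holding the extra init script; mount it with
# subPath: 20-config-manual and defaultMode: 0755 in the web pod spec.
apiVersion: v1
kind: ConfigMap
metadata:
  name: jitsi-web-extra-init
data:
  20-config-manual: |
    #!/usr/bin/env bash
    cat <<CONF >> /config/nginx/meet.conf
    resolver rke2-coredns-rke2-coredns.kube-system.svc.cluster.local valid=30s;
    CONF
```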

Also, since the resolver directive ignores your `resolv.conf`, you need to specify the service domains all the way to the root. This can be done by modifying the `XMPP_BOSH_URL_BASE` environment variable, which in turn is set by `prosody.server` in the Helm chart:

```yaml
prosody:
  server: <release_name>-prosody.<release_name>.svc.cluster.local
```
spijet commented 2 years ago

Aaand it's out: #60. :)

Feel free to clone the PR's source branch and test it out if you like. @sapkra, please review the PR when you have time. I'll now go and check if this branch still works with the newer stable-7882 version. :)

saghul commented 2 years ago

> @sapkra, please review the PR when you have time.

I don't think Paul is actively using Jitsi these days.

> So what I did was add the following line to meet.conf using an additional init script (see below): `resolver rke2-coredns-rke2-coredns.kube-system.svc.cluster.local valid=30s;`

I think making the resolver configurable with env vars would be a good thing; a PR for that in the main repo would be welcome.

sapkra commented 2 years ago

@saghul Yeah, you're right, I'm not using Jitsi anymore. I've already been trying to find another trustworthy maintainer, but they're quite hard to find.

saghul commented 2 years ago

Maybe it would be good to put a note in the README so people can open an issue on meta and we can figure out the succession?

spijet commented 2 years ago

I'm OK with sending in PRs and hopping in to help with open issues every now and then, but I can't guarantee enough availability to be a useful maintainer, unfortunately. I have a lot of work to do nowadays, but thankfully running Jitsi is part of said work. :)

> I think making the resolver configurable with env vars would be a good thing; a PR for that in the main repo would be welcome.

@saghul, it's already configurable, AFAICT. The problem is that we need to: a) set it up manually when running the `jitsi/web` image in Kubernetes; b) provide FQDNs for services (so we need to know the cluster's DNS domain beforehand). Both points are kiiinda covered already, since the current master has `.Values.resolverIP` (which defaults to the usual 10.43.0.10 found in many k8s installations) and #60 adds `.Values.global.clusterDomain` (which defaults to `cluster.local`).
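
Put together, a hypothetical `values.yaml` using both knobs (defaults as mentioned above):

```yaml
# The kube-dns/CoreDNS Service IP in your cluster:
resolverIP: 10.43.0.10
global:
  # Used to expand service references into FQDNs:
  clusterDomain: cluster.local
```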

The other option would be to somehow "teach" nginx to honour the domain/search options from `/etc/resolv.conf`, but nginx's docs explicitly mention that using the system resolver is undesirable, because it operates in blocking mode, which can hurt nginx's performance.

saghul commented 2 years ago

> @saghul, it's already configurable,

Oh, looks like we added it then :-)

spijet commented 2 years ago

But it'd help to extract the nameserver from /etc/resolv.conf during cont-init, just in case. :)

saghul commented 2 years ago

I'm not a fan of implicit behavior; having an env var which sets it will make it more explicit.

Or are there some magical k8s-internal FQDNs which will make it work out of the box?

spijet commented 2 years ago

There is a magical service FQDN that resolves to the kube-dns or CoreDNS service IP in many cases, but it might be easier to just read the same service IP from `/etc/resolv.conf`, since the kubelet puts it into every pod/container it spawns.

As for explicit vs implicit — we might use an additional env var (e.g. AUTO_DETECT_RESOLVER or something like that) to enable this behaviour and default it to false. :)

saghul commented 2 years ago

> There is a magical service FQDN that resolves to the kube-dns or CoreDNS service IP in many cases, but it might be easier to just read the same service IP from `/etc/resolv.conf`, since the kubelet puts it into every pod/container it spawns.
>
> As for explicit vs implicit — we might use an additional env var (e.g. AUTO_DETECT_RESOLVER or something like that) to enable this behaviour and default it to false. :)

That sounds like a great suggestion!

Note that our current default is Docker's standard resolver, so maybe we should do it automatically after all, when no resolver has been specified.

spijet commented 2 years ago

I'd say that (if no options are provided by the user) it's reasonable to read the resolver IP from /etc/resolv.conf (it'll have Docker's standard resolver if the image is running on Docker) and fall back to 127.0.0.11 if that was unsuccessful. :)
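
A minimal sketch of that fallback logic for a cont-init script (variable name is made up):

```bash
# Read the first nameserver from /etc/resolv.conf: on plain Docker this is
# the embedded DNS server, on Kubernetes the kube-dns/CoreDNS Service IP.
RESOLVER="$(awk '/^nameserver/ { print $2; exit }' /etc/resolv.conf)"
# Fall back to Docker's embedded resolver if nothing was found:
RESOLVER="${RESOLVER:-127.0.0.11}"
echo "resolver ${RESOLVER} valid=30s;" >> /config/nginx/meet.conf
```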

spijet commented 1 year ago

Since #60 is now merged to main, it's safe to close this issue.

Plans for next release:

Thank you to everyone who took part in this! ❤️

spijet commented 1 year ago

Hello again @saghul!

Thank you for tagging a new version of the Jitsi Meet images! I tested it in my setup and it works flawlessly. As promised, v1.3.0 is now up. :)

saghul commented 1 year ago

Excellent!