Duplicate of #90
Unfortunately, you're affected by the same problem as in #90: cross-namespace Route bindings are not correctly implemented in STUNner (this is documented here). This was a simplification that made our life easier for the first couple of releases, but it is time to remove this limitation. We are actively working on fixing this; the progress is tracked here.
Until then, it's best to just put all Gateways and UDPRoutes into the same namespace and everything should work fine.
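A minimal sketch of that workaround (resource names are illustrative; everything lives in the default namespace):

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: Gateway
metadata:
  name: udp-gateway
  namespace: default
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    - name: udp-listener
      port: 3478
      protocol: UDP
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: UDPRoute
metadata:
  name: iperf-server
  namespace: default          # same namespace as the Gateway
spec:
  parentRefs:
    - name: udp-gateway       # namespace omitted: defaults to the route's own namespace
  rules:
    - backendRefs:
        - name: iperf-server  # backend Service, also in the default namespace here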
To actually answer your question: the way you're trying to do it is exactly how it should be done, but until now this has not been possible due to a STUNner limitation that prevented UDPRoutes from attaching to Gateways across a namespace boundary.
This should now be fixed as of e770d05 in the gateway-operator repo. Currently this change is only available on the dev release channel, via the stunner/stunner-gateway-operator-dev chart:
helm install stunner-gateway-operator stunner/stunner-gateway-operator-dev --create-namespace --namespace=stunner-system
The below worked for me perfectly:
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: Gateway
metadata:
  name: udp-gateway
  namespace: default
spec:
  gatewayClassName: stunner-gatewayclass
  listeners:
    - name: udp-listener
      port: 3478
      protocol: UDP
      allowedRoutes:
        namespaces:
          from: All
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: UDPRoute
metadata:
  name: iperf-server
  namespace: team1
spec:
  parentRefs:
    - name: udp-gateway
      namespace: default
  rules:
    - backendRefs:
        - name: iperf-server
          namespace: team1
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: UDPRoute
metadata:
  name: iperf-server
  namespace: team2
spec:
  parentRefs:
    - name: udp-gateway
      namespace: default
  rules:
    - backendRefs:
        - name: iperf-server
          namespace: team2
Note the allowedRoutes config in the Gateway: this makes sure that your Gateway accepts routes from all namespaces.
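If you'd rather not accept routes from every namespace, the Gateway API also lets you restrict this with a namespace label selector; a minimal sketch (the label name below is just an example, not something STUNner defines):

listeners:
  - name: udp-listener
    port: 3478
    protocol: UDP
    allowedRoutes:
      namespaces:
        from: Selector
        selector:
          matchLabels:
            allow-stunner-routes: "true"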
Can you please report back on your findings? Thanks!
I can see I was a bit unclear. The problem is not the namespace issue. Thanks for the fix though, looking forward to 0.16. The problem is: how does it know WHICH traffic to send to which backend? team1 and team2 are running different apps and thus need to work totally independently of each other. I could allocate a different port to each, but since there is no way to auto-allocate a port, I would need to manually keep track of which is which. Also, the authentication is in the GatewayConfig, but why would team1 and team2 not take care of their own credentials?
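(For context, the shared credentials mentioned here live in the GatewayConfig resource; a minimal sketch with placeholder credentials, field names following the STUNner docs of the time:)

apiVersion: stunner.l7mp.io/v1alpha1
kind: GatewayConfig
metadata:
  name: stunner-gatewayconfig
  namespace: stunner-system
spec:
  realm: stunner.l7mp.io    # TURN realm
  authType: plaintext       # static credentials shared by all clients of this gateway class
  userName: user-1          # placeholder
  password: pass-1          # placeholder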
This is the tricky part: the name "UDPRoute" is somewhat misleading, because STUNner never actually routes traffic per se. This is due to the way TURN (and WebRTC media) works: the client must know the IP address of the peer (i.e., the Kubernetes pod, or the iperf-server in your case) beforehand and send it along to STUNner in order to be able to communicate with the peer, and the UDPRoute serves merely as a mechanism for STUNner to check whether a client is trying to reach a pod it is permitted to talk to (like an ACL). Exchanging the client and peer IP addresses is usually handled in an out-of-band ICE conversation between the client and the peer.
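To make the ACL semantics concrete, here is the team1 UDPRoute from above again, this time annotated (a sketch; the comments just restate the behavior described in the previous paragraph):

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: UDPRoute
metadata:
  name: iperf-server
  namespace: team1
spec:
  parentRefs:
    - name: udp-gateway        # clients connecting through this Gateway...
      namespace: default
  rules:
    - backendRefs:
        - name: iperf-server   # ...are only allowed to relay traffic to pods backing this Service;
          namespace: team1     # peer addresses not covered by an attached route are refused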
To answer your question:
I'm not saying this is optimal, but for WebRTC media to actually work, this is how it has to be done. If you're still in trouble, feel free to drop by our Discord, we're happy to help.
Thanks, I totally missed that trick in the TURN/STUN spec.
Just to be absolutely sure I got it:
inside audio traffic from pod X -> Gateway -> stunner service -> stunner pod -> outside (or does pod X send to outside directly?)
and the only reason you need a UDPRoute to begin with is to ensure that whatever IP the outside tells you to send traffic to is acceptable?
That makes me wonder why we need the gateway to begin with then? Wouldn't it be simpler to offload that to traefik, haproxy or nginx or something else that can already do the UDP routing?
Yes, that's a fairly good description of what's going on.
inside audio traffic from pod X -> Gateway -> stunner service -> stunner pod -> outside (or does pod X send to outside directly?)
No, everything goes through STUNner (see below).
and the only reason you need a UDPRoute to begin with is to ensure that whatever IP the outside tells you to send traffic to is acceptable?
Exactly.
That makes me wonder why we need the gateway to begin with then? Wouldn't it be simpler to offload that to traefik, haproxy or nginx or something else that can already do the UDP routing?
The problem is NAT traversal: when the media server lives in a Kubernetes pod, it is hosted on a private IP, and there are several layers of network address translation between it and the client, which stops them from communicating over WebRTC (WebRTC media servers identify users/media by the source IP/port: if the source IP/port changes, the audio/video traffic is dropped). You need an ingress gateway that runs TURN on top of UDP (TURN is a glorified tunneling protocol that lets the connection survive any number of NAT layers) and can inject your media into the private Kubernetes container network without breaking the WebRTC media connections, and currently only STUNner provides this functionality. (To be absolutely fair any TURN server will do, but STUNner is designed specifically for this purpose.)
Is this nice? No. Is this how it must be done today with WebRTC? Absolutely. I too dream of a world where we don't need all this complexity any more, but we have to wait until media over QUIC/WebTransport/HTTP3 becomes a thing. Until then, however, you'll need a TURN gateway to run your media servers in Kubernetes, and STUNner provides you just that.
(To be absolutely fair any TURN server will do, but STUNner is designed specifically for this purpose.)
As I am reading this and out of curiosity: Is it possible to replace the pion TURN server in the data plane with another TURN server implementation?
As far as I understood, the controller creates the relevant config parameters, which the data plane TURN server adopts, no? So in theory, if another implementation could translate it to its configuration, it could work? Or is there any other magic I have not considered?
Thanks :+1:
Is it possible to replace the pion TURN server in the data plane with another TURN server implementation?
At the moment our Kubernetes Gateway API operator emits STUNner-specific config files, so if someone goes out and writes a STUNner config file parser for that "another TURN server" then the answer is yes. (Or you can rewrite the operator to emit different config file formats as well.) There is one trick though: STUNner knows how to reconcile config file updates incrementally, that is, without restarting the TURN server and dropping active connections. If that "other TURN server" lacks this capability then it will hardly be usable in Kubernetes, where the config file has to be updated frequently: clearly, you don't want to drop active clients when, say, you add a new backend pod or update the load-balancer settings...
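To give a rough idea of what such a config looks like, here is a heavily simplified sketch of a dataplane config for the iperf example (field names are approximate and only meant to illustrate the structure; see the STUNner docs for the actual schema):

version: v1alpha1
admin:
  name: stunnerd
  loglevel: all:INFO
auth:
  type: plaintext
  credentials:
    username: user-1          # placeholder
    password: pass-1          # placeholder
listeners:
  - name: default/udp-gateway/udp-listener
    protocol: udp
    port: 3478
    routes:                   # the UDPRoutes attached to this listener
      - team1/iperf-server
      - team2/iperf-server
clusters:
  - name: team1/iperf-server  # rendered from the UDPRoute's backendRefs
    type: STRICT_DNS
    endpoints:
      - iperf-server.team1.svc.cluster.local
  - name: team2/iperf-server
    type: STRICT_DNS
    endpoints:
      - iperf-server.team2.svc.cluster.local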
At the moment our Kubernetes Gateway API operator emits STUNner-specific config files, so if someone goes out and writes a STUNner config file parser for that "another TURN server" then the answer is yes. (Or you can rewrite the operator to emit different config file formats as well.)
Yeah, that is what I meant with "if another implementation could translate it to its configuration".
There is one trick though: STUNner knows how to reconcile config file updates incrementally, that is, without restarting the TURN server and dropping active connections. If that "other TURN server" lacks this capability then it will hardly be usable in Kubernetes, where the config file has to be updated frequently: clearly, you don't want to drop active clients when, say, you add a new backend pod or update the load-balancer settings...
100% agree. We currently maintain a TURN server called eturnal, which supports config reloads without interrupting services. Actually, we are working on supporting hot release upgrades as well. Maybe this is also interesting for your concept: testing with one pod first before updating the entire helm release, etc.
Sorry for hijacking this thread now, but would you consider including eturnal in your operator as well?
Thanks and have a great day!
Closing this for now. Feel free to reopen if anything new comes up.
I have tried searching the documentation, but it seems I cannot find information about how to have multiple distinct backends on the same frontend.
i.e.
and then n backend services:
and
I don't really see anything in the Gateway spec that would allow for this, and no attributes in STUNner that would identify whether traffic is sent to team1's or team2's iperf-server.