l7mp / stunner

A Kubernetes media gateway for WebRTC. Contact: info@l7mp.io
https://l7mp.io
MIT License
751 stars 58 forks source link

Turncat example not working on EKS #145

Closed drawnwren closed 5 months ago

drawnwren commented 5 months ago

Description

Possibly a clone of #133, I'm not sure.

Steps to Reproduce

Follow the turncat example with an eks cluster. Expected behavior: Turncat works as expected.

Actual behavior: First turncat invocation:

anubis :: ~ » turncat - k8s://xoul/udp-gateway:udp-listener udp://${PEER_IP}:9001 -v
14:24:05.091536 main.go:81: turncat-cli DEBUG: Reading STUNner config from URI "k8s://xoul/udp-gateway:udp-listener"
14:24:05.091599 main.go:163: turncat-cli DEBUG: Searching for CDS server
14:24:05.091605 k8s_client.go:154: cds-fwd DEBUG: Obtaining kubeconfig
14:24:05.093612 k8s_client.go:161: cds-fwd DEBUG: Creating a Kubernetes client
14:24:05.099574 k8s_client.go:196: cds-fwd DEBUG: Querying CDS server pods in namespace "<all>" using label-selector "stunner.l7mp.io/config-discovery-service=enabled"
14:24:05.640665 k8s_client.go:367: cds-fwd DEBUG: Found pod: xoul/stunner-gateway-operator-controller-manager-5d6c5cd8f8-sdwx2
14:24:05.640694 k8s_client.go:376: cds-fwd DEBUG: Creating a SPDY stream to API server using URL "https://CF330C06A766A5450BA821D9355D14DF.sk1.us-east-2.eks.amazonaws.com/api/v1/namespaces/xoul/pods/stunner-gateway-operator-controller-manager-5d6c5cd8f8-sdwx2/portforward"
14:24:05.640753 k8s_client.go:384: cds-fwd DEBUG: Creating a port-forwarder to pod
14:24:05.640764 k8s_client.go:400: cds-fwd DEBUG: Waiting for port-forwarder...
14:24:05.804371 k8s_client.go:419: cds-fwd DEBUG: Port-forwarder connected to pod xoul/stunner-gateway-operator-controller-manager-5d6c5cd8f8-sdwx2 at 127.0.0.1:44961
14:24:05.804513 cds_api.go:215: cds-client DEBUG: GET: loading config for gateway xoul/udp-gateway from CDS server 127.0.0.1:44961
14:24:05.918254 main.go:88: turncat-cli DEBUG: Generating STUNner authentication client
14:24:05.918299 main.go:95: turncat-cli DEBUG: Generating STUNner URI
14:24:05.918312 main.go:102: turncat-cli DEBUG: Starting turncat with STUNner URI: turn://k8s-xoul-udpgatew-77492a577e-37ff1616c9d9d844.elb.us-east-2.amazonaws.com:3478?transport=udp
14:24:05.951221 turncat.go:186: turncat INFO: Turncat client listening on file://stdin, TURN server: turn://k8s-xoul-udpgatew-77492a577e-37ff1616c9d9d844.elb.us-east-2.amazonaws.com:3478?transport=udp, peer: udp::9001
14:24:05.951249 main.go:118: turncat-cli DEBUG: Entering main loop
14:24:05.951303 turncat.go:227: turncat DEBUG: new connection from client /dev/stdin
14:24:05.951601 client.go:110: turnc DEBUG: Resolved STUN server 3.18.173.1:3478 to 3.18.173.1:3478
14:24:05.951966 client.go:119: turnc DEBUG: Resolved TURN server 3.18.173.1:3478 to 3.18.173.1:3478
14:24:06.027248 udp_conn.go:66: turnc DEBUG: Initial lifetime: 600 seconds
14:24:06.027338 udp_conn.go:81: turnc DEBUG: Started refresh allocation timer
14:24:06.027353 udp_conn.go:84: turnc DEBUG: Started refresh permission timer
14:24:06.027372 turncat.go:322: turncat INFO: new connection: client-address=/dev/stdin, relayed-address=10.0.139.119:46719
hello
14:24:09.140609 turncat.go:415: turncat DEBUG: cannot write to TURN relay connection for client /dev/stdin (likely harmless): invalid length of IP value
14:24:09.140632 turncat.go:336: turncat DEBUG: deleteConnection: cannot find client connection for file:/dev/stdin

And subsequent turncat invocations:

anubis :: ~ » turncat - k8s://xoul/udp-gateway:udp-listener udp://${PEER_IP}:9001 -v
14:24:30.184259 main.go:81: turncat-cli DEBUG: Reading STUNner config from URI "k8s://xoul/udp-gateway:udp-listener"
14:24:30.184398 main.go:163: turncat-cli DEBUG: Searching for CDS server
14:24:30.184411 k8s_client.go:154: cds-fwd DEBUG: Obtaining kubeconfig
14:24:30.185943 k8s_client.go:161: cds-fwd DEBUG: Creating a Kubernetes client
14:24:30.186357 k8s_client.go:196: cds-fwd DEBUG: Querying CDS server pods in namespace "<all>" using label-selector "stunner.l7mp.io/config-discovery-service=enabled"
14:24:30.760537 k8s_client.go:367: cds-fwd DEBUG: Found pod: xoul/stunner-gateway-operator-controller-manager-5d6c5cd8f8-sdwx2
14:24:30.760566 k8s_client.go:376: cds-fwd DEBUG: Creating a SPDY stream to API server using URL "https://CF330C06A766A5450BA821D9355D14DF.sk1.us-east-2.eks.amazonaws.com/api/v1/namespaces/xoul/pods/stunner-gateway-operator-controller-manager-5d6c5cd8f8-sdwx2/portforward"
14:24:30.760637 k8s_client.go:384: cds-fwd DEBUG: Creating a port-forwarder to pod
14:24:30.760649 k8s_client.go:400: cds-fwd DEBUG: Waiting for port-forwarder...
14:24:30.909747 k8s_client.go:419: cds-fwd DEBUG: Port-forwarder connected to pod xoul/stunner-gateway-operator-controller-manager-5d6c5cd8f8-sdwx2 at 127.0.0.1:46299
14:24:30.909831 cds_api.go:215: cds-client DEBUG: GET: loading config for gateway xoul/udp-gateway from CDS server 127.0.0.1:46299
14:24:31.021982 main.go:88: turncat-cli DEBUG: Generating STUNner authentication client
14:24:31.022023 main.go:95: turncat-cli DEBUG: Generating STUNner URI
14:24:31.022039 main.go:102: turncat-cli DEBUG: Starting turncat with STUNner URI: turn://k8s-xoul-udpgatew-77492a577e-37ff1616c9d9d844.elb.us-east-2.amazonaws.com:3478?transport=udp
14:24:31.026166 turncat.go:186: turncat INFO: Turncat client listening on file://stdin, TURN server: turn://k8s-xoul-udpgatew-77492a577e-37ff1616c9d9d844.elb.us-east-2.amazonaws.com:3478?transport=udp, peer: udp::9001
14:24:31.026184 main.go:118: turncat-cli DEBUG: Entering main loop
14:24:31.026229 turncat.go:227: turncat DEBUG: new connection from client /dev/stdin
14:24:31.026977 client.go:110: turnc DEBUG: Resolved STUN server 13.58.72.189:3478 to 13.58.72.189:3478
14:24:31.027007 client.go:119: turnc DEBUG: Resolved TURN server 13.58.72.189:3478 to 13.58.72.189:3478
14:24:38.834366 turncat.go:570: turncat WARNING: relay setup failed for client /dev/stdin: could not allocate new TURN relay transport for client file:/dev/stdin: all retransmissions failed for Yy8GqOLQqJk6g0Nb

Gateway API status

apiVersion: v1
items:
- apiVersion: gateway.networking.k8s.io/v1
  kind: Gateway
  metadata:
    annotations:
      meta.helm.sh/release-name: xoul-call
      meta.helm.sh/release-namespace: xoul
      service.beta.kubernetes.io/aws-loadbalancer-healthcheck-path: /live
      service.beta.kubernetes.io/aws-loadbalancer-healthcheck-port: "8086"
      service.beta.kubernetes.io/aws-loadbalancer-healthcheck-protocol: http
      stunner.l7mp.io/enable-mixed-protocol-lb: "true"
    creationTimestamp: "2024-05-26T16:34:27Z"
    generation: 1
    labels:
      app.kubernetes.io/managed-by: Helm
      helm.toolkit.fluxcd.io/name: call
      helm.toolkit.fluxcd.io/namespace: xoul
    name: udp-gateway
    namespace: xoul
    resourceVersion: "29553176"
    uid: 1d9eb8b1-58c5-4206-bbf2-b987500f5076
  spec:
    gatewayClassName: stunner-gatewayclass
    listeners:
    - allowedRoutes:
        namespaces:
          from: Same
      name: udp-listener
      port: 3478
      protocol: TURN-UDP
  status:
    addresses:
    - type: Hostname
      value: k8s-xoul-udpgatew-77492a577e-37ff1616c9d9d844.elb.us-east-2.amazonaws.com
    conditions:
    - lastTransitionTime: "2024-05-26T16:48:00Z"
      message: gateway accepted by controller stunner.l7mp.io/gateway-operator
      observedGeneration: 1
      reason: Accepted
      status: "True"
      type: Accepted
    - lastTransitionTime: "2024-05-28T18:33:16Z"
      message: dataplane configuration successfully rendered
      observedGeneration: 1
      reason: Programmed
      status: "True"
      type: Programmed
    listeners:
    - attachedRoutes: 1
      conditions:
      - lastTransitionTime: "2024-05-28T18:33:16Z"
        message: listener accepted
        observedGeneration: 1
        reason: Accepted
        status: "True"
        type: Accepted
      - lastTransitionTime: "2024-05-28T18:33:16Z"
        message: listener protocol-port available
        observedGeneration: 1
        reason: NoConflicts
        status: "False"
        type: Conflicted
      - lastTransitionTime: "2024-05-28T18:33:16Z"
        message: listener object references sucessfully resolved
        observedGeneration: 1
        reason: ResolvedRefs
        status: "True"
        type: ResolvedRefs
      name: udp-listener
      supportedKinds:
      - group: gateway.networking.k8s.io
        kind: UDPRoute
      - group: stunner.l7mp.io
        kind: UDPRoute
- apiVersion: stunner.l7mp.io/v1
  kind: GatewayConfig
  metadata:
    annotations:
      meta.helm.sh/release-name: xoul-call
      meta.helm.sh/release-namespace: xoul
    creationTimestamp: "2024-05-26T16:48:00Z"
    generation: 3
    labels:
      app.kubernetes.io/managed-by: Helm
      helm.toolkit.fluxcd.io/name: call
      helm.toolkit.fluxcd.io/namespace: xoul
    name: stunner-gatewayconfig
    namespace: xoul
    resourceVersion: "29550424"
    uid: a8202847-9c64-4f13-a1e0-334cdcdf5207
  spec:
    authRef:
      group: ""
      kind: Secret
      name: stunner-auth-secret
      namespace: xoul
    authType: plaintext
    dataplane: default
    logLevel: all:TRACE
    realm: stunner.l7mp.io
- apiVersion: gateway.networking.k8s.io/v1
  kind: GatewayClass
  metadata:
    annotations:
      meta.helm.sh/release-name: xoul-call
      meta.helm.sh/release-namespace: xoul
    creationTimestamp: "2024-05-26T16:48:00Z"
    generation: 1
    labels:
      app.kubernetes.io/managed-by: Helm
      helm.toolkit.fluxcd.io/name: call
      helm.toolkit.fluxcd.io/namespace: xoul
    name: stunner-gatewayclass
    resourceVersion: "28027020"
    uid: c46ec33c-2571-4182-a2a2-31278830c3e5
  spec:
    controllerName: stunner.l7mp.io/gateway-operator
    parametersRef:
      group: stunner.l7mp.io
      kind: GatewayConfig
      name: stunner-gatewayconfig
      namespace: xoul
  status:
    conditions:
    - lastTransitionTime: "2024-05-26T16:48:00Z"
      message: GatewayClass is now managed by controller "stunner.l7mp.io/gateway-operator"
      observedGeneration: 1
      reason: Accepted
      status: "True"
      type: Accepted
- apiVersion: stunner.l7mp.io/v1
  kind: UDPRoute
  metadata:
    annotations:
      meta.helm.sh/release-name: xoul-call
      meta.helm.sh/release-namespace: xoul
    creationTimestamp: "2024-05-28T18:28:08Z"
    generation: 1
    labels:
      app.kubernetes.io/managed-by: Helm
      helm.toolkit.fluxcd.io/name: call
      helm.toolkit.fluxcd.io/namespace: xoul
    name: media-plane
    namespace: xoul
    resourceVersion: "29553177"
    uid: 7a85af0e-6205-41fa-a6e1-36cac0694444
  spec:
    parentRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: udp-gateway
    rules:
    - backendRefs:
      - group: ""
        kind: Service
        name: media-plane
        namespace: default
  status:
    parents:
    - conditions:
      - lastTransitionTime: "2024-05-28T18:33:16Z"
        message: parent accepts the route
        observedGeneration: 1
        reason: Accepted
        status: "True"
        type: Accepted
      - lastTransitionTime: "2024-05-28T18:33:16Z"
        message: all backend references successfully resolved
        observedGeneration: 1
        reason: ResolvedRefs
        status: "True"
        type: ResolvedRefs
      controllerName: stunner.l7mp.io/gateway-operator
      parentRef:
        group: gateway.networking.k8s.io
        kind: Gateway
        name: udp-gateway
kind: List
metadata:
  resourceVersion: ""

Operator logs

operator.log

rg0now commented 5 months ago

Just when you think you've seen all STUNner weirdness, you encounter this error:

14:24:09.140609 turncat.go:415: turncat DEBUG: cannot write to TURN relay connection for client /dev/stdin (likely harmless): invalid length of IP value

Anyway, there seem to be two semi-related issues here.

First, the peer address ($PEER_IP) does not seem to be defined in the invocation of turncat:

turncat - k8s://xoul/udp-gateway:udp-listener udp://${PEER_IP}:9001 -v
...
14:24:05.951221 turncat.go:186: turncat INFO: Turncat client listening on file://stdin, TURN server: turn://k8s-xoul-udpgatew-77492a577e-37ff1616c9d9d844.elb.us-east-2.amazonaws.com:3478?transport=udp, peer: udp::9001
...

Notice that the IP address is empty in the peer URI udp::9001, which causes the above weird error. What I don't quite understand here is that turncat tries to resolve the peer address before spawning the TURN client, but this somehow fails to catch the "empty IP address" case. I could reproduce this on my setup, so this is clearly a turncat bug.

The other issue is that for the subsequent invocations turncat cannot even create a TURN allocation:

14:24:38.834366 turncat.go:570: turncat WARNING: relay setup failed for client /dev/stdin: could not allocate new TURN relay transport for client file:/dev/stdin: all retransmissions failed for Yy8GqOLQqJk6g0Nb

Note that the PEER_IP is still empty, but it is beyond me why we get a working allocation for the first time and why we fail the second time. This will need further investigation. Can you please specify a valid peer IP and retry? How often you get a TURN allocation? Successful allocations are indicated by the below turncat log line:

14:24:06.027372 turncat.go:322: turncat INFO: new connection: client-address=/dev/stdin, relayed-address=10.0.139.119:46719
rg0now commented 5 months ago

With the latest master 512ce555 we at least warn if the peer IP is empty:

bin/turncat -l all:TRACE - k8s://stunner/udp-gateway:udp-listener udp://:5001
...
14:30:13.879997 turncat.go:100: turncat TRACE: Resolving peer address: udp://:5001
14:30:13.880026 main.go:122: turncat-cli ERROR: Could not init turncat: empty IP address in peer URL "udp://:5001"
drawnwren commented 5 months ago

Strangely, setting PEER_IP doesn't seem to do anything different.

(xoul-backend-py3.11) anubis :: Mono/services/xoul_backend » export PEER_IP=$(kubectl get svc media-plane -o jsonpath='{.spec.clusterIP}')
(xoul-backend-py3.11) anubis :: Mono/services/xoul_backend » turncat - k8s://xoul/udp-gateway:udp-listener udp://${PEER_IP}:9001 -v
11:13:46.490794 main.go:81: turncat-cli DEBUG: Reading STUNner config from URI "k8s://xoul/udp-gateway:udp-listener"
11:13:46.490861 main.go:163: turncat-cli DEBUG: Searching for CDS server
11:13:46.490868 k8s_client.go:154: cds-fwd DEBUG: Obtaining kubeconfig
11:13:46.494105 k8s_client.go:161: cds-fwd DEBUG: Creating a Kubernetes client
11:13:46.499706 k8s_client.go:196: cds-fwd DEBUG: Querying CDS server pods in namespace "<all>" using label-selector "stunner.l7mp.io/config-discovery-service=enabled"
11:13:47.257476 k8s_client.go:367: cds-fwd DEBUG: Found pod: xoul/stunner-gateway-operator-controller-manager-5d6c5cd8f8-c5vgr
11:13:47.257522 k8s_client.go:376: cds-fwd DEBUG: Creating a SPDY stream to API server using URL "https://CF330C06A766A5450BA821D9355D14DF.sk1.us-east-2.eks.amazonaws.com/api/v1/namespaces/xoul/pods/stunner-gateway-operator-controller-manager-5d6c5cd8f8-c5vgr/portforward"
11:13:47.257589 k8s_client.go:384: cds-fwd DEBUG: Creating a port-forwarder to pod
11:13:47.257602 k8s_client.go:400: cds-fwd DEBUG: Waiting for port-forwarder...
11:13:47.475529 k8s_client.go:419: cds-fwd DEBUG: Port-forwarder connected to pod xoul/stunner-gateway-operator-controller-manager-5d6c5cd8f8-c5vgr at 127.0.0.1:38535
11:13:47.475607 cds_api.go:215: cds-client DEBUG: GET: loading config for gateway xoul/udp-gateway from CDS server 127.0.0.1:38535
11:13:47.612799 main.go:88: turncat-cli DEBUG: Generating STUNner authentication client
11:13:47.612842 main.go:95: turncat-cli DEBUG: Generating STUNner URI
11:13:47.612867 main.go:102: turncat-cli DEBUG: Starting turncat with STUNner URI: turn://k8s-xoul-udpgatew-c8acc27fea-79a15cd11c4064f3.elb.us-east-2.amazonaws.com:3478?transport=udp
11:13:47.640750 turncat.go:186: turncat INFO: Turncat client listening on file://stdin, TURN server: turn://k8s-xoul-udpgatew-c8acc27fea-79a15cd11c4064f3.elb.us-east-2.amazonaws.com:3478?transport=udp, peer: udp:172.20.55.170:9001
11:13:47.640793 main.go:118: turncat-cli DEBUG: Entering main loop
11:13:47.640872 turncat.go:227: turncat DEBUG: new connection from client /dev/stdin
11:13:47.641525 client.go:110: turnc DEBUG: Resolved STUN server 10.0.155.225:3478 to 10.0.155.225:3478
11:13:47.641922 client.go:119: turnc DEBUG: Resolved TURN server 10.0.155.225:3478 to 10.0.155.225:3478
hello
11:13:55.449634 turncat.go:570: turncat WARNING: relay setup failed for client /dev/stdin: could not allocate new TURN relay transport for client file:/dev/stdin: all retransmissions failed for xm/656rhP17CPf/d
oh no
rip
dead
(xoul-backend-py3.11) anubis :: Mono/services/xoul_backend » turncat - k8s://xoul/udp-gateway:udp-listener udp://${PEER_IP}:9001 -v
11:14:41.997448 main.go:81: turncat-cli DEBUG: Reading STUNner config from URI "k8s://xoul/udp-gateway:udp-listener"
11:14:41.997523 main.go:163: turncat-cli DEBUG: Searching for CDS server
11:14:41.997533 k8s_client.go:154: cds-fwd DEBUG: Obtaining kubeconfig
11:14:41.998315 k8s_client.go:161: cds-fwd DEBUG: Creating a Kubernetes client
11:14:41.998546 k8s_client.go:196: cds-fwd DEBUG: Querying CDS server pods in namespace "<all>" using label-selector "stunner.l7mp.io/config-discovery-service=enabled"
11:14:42.641221 k8s_client.go:367: cds-fwd DEBUG: Found pod: xoul/stunner-gateway-operator-controller-manager-5d6c5cd8f8-c5vgr
11:14:42.641253 k8s_client.go:376: cds-fwd DEBUG: Creating a SPDY stream to API server using URL "https://CF330C06A766A5450BA821D9355D14DF.sk1.us-east-2.eks.amazonaws.com/api/v1/namespaces/xoul/pods/stunner-gateway-operator-controller-manager-5d6c5cd8f8-c5vgr/portforward"
11:14:42.641335 k8s_client.go:384: cds-fwd DEBUG: Creating a port-forwarder to pod
11:14:42.641349 k8s_client.go:400: cds-fwd DEBUG: Waiting for port-forwarder...
11:14:42.828647 k8s_client.go:419: cds-fwd DEBUG: Port-forwarder connected to pod xoul/stunner-gateway-operator-controller-manager-5d6c5cd8f8-c5vgr at 127.0.0.1:33303
11:14:42.828685 cds_api.go:215: cds-client DEBUG: GET: loading config for gateway xoul/udp-gateway from CDS server 127.0.0.1:33303
11:14:43.328980 main.go:88: turncat-cli DEBUG: Generating STUNner authentication client
11:14:43.328999 main.go:95: turncat-cli DEBUG: Generating STUNner URI
11:14:43.329007 main.go:102: turncat-cli DEBUG: Starting turncat with STUNner URI: turn://k8s-xoul-udpgatew-c8acc27fea-79a15cd11c4064f3.elb.us-east-2.amazonaws.com:3478?transport=udp
11:14:43.338316 turncat.go:186: turncat INFO: Turncat client listening on file://stdin, TURN server: turn://k8s-xoul-udpgatew-c8acc27fea-79a15cd11c4064f3.elb.us-east-2.amazonaws.com:3478?transport=udp, peer: udp:172.20.55.170:9001
11:14:43.338337 main.go:118: turncat-cli DEBUG: Entering main loop
11:14:43.338392 turncat.go:227: turncat DEBUG: new connection from client /dev/stdin
11:14:43.339075 client.go:110: turnc DEBUG: Resolved STUN server 10.0.25.18:3478 to 10.0.25.18:3478
11:14:43.339096 client.go:119: turnc DEBUG: Resolved TURN server 10.0.25.18:3478 to 10.0.25.18:3478
11:14:51.145965 turncat.go:570: turncat WARNING: relay setup failed for client /dev/stdin: could not allocate new TURN relay transport for client file:/dev/stdin: all retransmissions failed for ulxp0chTi0WIhmUh
drawnwren commented 5 months ago

I get the turncat listening log (main.go:186), but I don't think I've seen 322 yet.

rg0now commented 5 months ago

Actually one thing definitely changed: we no longer see the weird invalid length errors.

Anyway, the all retransmissions failed error is TURN's way of reporting that turncat cannot reach the TURN server (stunnerd) via the UDP load balancer. This often happens with EKS and it is usually completely independent from STUNner.

To verify this, can you please rerun the experiment, but this time with a higher stunnerd loglevel so that we see whether the TURN request hits pod?

  1. Set the loglevel to TRACE:
    kubectl -n xoul patch gatewayconfigs.stunner.l7mp.io stunner-gatewayconfig --type='merge' -p '{"spec":{"logLevel":"all:TRACE"}}'
  2. Repeat the experiment.

If you the log does not report any TURN requests then it is (again) an EKS issue. If this is the case then most probably you're missing a crucial annotation from the Gateway. Unfortunately, we cannot check this: we try to stay away from EKS these days.

drawnwren commented 5 months ago

So, that's something odd about this. This is with log level set to TRACE.

(xoul-backend-py3.11) anubis :: Mono/services/xoul_backend » kubectl -n xoul patch gatewayconfigs.stunner.l7mp.io stunner-gatewayconfig --type='merge' -p '{"spec":{"logLevel":"all:TRACE"}}'

gatewayconfig.stunner.l7mp.io/stunner-gatewayconfig patched (no change)
(xoul-backend-py3.11) anubis :: Mono/services/xoul_backend 130 » export PEER_IP=$(kubectl get svc media-plane -o jsonpath='{.spec.clusterIP}')
(xoul-backend-py3.11) anubis :: Mono/services/xoul_backend » turncat - k8s://xoul/udp-gateway:udp-listener udp://${PEER_IP}:9001 -v
12:24:53.525998 main.go:81: turncat-cli DEBUG: Reading STUNner config from URI "k8s://xoul/udp-gateway:udp-listener"
12:24:53.526127 main.go:163: turncat-cli DEBUG: Searching for CDS server
12:24:53.526138 k8s_client.go:154: cds-fwd DEBUG: Obtaining kubeconfig
12:24:53.527331 k8s_client.go:161: cds-fwd DEBUG: Creating a Kubernetes client
12:24:53.527583 k8s_client.go:196: cds-fwd DEBUG: Querying CDS server pods in namespace "<all>" using label-selector "stunner.l7mp.io/config-discovery-service=enabled"
12:24:54.204296 k8s_client.go:367: cds-fwd DEBUG: Found pod: xoul/stunner-gateway-operator-controller-manager-5d6c5cd8f8-7666h
12:24:54.204320 k8s_client.go:376: cds-fwd DEBUG: Creating a SPDY stream to API server using URL "https://CF330C06A766A5450BA821D9355D14DF.sk1.us-east-2.eks.amazonaws.com/api/v1/namespaces/xoul/pods/stunner-gateway-operator-controller-manager-5d6c5cd8f8-7666h/portforward"
12:24:54.204377 k8s_client.go:384: cds-fwd DEBUG: Creating a port-forwarder to pod
12:24:54.204392 k8s_client.go:400: cds-fwd DEBUG: Waiting for port-forwarder...
12:24:54.366990 k8s_client.go:419: cds-fwd DEBUG: Port-forwarder connected to pod xoul/stunner-gateway-operator-controller-manager-5d6c5cd8f8-7666h at 127.0.0.1:46113
12:24:54.367040 cds_api.go:215: cds-client DEBUG: GET: loading config for gateway xoul/udp-gateway from CDS server 127.0.0.1:46113
12:24:54.476585 main.go:88: turncat-cli DEBUG: Generating STUNner authentication client
12:24:54.476607 main.go:95: turncat-cli DEBUG: Generating STUNner URI
12:24:54.476614 main.go:102: turncat-cli DEBUG: Starting turncat with STUNner URI: turn://k8s-xoul-udpgatew-c8acc27fea-79a15cd11c4064f3.elb.us-east-2.amazonaws.com:3478?transport=udp
12:24:54.507352 turncat.go:227: turncat DEBUG: new connection from client /dev/stdin
12:24:54.507366 turncat.go:186: turncat INFO: Turncat client listening on file://stdin, TURN server: turn://k8s-xoul-udpgatew-c8acc27fea-79a15cd11c4064f3.elb.us-east-2.amazonaws.com:3478?transport=udp, peer: udp:172.20.55.170:9001
12:24:54.507425 main.go:118: turncat-cli DEBUG: Entering main loop
12:24:54.508126 client.go:110: turnc DEBUG: Resolved STUN server 10.0.155.225:3478 to 10.0.155.225:3478
12:24:54.508138 client.go:119: turnc DEBUG: Resolved TURN server 10.0.155.225:3478 to 10.0.155.225:3478
12:25:02.314866 turncat.go:570: turncat WARNING: relay setup failed for client /dev/stdin: could not allocate new TURN relay transport for client file:/dev/stdin: all retransmissions failed for 38jLwNy6T8qDPXOi
rg0now commented 5 months ago

No, this is the output of turncat (playing the role of a TURN client here), I'm curious of the logs from the other side, stunnerd (playing the role of the TURN server). The kubectl patch will update the log level on the server side, and then you can use kubectl -n xoul logs udp-gateway-XXXXX to see the logs from the stunnerd pod deployed by the operator automatically for your Gateway.

drawnwren commented 5 months ago
anubis :: basis/flux-charts/monitoring » k logs pod/udp-gateway-d6fd84b9-9b4q4 -n xoul
19:34:34.131328 main.go:83: stunnerd INFO: starting stunnerd id "xoul/udp-gateway", STUNner version v0.19.0-0-ge4fc45b0 (e4fc45b) built on 2024-05-28T17:19:01+0000
19:34:34.131392 main.go:126: stunnerd INFO: bootstrapping with minimal config
19:34:34.131399 main.go:145: stunnerd INFO: watching configuration at origin "http://10.0.133.122:13478"
19:34:34.131819 main.go:191: stunnerd INFO: new configuration available: "{version=\"v1\",admin:{name=\"xoul/udp-gateway\"},static-auth:{realm=\"stunner.l7mp.io\",username=\"<SECRET>\",password=\"<SECRET>\"},listeners=[],clusters=[]}"
19:34:34.132169 reconcile.go:112: stunner INFO: setting loglevel to "all:INFO"
19:34:34.132204 reconcile.go:140: stunner WARNING: running with no listeners
19:34:34.132212 reconcile.go:156: stunner WARNING: running with no clusters: all traffic will be dropped
19:34:34.132223 reconcile.go:176: stunner INFO: reconciliation ready: new objects: 2, changed objects: 0, deleted objects: 0, started objects: 0, restarted objects: 0
19:34:34.132238 reconcile.go:180: stunner INFO: new dataplane status: xoul/udp-gateway:{logLevel="all:INFO",health-check="http://:8086"}/static-auth:{realm="stunner.l7mp.io",username="<SECRET>",password="<SECRET>"}/[]/[]/allocs:0/status=READY
19:34:34.134169 cds_api.go:277: cds-client INFO: connection successfully opened to config discovery server at ws://10.0.133.122:13478/api/v1/configs/xoul/udp-gateway?watch=true
19:34:34.134863 main.go:191: stunnerd INFO: new configuration available: "{version=\"v1\",admin:{name=\"xoul/udp-gateway\",logLevel=\"all:TRACE\",health-check=\"http://:8086\"},static-auth:{realm=\"stunner.l7mp.io\",username=\"<SECRET>\",password=\"<SECRET>\"},listeners=[\"xoul/udp-gateway/udp-listener\":{turn://10.0.140.212:3478,public=k8s-xoul-udpgatew-f063ed2bd7-4733c622aa9a34b5.elb.us-east-2.amazonaws.com:3478,cert/key=-/-,routes=[xoul/media-plane]}],clusters=[\"xoul/media-plane\":{type=\"STATIC\",protocol=\"UDP\",endpoints=[10.0.137.251,172.20.55.170]}]}"
19:34:34.134934 reconcile.go:112: stunner INFO: setting loglevel to "all:TRACE"
19:34:34.134948 reconcile.go:93: auth-manager TRACE: finishing reconciliation
19:34:34.134953 reconcile.go:95: auth-manager TRACE: running the new-object job queue
19:34:34.134957 reconcile.go:109: auth-manager TRACE: running the deletion job queue
19:34:34.134960 reconcile.go:118: auth-manager TRACE: running the reconciliation job queue
19:34:34.134968 reconcile.go:121: auth-manager TRACE: reconciling object "default-auth-config": static-auth:{realm="stunner.l7mp.io",username="<SECRET>",password="<SECRET>"} -> static-auth:{realm="stunner.l7mp.io",username="<SECRET>",password="<SECRET>"}
19:34:34.134977 auth.go:56: stunner-auth DEBUG: using authentication: static
19:34:34.134983 reconcile.go:131: auth-manager DEBUG: reconciliation ready: to-be-created: 0, changed: 1, deleted: 0
19:34:34.134987 reconcile.go:93: listener-manager TRACE: finishing reconciliation
19:34:34.135013 reconcile.go:95: listener-manager TRACE: running the new-object job queue
19:34:34.135026 listener.go:61: stunner-listener-xoul/udp-gateway/udp-listener TRACE: NewListener: "xoul/udp-gateway/udp-listener":{turn://10.0.140.212:3478,public=k8s-xoul-udpgatew-f063ed2bd7-4733c622aa9a34b5.elb.us-east-2.amazonaws.com:3478,cert/key=-/-,routes=[xoul/media-plane]}
19:34:34.135033 listener.go:123: stunner-listener-xoul/udp-gateway/udp-listener TRACE: Reconcile: "xoul/udp-gateway/udp-listener":{turn://10.0.140.212:3478,public=k8s-xoul-udpgatew-f063ed2bd7-4733c622aa9a34b5.elb.us-east-2.amazonaws.com:3478,cert/key=-/-,routes=[xoul/media-plane]}
19:34:34.135039 manager.go:51: listener-manager TRACE: upsert object "xoul/udp-gateway/udp-listener"
19:34:34.135053 reconcile.go:109: listener-manager TRACE: running the deletion job queue
19:34:34.135056 reconcile.go:118: listener-manager TRACE: running the reconciliation job queue
19:34:34.135067 reconcile.go:131: listener-manager DEBUG: reconciliation ready: to-be-created: 1, changed: 0, deleted: 0
19:34:34.135071 reconcile.go:93: cluster-manager TRACE: finishing reconciliation
19:34:34.135074 reconcile.go:95: cluster-manager TRACE: running the new-object job queue
19:34:34.135089 cluster.go:50: stunner-cluster-xoul/media-plane TRACE: NewCluster: "xoul/media-plane":{type="STATIC",protocol="UDP",endpoints=[10.0.137.251,172.20.55.170]}
19:34:34.135103 cluster.go:76: stunner-cluster-xoul/media-plane TRACE: Reconcile: "xoul/media-plane":{type="STATIC",protocol="UDP",endpoints=[10.0.137.251,172.20.55.170]}
19:34:34.135117 manager.go:51: cluster-manager TRACE: upsert object "xoul/media-plane"
19:34:34.135123 reconcile.go:109: cluster-manager TRACE: running the deletion job queue
19:34:34.135130 reconcile.go:118: cluster-manager TRACE: running the reconciliation job queue
19:34:34.135136 reconcile.go:131: cluster-manager DEBUG: reconciliation ready: to-be-created: 1, changed: 0, deleted: 0
19:34:34.135154 server.go:28: stunner INFO: listener xoul/udp-gateway/udp-listener: [turn-udp://10.0.140.212:3478<0:0>] (re)starting
19:34:34.135164 handlers.go:66: stunner TRACE: NewPermissionHandler
19:34:34.135170 server.go:45: stunner INFO: setting up UDP listener socket pool at 0.0.0.0:3478 with 16 readloop threads
19:34:34.135291 handlers.go:17: stunner TRACE: NewAuthHandler
19:34:34.135365 server.go:166: stunner INFO: listener xoul/udp-gateway/udp-listener: TURN server running
19:34:34.135374 reconcile.go:176: stunner INFO: reconciliation ready: new objects: 2, changed objects: 2, deleted objects: 0, started objects: 1, restarted objects: 0
19:34:34.135378 auth.go:85: stunner-auth TRACE: GetConfig
19:34:34.135409 reconcile.go:180: stunner INFO: new dataplane status: xoul/udp-gateway:{logLevel="all:TRACE",health-check="http://:8086"}/static-auth:{realm="stunner.l7mp.io",username="<SECRET>",password="<SECRET>"}/["xoul/udp-gateway/udp-listener":{turn://10.0.140.212:3478,public=k8s-xoul-udpgatew-f063ed2bd7-4733c622aa9a34b5.elb.us-east-2.amazonaws.com:3478,cert/key=-/-,routes=[xoul/media-plane]}]/["xoul/media-plane":{type="STATIC",protocol="UDP",endpoints=[10.0.137.251,172.20.55.170]}]/allocs:0/status=READY
19:34:34.135416 main.go:201: stunnerd TRACE: reconciliation ready
rg0now commented 5 months ago

This settles it: the EKS LoadBalancer fails to forward turncat traffic to STUNner (otherwise, the logs would show some actual TURN activity). Please double-check that all the necessary EKS components are running, see all the necessary background here. Make sure all the necessary annotations are added to the Gateway, and all the necessary EKS components are running. Once you get a sample UDP application working, STUNner will also start to work.

rg0now commented 5 months ago

Sorry, I accidentally closed this. Reopening now.

rg0now commented 5 months ago

Closing this for good on preparing for v1, feel free to reopen if any new info comes up.