Closed velak4340 closed 6 months ago
Hi, I am trying to set up the example to make sure this solution can be used for my scenario, and it looks like all the pods/services are running fine. But when I test the example to make sure everything works, I get the error below... I am running this example on an AWS EKS cluster. Any ideas what could be wrong?
Can you please provide more info? We'd need at least the output from kubectl get gateways,gatewayconfigs,gatewayclasses,udproutes.stunner.l7mp.io --all-namespaces -o yaml, plus the logs from the operator and from one of the stunnerd pods (if running), and anything else you think is important for tracking this down.
Hi @rg0now, thanks for the reply. I've attached some of the information you requested. I don't see any stunnerd pods running... Attached: stunner-udp-gateway-log.txt, stunner-gateway-operator-logs.txt, stunnerconfig.txt
I can't see any apparent problem with your setup. Can you please elevate the loglevel on the gateway so that we can see why the connection hangs, and rerun the test? Here is a simple way to set the maximum loglevel:
kubectl -n stunner patch gatewayconfig stunner-gatewayconfig --type=merge -p '{"spec": {"logLevel": "all:TRACE"}}'
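To double-check that the patch took effect, you can read the field back (plain kubectl jsonpath; the field is the same spec.logLevel set by the patch above):
kubectl -n stunner get gatewayconfig stunner-gatewayconfig -o jsonpath='{.spec.logLevel}'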
I updated the log level in stunner-gatewayconfig to TRACE and set stunner-gateway-operator-controller-manager to debug before collecting these logs... please check if this helps. stunner-gateway-operator-controller.txt
One thing I noticed is that the LoadBalancer (Network LB) service exposes only UDP. Should it expose TCP as well? If so, how do I do that?
service/udp-gateway LoadBalancer x.x.x.x ***** 3478:32616/UDP 3d3h
Unfortunately I'm no expert in AWS load balancers, but you may be on the right track here: the last time we looked at this, AWS required a TCP health check before it would accept a UDP LoadBalancer. Can you experiment with the following annotations added to the Gateway? (A kubectl annotate example follows the list.)
stunner.l7mp.io/enable-mixed-protocol-lb: "true"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-port: "8086"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol: "http"
service.beta.kubernetes.io/aws-load-balancer-healthcheck-path: "/live"
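A minimal way to apply these from the command line, assuming the Gateway is named udp-gateway in the stunner namespace (as in the turncat URI later in this thread) and that the operator copies Gateway annotations onto the LoadBalancer Service it creates:
kubectl -n stunner annotate gateways.gateway.networking.k8s.io udp-gateway \
  stunner.l7mp.io/enable-mixed-protocol-lb="true" \
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-port="8086" \
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-protocol="http" \
  service.beta.kubernetes.io/aws-load-balancer-healthcheck-path="/live" \
  --overwrite
Alternatively, add the same keys under metadata.annotations in the Gateway manifest and re-apply it.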
What is strange is that stunner-udp-gateway-log.txt actually shows a successful authentication attempt from someone (please check the source IP: is it one of your pods, or is it coming from the outside?). Can you resend stunner-udp-gateway-log.txt, but this time with the elevated loglevel?
I am also seeing this problem:
turncat -v - k8s://stunner/udp-gateway:udp-listener udp://${PEER_IP}:9001
:
08:41:32.460190 main.go:81: turncat-cli DEBUG: Reading STUNner config from URI "k8s://stunner/udp-gateway:udp-listener"
08:41:32.460296 main.go:163: turncat-cli DEBUG: Searching for CDS server
08:41:32.460312 k8s_client.go:154: cds-fwd DEBUG: Obtaining kubeconfig
08:41:32.461017 k8s_client.go:161: cds-fwd DEBUG: Creating a Kubernetes client
08:41:32.461312 k8s_client.go:196: cds-fwd DEBUG: Querying CDS server pods in namespace "<all>" using label-selector "stunner.l7mp.io/config-discovery-service=enabled"
08:41:32.488454 k8s_client.go:367: cds-fwd DEBUG: Found pod: stunner-system/stunner-gateway-operator-controller-manager-foo-bar
08:41:32.488604 k8s_client.go:376: cds-fwd DEBUG: Creating a SPDY stream to API server using URL "https://10.0.1.4:16443/api/v1/namespaces/stunner-system/pods/stunner-gateway-operator-controller-manager-foo-bar/portforward"
08:41:32.488725 k8s_client.go:384: cds-fwd DEBUG: Creating a port-forwarder to pod
08:41:32.488771 k8s_client.go:400: cds-fwd DEBUG: Waiting for port-forwarder...
08:41:32.516363 k8s_client.go:419: cds-fwd DEBUG: Port-forwarder connected to pod stunner-system/stunner-gateway-operator-controller-manager-foo-bar at 127.0.0.1:37641
08:41:32.516420 cds_api.go:215: cds-client DEBUG: GET: loading config for gateway stunner/udp-gateway from CDS server 127.0.0.1:37641
08:41:32.527517 main.go:88: turncat-cli DEBUG: Generating STUNner authentication client
08:41:32.527574 main.go:95: turncat-cli DEBUG: Generating STUNner URI
08:41:32.527591 main.go:102: turncat-cli DEBUG: Starting turncat with STUNner URI: turn://8.0.0.8:3478?transport=udp
08:41:32.527637 turncat.go:186: turncat INFO: Turncat client listening on file://stdin, TURN server: turn://8.0.0.8:3478?transport=udp, peer: udp:10.152.183.128:9001
08:41:32.527653 main.go:118: turncat-cli DEBUG: Entering main loop
08:41:32.527739 turncat.go:227: turncat DEBUG: new connection from client /dev/stdin
08:41:32.535533 client.go:110: turnc DEBUG: Resolved STUN server 8.0.0.8:3478 to 8.0.0.8:3478
08:41:32.535563 client.go:119: turnc DEBUG: Resolved TURN server 8.0.0.8:3478 to 8.0.0.8:3478
Can you please elevate the loglevel on the gateway so that we can see why the connection hangs, and rerun the test? Just to make it clear: after elevating the stunnerd loglevel to all:TRACE, please repeat the turncat test and post the logs from the stunnerd pod, not from the operator. The command below would do it for the current setup:
kubectl -n stunner logs $(kubectl -n stunner get pod -l app=stunner -o jsonpath='{.items[0].metadata.name}')
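If it's easier, you can also stream the stunnerd logs while rerunning the test (same label selector as above; -f and --tail are standard kubectl flags):
kubectl -n stunner logs -f --tail=100 -l app=stunner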
This is because we need to see whether the connection request from turncat has made it to stunnerd (if not, then this is an LB issue), and if it did, what happened to the connection after authentication. The last line of the log we see above is this:
05:41:38.565349 handlers.go:25: stunner-auth INFO: static auth request: username="user-1" realm="stunner.l7mp.io" srcAddr=X.X.X.81:39986
We need to see what happened afterwards in the dataplane. Frankly, the whole thing is quite mysterious: if the authentication request had not been successful we would see that clearly in the logs, but if it was successful (as in our case), why did the client not continue with establishing the connection? That's what the trace-level logs should reveal (I hope).
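For instance, once you have the TRACE logs, filtering them for the client source address from the auth line above (masked here as X.X.X.81, so substitute the real address) should show what happened to that particular allocation:
kubectl -n stunner logs --tail=-1 -l app=stunner | grep 'X.X.X.81'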
One minor silly thing: after running turncat, try to send something and press Enter, because turncat waits on the standard input for data to be sent to the greeter. I guess you know that anyway, just to be absolutely sure.
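A non-interactive variant of the same test is to pipe a line into turncat's standard input (the command is the one from earlier in this thread; PEER_IP is whatever you used there). Note that this closes stdin after the single line, so the interactive form may still be preferable if you want to watch for the echoed reply:
echo "hello" | turncat -v - k8s://stunner/udp-gateway:udp-listener udp://${PEER_IP}:9001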
Does turncat automatically add credentials from the deployment, or do we have to add them to the UDP connection string?
Theoretically, it should. It actually asks the operator for the running config of the gateway corresponding to the k8s:// URI, so it should see up-to-date settings. It even generates its own ephemeral auth credential if that's what you've configured. So try turncat as above (without the auth credentials), and if you get an authentication error then that's a bug.
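If you want to double-check what credentials turncat would pick up, dumping the GatewayConfig is a reasonable sanity check (plain kubectl; look for the auth-related fields, i.e. the static username/password or the shared secret for ephemeral auth, whichever you configured):
kubectl -n stunner get gatewayconfig stunner-gatewayconfig -o yaml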
Closing this for now, feel free to reopen if anything new comes up.