Ingress created by the controller is unreachable

paulxuca commented 1 year ago

The controller has created an ingress that looks like:

NAME                        CLASS    HOSTS                           ADDRESS          PORTS     AGE
some-name-9d572   <none>   someaddress.com   104.196.xxx.xx   80, 443   34m

I've ensured that the game server is running correctly but there is nothing running at the address specified by the ingress. Any suggestions on how I might be able to debug this issue?

Thanks in advance!

eddie-knight commented 1 year ago

Just another user chiming in here... Sorry to hear about the weird behavior. Perhaps this article will help by walking you through a few common debugging steps? If nothing else it'll add some more data points to rule out common issues.

https://medium.com/@ManagedKube/kubernetes-troubleshooting-ingress-and-services-traffic-flows-547ea867b120

danieloliveira079 commented 1 year ago

Hey @paulxuca this address is usually the IP of the load balancer. Additionally to the link Eddie shared you can check https://kubernetes.io/docs/concepts/services-networking/ingress/#types-of-ingress.

What do you see when you run kubectl -n $NAMESPACE get svc? The NAMESPACE should be the namespace where you are running your ingress controller responsible for routing traffic.

danieloliveira079 commented 1 year ago

O couple of things come to mind:

Are you using a wildcard DNS record dedicated for the domain exposing the game servers?
How are you testing connectivity?
Have you deployed Contour?

paulxuca commented 1 year ago

Thanks for the quick response both. Went through the Medium article that Eddie linked to no avail; To answer the above questions:

Output of kubectl -n $NAMESPACE get svc

NAME                             TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
xxx-fleet-qlhzl-6fj8m   ClusterIP   xx.xx.xxx    <none>        7703/TCP         132m
xxx-fleet-qlhzl-9d572   ClusterIP  xx.xx.xxx     <none>        7404/TCP         133m
xxx-fleet-qlhzl-j57dv   ClusterIP   xx.xx.xxx    <none>        7136/TCP         132m

Using wildcard DNS record for the domain for game servers (Using path routing)
Testing connectivity by trying to dig/ nslookup the HOST issued (as well as directly trying to access the underlying IP address)

Contour is deployed; Although I am seeing warning logs:

[2023-08-08 17:19:19.270][1][warning][config] [./source/common/config/grpc_stream.h:196] StreamListeners gRPC config stream closed since 407s ago: 14, upstream connect error or disconnect/reset before headers. reset reason: connection failure, transport failure reason: TLS error: 268435581:SSL routines:OPENSSL_internal:CERTIFICATE_VERIFY_FAILED

Could that be related?

Thank you so much in advance for your help!

danieloliveira079 commented 1 year ago

How are you provisioning certificates for TLS? Where are you terminating TLS? At the game server, ingress controller or at the Cloud LoadBalancer?

danieloliveira079 commented 1 year ago

Testing connectivity by trying to dig/ nslookup the HOST issued (as well as directly trying to access the underlying IP address)

Trying to connect using the IP will not work. The ingress will only route traffic based on the host name. That is also important for TLS/HTTPS.

paulxuca commented 1 year ago

certificates for TLS is provisioned by cert-manager and terminated at the ingress controller level. Looking into that warning line further seems to point to envoy not being set up correctly; Will report back if that was indeed the issue. Thanks for your help thus far!

paulxuca commented 1 year ago

That was indeed the issue; thanks again for the help and for creating this project!

danieloliveira079 commented 1 year ago

No worries. Would you mind sharing more details about what fixed the problem. That way we can help others from the community. Thank you.

paulxuca commented 1 year ago

It seemed like the version of contour that was running was incompatible with the kubernetes version running on GKE (We have auto updates on) but there was no indication that there was an issue with contour besides the warning line (Being that it was a warning, we thought it could be ignored and would not interfere). Updating contour solved the issue.

Octops / gameserver-ingress-controller

Ingress created by the controller is unreachable #55