Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

[BUG] LoadBalancer services use incorrect readiness logic with appProtocol=http #3646

Open howardjohn opened 1 year ago

howardjohn commented 1 year ago

Describe the bug

Service YAML:

apiVersion: v1
kind: Service
metadata:
  name: server
spec:
  type: LoadBalancer
  selector:
    app: server
  ports:
    - name: http
      appProtocol: http
      port: 80

Per Kubernetes spec, endpoints of the service should be determined by the Endpoints/EndpointSlice health, which is determined by the pod's readinessProbe.

However, per https://cloud-provider-azure.sigs.k8s.io/topics/loadbalancer/#custom-load-balancer-health-probe, this is not the case for this service. Instead, HTTP requests to / will be sent to port 80 to determine health. This seems to violate the Service spec.

From a practical standpoint, this breaks usage of Istio's Gateway API support, which creates Services like this. However, we do not respond to GET / on port 80 (unless the user configures a route matching that, of course). If we do respond to it, we are likely forwarding it to some other service, which is fairly incorrect: the probe then reflects the health of that backend rather than of the gateway itself.

The end result is that users of Istio Gateway on Azure will see traffic fail because of the failing health check. We can recommend that users manually set a number of annotations to make things work, but this is tedious for users and leads to different usage on Azure than on other platforms. We could automatically detect and insert these annotations for Azure users, but we prefer to avoid vendor-specific workarounds.

keithmattix commented 1 year ago

/cc @shashankbarsin

shashankbarsin commented 1 year ago

@feiskyer, @paulgmiller, @chasewilson

feiskyer commented 1 year ago

Please follow https://cloud-provider-azure.sigs.k8s.io/topics/loadbalancer/#custom-load-balancer-health-probe to set the correct request path with annotations if the default request path doesn't work.
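For readers looking for the shape of that workaround, here is a minimal sketch applied to the Service from the issue description. The annotation names follow the per-port pattern described on the linked cloud-provider-azure page; the /healthz path is a hypothetical example and must be replaced with a path the backend actually serves. Treat this as an illustration, not a verified configuration.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: server
  annotations:
    # Option A: keep the HTTP probe but point it at a path the backend answers.
    # "/healthz" is a hypothetical path, not something the issue specifies.
    service.beta.kubernetes.io/port_80_health-probe_request-path: /healthz
    # Option B (alternative to A): probe port 80 over plain TCP instead of HTTP.
    # service.beta.kubernetes.io/port_80_health-probe_protocol: "tcp"
spec:
  type: LoadBalancer
  selector:
    app: server
  ports:
    - name: http
      appProtocol: http
      port: 80
```

Either option only changes how the Azure load balancer probes the backend; it does not make the probe follow the pod's readinessProbe, which is the behavior the issue asks for.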

howardjohn commented 1 year ago

Thanks @feiskyer. I am aware of how to fix it as a user. However, I am not a user but a maintainer of Istio (which, btw, is now shipped with Azure), and would like Istio to work out of the box on Azure, as it does on other K8s platforms, without relying on Azure-specific annotations.

biefy commented 1 year ago

We are taking a look. Thanks @howardjohn for reporting this issue.

AndrewFarley commented 1 year ago

Just had this hit me as well and wasted about a day debugging specifically why.

I was experimenting with Kubernetes on Azure and tried my preferred ingress controller, Ingress-Nginx. It would never work, and there were no events or logs of any kind explaining why, just an inaccessible Service that was not actually public.

Frustratingly, I eventually took a step back and tried a tutorial from Microsoft with Terraform here, which seemed to work. After diffing and distilling it down, I found that that controller and many modern controllers now specify appProtocol=http by default. Thankfully, that controller has a flag to disable it, so I can now use it without issue, but this is a fairly aggravating blocker to adopting Azure AKS. It has cost me wasted time, and I'm sure it will continue to cost others until this is fixed.

Please fix! At the very least just ignore this new option and don't completely block traffic, that should be a bare minimum.

Azure Kubernetes version (if it matters): 1.25.11

jtv8 commented 11 months ago

If it helps anyone, here's a signpost to the relevant source code in the provider.

Does this need a corresponding issue at kubernetes-sigs/cloud-provider-azure?

akshayjnambiar commented 7 months ago

Do we have any update on this issue?

biefy commented 6 months ago

We are testing an internal fix from AKS.

feiskyer commented 6 months ago

Update: a new shared LB health probe mode will be introduced in cloud-provider-azure v1.29.0 (feat: support shared load balancer health probe mode). By setting clusterServiceLoadBalancerHealthProbeMode to shared, all cluster services will share one health probe, targeting the kube-proxy port 10256 and /healthz by default. The health check port and path can be configured by clusterServiceSharedLoadBalancerHealthProbePort and clusterServiceSharedLoadBalancerHealthProbePath.

Refer to https://github.com/kubernetes-sigs/cloud-provider-azure/pull/4891 for the details.
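As a rough sketch, the settings described in this update might look like the fragment below. It assumes they are top-level keys in the cloud provider configuration file and that the path setting is named clusterServiceSharedLoadBalancerHealthProbePath. The fragment is written in YAML for readability; the actual file format and location, and whether AKS lets users edit it at all, are not stated in this thread. The port and path values are the defaults quoted above.

```yaml
# Hypothetical cloud provider config fragment (placement and format are assumptions).
# Switch all cluster Services to a single shared health probe.
clusterServiceLoadBalancerHealthProbeMode: shared
# Defaults quoted in the update: kube-proxy's health port and path.
clusterServiceSharedLoadBalancerHealthProbePort: 10256
clusterServiceSharedLoadBalancerHealthProbePath: /healthz
```

With a shared probe, the load balancer no longer sends GET / to each Service port, so a Service with appProtocol: http would not need per-Service annotations to pass the health check.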

nathanweeks commented 3 months ago

Does this update to AKS k8s >= 1.28.5 imply the Gateway API should work with the Istio-based service mesh add-on for AKS as well? The docs currently mention the following limitation:

  • Gateway API for Istio ingress gateway or managing mesh traffic (GAMMA) are currently not yet supported with Istio addon.