Azure / AKS

Azure Kubernetes Service
https://azure.github.io/AKS/

Support Virtual WAN based topology for networkProfile/OutboundType #2334

Open guidola opened 3 years ago

guidola commented 3 years ago

What happened: When attempting to configure the egress setup for the AKS Cluster one is presented with two options:

Neither of these options allows defining the following configuration:

The current options would require creating a custom route table on the cluster subnet and configuring there the rules that should live on the Virtual WAN hub, which defeats the purpose of having the hub at all.

What you expected to happen:

A third option should exist so one can either define:

to tell AKS that the current vnet configuration is already properly configuring routing for the cluster or at least make the UDR outbound type not fail when using it without creating a custom route table in the cluster subnet.

How to reproduce it (as minimally and precisely as possible): Attempt to create a cluster with outboundType set to OutboundType.UserDefinedRouting without creating a route table for the cluster subnet. The creation will fail, even though a Virtual WAN setup has no requirement for a custom route table, since no UDR needs to be configured to set up routing.

ghost commented 3 years ago

Hi guidola, AKS bot here :wave: Thank you for posting on the AKS Repo, I'll do my best to get a kind human from the AKS team to assist you.

I might be just a bot, but I'm told my suggestions are normally quite good, as such:

  1. If this case is urgent, please open a Support Request so that our 24/7 support team may help you faster.
  2. Please abide by the AKS repo Guidelines and Code of Conduct.
  3. If you're having an issue, could it be described on the AKS Troubleshooting guides or AKS Diagnostics?
  4. Make sure you're subscribed to the AKS Release Notes to keep up to date with all that's new on AKS.
  5. Make sure there isn't a duplicate of this issue already reported. If there is, feel free to close this one and '+1' the existing issue.
  6. If you have a question, do take a look at our AKS FAQ. We place the most common ones there!

ghost commented 3 years ago

Triage required from @Azure/aks-pm

ghost commented 3 years ago

Action required from @Azure/aks-pm

ghost commented 3 years ago

Issue needing attention of @Azure/aks-leads

jjindrich commented 3 years ago

Any updates, please?

justindavies commented 3 years ago

I'll move this to feature request and we'll keep you updated on where we get to.

erjosito commented 3 years ago

FWIW, this feature would be required for ESLZ deployments leveraging Virtual WAN. Another use case is when using routes advertised by NVAs via Azure Route Server.

ghost commented 2 years ago

Action required from @Azure/aks-pm

ghost commented 2 years ago

Issue needing attention of @Azure/aks-leads

phealy commented 2 years ago

@guidola The proper way to address this scenario is to create the cluster with outbound type UDR with an empty route table that has propagate gateway routes enabled. I agree that another scenario may be desirable to skip the SLB outbound rules without requiring the UDR checks. I'll mark this as a feature request.

erjosito commented 2 years ago

@phealy last time I tried that it didn't work; AKS wanted a 0.0.0.0/0 route in the RT. I haven't tried it, but I have heard from others that the trick is creating a UDR for 0.0.0.0/0 with next-hop None, which will be overridden by the route injected by VWAN/ARS. Not sure if the behaviour has changed in the meantime.

tkubica12 commented 2 years ago

@phealy any progress on this, please?

The workaround with a 0.0.0.0/0 UDR is problematic for enterprise customers who have standardized on vWAN (the network topology preferred by many Microsoft documents) because of their policies. Regular admins should be free to create and modify subnets, yet all outbound traffic must be routed via the firewall. In a classic hub-and-spoke design this leads to VNets being locked down by network teams, so admins lose all flexibility. vWAN lets you push 0.0.0.0/0 centrally, so even with no UDR the traffic is still forced via the secure path. Admins can therefore be free to modify subnets, except for adding UDRs (which might reconfigure routing to bypass the firewall).

Therefore, with vWAN you often do not allow UDR use... That is why we need an outboundType compatible with vWAN, I think.

WaitingForGuacamole commented 2 years ago

Any updates on this? I'd like to confirm two things: first, that a feature request is in the backlog and scheduled for work, and second, that the workaround of specifying a dummy route is valid and doesn't break other routing. It appears you have to supply a VirtualAppliance route with the next hop set to the firewall's private address.

WaitingForGuacamole commented 2 years ago

> @phealy last time I tried that it didn't work; AKS wanted a 0.0.0.0/0 route in the RT. I haven't tried it, but I have heard from others that the trick is creating a UDR for 0.0.0.0/0 with next-hop None, which will be overridden by the route injected by VWAN/ARS. Not sure if the behaviour has changed in the meantime.

It's possible that worked in the past. If it did, that's been shut down.

phealy commented 2 years ago

> > @phealy last time I tried that it didn't work; AKS wanted a 0.0.0.0/0 route in the RT. I haven't tried it, but I have heard from others that the trick is creating a UDR for 0.0.0.0/0 with next-hop None, which will be overridden by the route injected by VWAN/ARS. Not sure if the behaviour has changed in the meantime.
>
> It's possible that worked in the past. If it did, that's been shut down.

Apologies, I've been swamped. The workaround requires two factors: first, it has to be a private cluster. Second, the route table must still be created and attached to the subnet, but it can be empty.

Both public and private clusters will check that the route table has been associated for outbound type userDefinedRouting, but a private cluster can bypass the need for the 0.0.0.0/0 route.

WaitingForGuacamole commented 2 years ago

@phealy, thanks for responding. I can confirm that AKS will provision with userDefinedRouting outbound type so long as I give it an empty UDR. I can even contact my application, so that's a win.

The problem seems to be that health probes don't resolve - I've ensured that the AG subnet opens up the required ports for a v2 SKU, but I think the real problem is that I don't have an outbound default route to Internet.

What concerns me about that is a paragraph in https://docs.microsoft.com/en-us/azure/cloud-adoption-framework/scenarios/app-platform/aks/network-topology-and-connectivity#design-recommendations, which states:

> If your security policy mandates inspecting all outbound internet traffic generated in the AKS cluster, secure egress network traffic using Azure Firewall or a third-party network virtual appliance (NVA) deployed in the managed hub virtual network. For more information, see Limit egress traffic. The AKS outbound type UDR requires associating a route table to the AKS node subnet, so it cannot be used today with the dynamic route injection supported by Azure Virtual WAN or Azure Route Server.

The overall Azure footprint and framework evolve so rapidly that I don't know if that statement is true, or if the documentation is just behind the curve, which would be understandable.

UPDATE: Adding a default route out to internet resolved my health probe issues. Now my cluster doesn't reply through the firewall.

UPDATE 2: Looks like that outbound route might have broken the AGIC trying to modify the app gateway with listener changes. More on that later if I can get data to prove it.

erjosito commented 2 years ago

Hey @WaitingForGuacamole, that CAF document should be read as "...it cannot be used today with the dynamic route injection supported by Azure Virtual WAN or Azure Route Server without using weird workarounds". We don't want to document hacks such as adding 0.0.0.0/0 routes pointing to None, especially in a best practices document.

WaitingForGuacamole commented 2 years ago

> Hey @WaitingForGuacamole, that CAF document should be read as "...it cannot be used today with the dynamic route injection supported by Azure Virtual WAN or Azure Route Server without using weird workarounds". We don't want to document hacks such as adding 0.0.0.0/0 routes pointing to None, especially in a best practices document.

I appreciate that perspective, and thank you for that response. It's perhaps a bit frustrating, as it's either that same document or another one out there that tells you it's OK to put AKS into a VWAN hub/spoke setting. OK? Yes. OK and get a fully private cluster only accessible through the firewall? Not so much.

phealy commented 2 years ago

@WaitingForGuacamole To add some clarification here - AKS supports fully private clusters with forced tunnelling (0.0.0.0/0 routes) and works well with them. You just need to make sure that you don't have asymmetric routes - i.e. you can't use public IP services with a 0.0.0.0/0 route, you need to use ILB services and DNAT the traffic from something that does have direct internet visibility.

With vWAN in the picture, you could use a secured virtual hub, then use Azure Firewall to provide the DNAT rules to direct traffic from public IPs (attached to Azure Firewall) into an AKS cluster's ILB service IP.
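
As a rough illustration of the DNAT idea above, here is a minimal Terraform sketch. It assumes the secured virtual hub's Azure Firewall is managed through a firewall policy. The variable names, rule names, priorities, and port 443 are placeholders chosen for this example and do not come from this thread.

```hcl
# Hypothetical inputs; names are illustrative only.
variable "firewall_policy_id" { type = string } # policy attached to the hub's Azure Firewall
variable "firewall_public_ip" { type = string } # public IP attached to Azure Firewall
variable "aks_ilb_service_ip" { type = string } # private IP of the AKS internal load balancer service

resource "azurerm_firewall_policy_rule_collection_group" "aks_ingress" {
  name               = "rcg-aks-ingress"
  firewall_policy_id = var.firewall_policy_id
  priority           = 500

  nat_rule_collection {
    name     = "dnat-to-aks-ilb"
    priority = 300
    action   = "Dnat"

    # Traffic arriving at the firewall's public IP on 443 is translated
    # to the ILB service IP inside the AKS cluster's VNet.
    rule {
      name                = "https-to-aks"
      protocols           = ["TCP"]
      source_addresses    = ["*"]
      destination_address = var.firewall_public_ip
      destination_ports   = ["443"]
      translated_address  = var.aks_ilb_service_ip
      translated_port     = 443
    }
  }
}
```

Because the traffic both enters and leaves through the firewall, this avoids the asymmetric routing that a public IP service would hit behind a 0.0.0.0/0 route.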

What I think you're getting tripped up on is Application Gateway - while AGIC is a supported add-on to AKS, Application Gateway itself is not part of AKS, and Application Gateway v2 SKUs do not support any situation where 0.0.0.0/0 is routed to a destination other than internet. This means, unfortunately, that vWAN default route + AppGw/AGIC + Kubenet is not going to be a functional scenario, as you get into a conflicting situation, where Application Gateway requires the 0.0.0.0/0 route to be disabled while you want it on for AKS. Most of the customers I've worked with in this situation solve this in one of two ways:

  1. Use Azure CNI (ideally with Dynamic Pod IP allocation) so that AKS doesn't need to modify a route table - using private clusters + outboundType UDR + empty route table workaround as discussed above. This allows you to configure Application Gateway's route table with "disable gateway route propagation" and use it as a DMZ/WAF by having it have public IPs.
  2. Use an alternative ingress controller that deploys inside AKS and DNAT traffic to it via AFW or an NVA and an ILB service.
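
For the second option, the ILB service could be declared along these lines. This is only a sketch, assuming the hashicorp/kubernetes Terraform provider is already configured against the cluster; the service name, selector label, and ports are hypothetical. The annotation is the standard Azure one for requesting an internal load balancer.

```hcl
# Hypothetical internal LoadBalancer service in front of an in-cluster ingress controller.
resource "kubernetes_service_v1" "ingress_internal" {
  metadata {
    name = "ingress-internal" # illustrative name
    annotations = {
      # Ask the Azure cloud provider for an internal (private IP) load balancer.
      "service.beta.kubernetes.io/azure-load-balancer-internal" = "true"
    }
  }

  spec {
    type = "LoadBalancer"

    selector = {
      app = "ingress-controller" # assumed pod label
    }

    port {
      name        = "https"
      port        = 443
      target_port = 8443 # assumed container port
    }
  }
}
```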

WaitingForGuacamole commented 2 years ago

So, I have been using CNI. Kubenet has never been part of the configuration. I do have a private cluster with internal load balancing (outbound_type set to userDefinedRouting, all services annotated for AGIC ingress on private IP, load balancer on private IP).

What I've found is that, with an empty default route table, I can reach my cluster and the application works. Health probes do NOT work in that configuration. Adding the default route to Internet fixes that problem, but then the AGIC component doesn't work, nodes can't download images, etc.

EDIT: Also, I don't want to use AGIC as a WAF in a DMZ sort of setting, as that sort of obviates the need for the secure virtual hub that the cluster resides in?

ALSO: Thanks very much for the prompt, thoughtful response. It's greatly appreciated.

WaitingForGuacamole commented 2 years ago

I now have something somewhat working, although I want to tear it down and rebuild it to confirm.

I created an empty route table with "propagate gateway routes" set to Yes, and associated with my four node pool subnets.

I created a route table with "propagate gateway routes" set to No, and associated only with my application gateway subnet.

It seems to work, but it's a really messy solution when laid over a VWAN deployment. I may well recommend we go with an internal ingress like Traefik.

Thanks for all the input.

EDIT: Here's a sample of the implementation in Terraform:

# Route table to keep "userDefinedRouting" network_profile option happy,
# assigned to all node pool subnets, propagating gateway routes,
# no default route (allow VWAN to inject)
resource "azurerm_route_table" "aks-empty-route" {
  name                          = "rt-${local.project_environment}-application"
  location                      = var.location
  resource_group_name           = var.resource_group_name
  disable_bgp_route_propagation = false

  tags = var.tags
}

resource "azurerm_subnet_route_table_association" "aks-nodepool-route" {
  for_each       = { for k, p in var.node_pools : k => p }
  subnet_id      = lookup(each.value, "subnet_id", local.defaults["subnet_id"])
  route_table_id = azurerm_route_table.aks-empty-route.id
}

# Route table to keep Application Gateway happy,
# assigned to application gateway subnet, not propagating gateway routes,
# default route to Internet
resource "azurerm_route_table" "aks-appgateway-route" {
  name                          = "rt-${local.project_environment}-application-gateway"
  location                      = var.location
  resource_group_name           = var.resource_group_name
  disable_bgp_route_propagation = true

  route {
    name           = "default"
    address_prefix = "0.0.0.0/0"
    next_hop_type  = "Internet"
  }

  tags = var.tags
}

resource "azurerm_subnet_route_table_association" "aks-appgateway-route" {
  subnet_id      = var.application_gateway_subnet_id
  route_table_id = azurerm_route_table.aks-appgateway-route.id
}

The azurerm_kubernetes_cluster that follows this has a depends_on block referencing both of the route tables.
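
The cluster resource itself is not shown in this comment. A minimal sketch of what it might look like follows. Everything in it is an assumption except the userDefinedRouting outbound type, the private cluster, Azure CNI, and the dependency on the route tables, which are mentioned in this thread. The sketch depends on the subnet associations (which in turn depend on the route tables) so that the tables are attached before AKS runs its checks; var.system_subnet_id, the cluster name, and the VM size are placeholders.

```hcl
variable "system_subnet_id" { type = string } # hypothetical: subnet for the default node pool

resource "azurerm_kubernetes_cluster" "aks" {
  name                    = "aks-example" # placeholder
  location                = var.location
  resource_group_name     = var.resource_group_name
  dns_prefix              = "aks-example" # placeholder
  private_cluster_enabled = true

  default_node_pool {
    name           = "system"
    node_count     = 1
    vm_size        = "Standard_D2s_v3" # placeholder size
    vnet_subnet_id = var.system_subnet_id
  }

  identity {
    type = "SystemAssigned"
  }

  network_profile {
    network_plugin = "azure"              # Azure CNI, as used in this thread
    outbound_type  = "userDefinedRouting" # requires a route table on the node subnets
  }

  # Make sure both route tables are created and associated first.
  depends_on = [
    azurerm_subnet_route_table_association.aks-nodepool-route,
    azurerm_subnet_route_table_association.aks-appgateway-route,
  ]

  tags = var.tags
}
```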

ghost commented 1 year ago

Action required from @Azure/aks-pm

ghost commented 1 year ago

Issue needing attention of @Azure/aks-leads

erjosito commented 1 year ago

Since nothing happened here I blogged about the workaround in https://blog.cloudtrooper.net/2023/01/10/filtering-aks-egress-traffic-with-virtual-wan/

miwithro commented 1 year ago

@phealy to investigate.

petetian commented 1 year ago

Up vote. The new feature makes a lot of sense for enterprise users who use route server in VNETs.

zadigus commented 7 months ago

In my case, where I don't have an application gateway, creating the route table like this (and removing it after the AKS deployment has completed) gets around the issue:

resource "azurerm_route_table" "aks" {
  name                          = "rt-aks"
  location                      = var.location
  resource_group_name           = var.resource_group_name
  disable_bgp_route_propagation = false

  route {
    name                   = "default"
    address_prefix         = "0.0.0.0/0"
    next_hop_type          = "VirtualAppliance"
    next_hop_in_ip_address = var.fw_ip
  }
}

resource "azurerm_subnet_route_table_association" "aks" {
  depends_on = [azurerm_route_table.aks]

  subnet_id      = var.workers_subnet_id
  route_table_id = azurerm_route_table.aks.id
}

It is simpler than this proposition.

Creating that route table feels like unnecessary overkill. However, without it, I am not able to get a fully deployed AKS cluster.

erjosito commented 7 months ago

Hey @zadigus, I think this is pretty much what I documented in this blog post. A user commented that when upgrading the AKS version the preflight check for the route table would be performed again. I didn't test that, but you might want to leave the route table applied just in case.

BTW, I agree that it is an overkill and unnecessary :-(

WaitingForGuacamole commented 7 months ago

So, I agree with all comments regarding the distaste for my workaround, as well as its necessity. @zadigus, what I found was that getting rid of the route table caused all health probes to fail. Putting it back caused it to start working again.

There may no longer be a need for it, however. There is a public preview feature for Application Gateway v2 which allows you to not use a public IP. In removing this, your firewall would be responsible for getting traffic to it exclusively by DNAT. The important thing about it is that in removing the second entry point, you remove the possibility of an asymmetric route. If all traffic has to traverse the firewall, you don't have the possibility of platform traffic bypassing it.

It may also remove the need for the empty route table for the node pools, but I am not completely sure.

Are you in a position where you can take some downtime, drop the application gateway and cluster, and redeploy (because that's what will be needed to get an app gateway with that preview feature)?

For reference, the feature is documented in https://learn.microsoft.com/en-us/azure/application-gateway/application-gateway-private-deployment?tabs=portal

zadigus commented 7 months ago

> Hey @zadigus, I think this is pretty much what I documented in this blog post. A user commented that when upgrading the AKS version the preflight check for the route table would be performed again. I didn't test that, but you might want to leave the route table applied just in case.
>
> BTW, I agree that it is an overkill and unnecessary :-(

@erjosito I don't care about that (potential) problem, because every update I do goes through my Terraform build, which rebuilds the route table first, then creates or updates the AKS cluster, then removes the route table. But still, it's worth knowing that subsequent updates without the route table might not work. Thanks for the information.

zadigus commented 7 months ago

@WaitingForGuacamole you may be right; however, I have a pretty extensive (yet not exhaustive) battery of acceptance tests validating that my infrastructure works as expected, and with the route table deleted after AKS cluster deployment, everything works like a charm.

MaxAnderson95 commented 6 months ago

Microsoft's documentation already lists the userDefinedRouting option as "advanced". They shouldn't require any sort of route table, and just leave it up to us as IT professionals to do it correctly. Display a warning or something but don't outright prevent it. Very annoying!