Closed wizedkyle closed 3 months ago
Hi, thank you for trying. Could you please share more of the error log?
Hey @jwtty, what types of logs are you after? All I can see in the controller is the above, and on the Azure side I see the failed attempts to update the resource.
Hi @wizedkyle, did you create the VMSS BEFORE running the controller? You have to manually create the VMSS that acts as the gateway. I checked the Azure logs, and they show the VMSS was not created before kube-egress-gateway-controller tried to update it.
BTW, AKS integration with this feature is close to public preview. With AKS integration, we are going to provision the VMSS as an agentpool.
There is a doc for creating the VMSS beforehand: https://github.com/Azure/kube-egress-gateway/blob/main/docs/install.md#prerequisites
@jwtty The VMSS was created before the controller deployment using Terraform and it was created as a Kubernetes node pool.
Here is the Terraform showing what was provisioned:
resource "azurerm_kubernetes_cluster_node_pool" "egress_gateway" {
  name                  = "egress"
  kubernetes_cluster_id = azurerm_kubernetes_cluster.cluster.id
  vm_size               = "Standard_B4s_v2"
  node_count            = 2
  enable_auto_scaling   = false
  os_sku                = "Ubuntu"
  vnet_subnet_id        = azurerm_subnet.aks_cluster.id
  node_labels = {
    "kubeegressgateway.azure.com/mode" = "true"
  }
  node_taints = [
    "kubeegressgateway.azure.com/mode=true:NoSchedule"
  ]
  tags = local.common_tags
}
Hi @wizedkyle, I think I identified a bug in the code. Just for confirmation, what value did you put in `config.azureCloudConfig.resourceGroup` in the helm chart? And what value did you put in `staticGatewayConfiguration.spec.gatewayVmssProfile.vmssResourceGroup`?
I think the issue is related to a mismatch between the two resource groups. My hypothesis is that you put the "MC" resource group of the AKS cluster in the staticGatewayConfiguration spec and the non-MC resource group in the cloud config. Please help me check. If this is the case, you can mitigate it by putting the "MC" resource group in the cloud config too while I work on the bugfix.
Hey @jwtty, in the helm chart, for `config.azureCloudConfig.resourceGroup` I put the name of the resource group that holds the AKS cluster resource, and for `staticGatewayConfiguration.spec.gatewayVmssProfile.vmssResourceGroup` I used the MC_ resource group name relating to the AKS cluster.
So based on that I would say your hypothesis is correct.
Cool, appreciate your confirmation. So for mitigation, please set the resourceGroup in the config to the MC_ resource group and try again. Meanwhile, I have already made the fix for the issue.
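For anyone applying this mitigation from the same Terraform setup, here is a minimal sketch. It assumes the chart is installed through the Helm provider's `helm_release` resource (2.x `set` block syntax) from a local checkout; the release name, namespace, and chart path are hypothetical, and the other values the chart needs are omitted. `node_resource_group` is the `azurerm_kubernetes_cluster` attribute that exposes the MC_ resource group.

```hcl
# Sketch only: release name, namespace, and chart path are assumptions,
# and the other chart values (credentials, subscription, etc.) are omitted.
resource "helm_release" "kube_egress_gateway" {
  name      = "kube-egress-gateway"
  namespace = "kube-egress-gateway-system"
  chart     = "./helm/kube-egress-gateway" # hypothetical local path to the chart

  # Point the cloud config at the MC_ (node) resource group, where the
  # node pool's VMSS lives, rather than the cluster's own resource group.
  set {
    name  = "config.azureCloudConfig.resourceGroup"
    value = azurerm_kubernetes_cluster.cluster.node_resource_group
  }
}
```

Reading the value from the cluster resource avoids hard-coding the MC_ name, so the cloud config and `vmssResourceGroup` can both be fed from the same attribute.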
Please upgrade the image version to v0.0.14.
Hey all,
Sorry for the second issue. I have gotten past the auth issues in #644 by using an App Registration client ID and secret (I would still prefer to use managed identities). However, when the controller reconciles the StaticGatewayConfiguration, it gets a 400 from Azure when trying to update the VMSS (I have redacted the resource group name and subscription ID):
Stack trace from the controller:
Looking at the latest API spec for VMSS Create or Update, there is no reference to `sku` being a required parameter.