Closed: dharapvj closed this issue 2 years ago
Hi, the root cause for that behaviour - the provisioning of the LoadBalancer for the Ingress Controller fails - is that the KubeOne example Terraform for Azure creates an Azure LoadBalancer with the Basic SKU, and with the Basic SKU an Availability Set can ONLY belong to ONE LoadBalancer backend pool; that is what caused the provisioning of the 2nd LB to fail.
A separate AvailabilitySet must be used for the Workers on Azure.
I've changed the Azure Terraform project to do just that, and also changed the SKU of the LB and Public IPs to Standard, as the Basic LB is very limited in functionality and performance.
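For reference, a minimal sketch of what those Terraform changes can look like (resource names and references such as azurerm_resource_group.rg and var.cluster_name are illustrative assumptions, not the exact ones from the KubeOne example project):

# Sketch only: a dedicated availability set for the worker nodes, separate from the control-plane one.
resource "azurerm_availability_set" "workers" {
  name                = "${var.cluster_name}-avset-workers"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  managed             = true
}

# Standard SKU public IP and load balancer for the control-plane endpoint.
resource "azurerm_public_ip" "control_plane" {
  name                = "${var.cluster_name}-lb-ip"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  sku                 = "Standard"
  allocation_method   = "Static" # Standard SKU public IPs must be static
}

resource "azurerm_lb" "control_plane" {
  name                = "${var.cluster_name}-lb"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  sku                 = "Standard"

  frontend_ip_configuration {
    name                 = "kubeapi"
    public_ip_address_id = azurerm_public_ip.control_plane.id
  }
}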
@dharapvj
Can you provide here the KubeOne Manifest that you are testing for Azure?
I'm getting some strange errors during the KubeOne provisioning, mainly:
INFO[23:18:07 BST] Running kubeadm... node=##.##.###.###
WARN[23:23:31 BST] Task failed, error was: + export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/sbin:/usr/local/bin:/opt/bin
+ PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/sbin:/usr/local/bin:/opt/bin
+ [[ -f /etc/kubernetes/admin.conf ]]
+ sudo kubeadm init --config=./kubeone/cfg/master_0.yaml --ignore-preflight-errors=DirAvailable--var-lib-etcd,ImagePull
W0427 22:18:08.506231 4817 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.96.0.10]; the provided value is: [169.254.20.10]
W0427 22:18:08.510472 4817 configset.go:348] WARNING: kubeadm cannot validate component configs for API groups [kubelet.config.k8s.io kubeproxy.config.k8s.io]
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster
To see the stack trace of this error execute with --v=5 or higher
: Process exited with status 1
Hi @JoaquimFreitas
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
versions:
  kubernetes: '1.17.17'
cloudProvider:
  azure: {}
Please note that here I am referring to the second availability set vj1-avset-worker instead of the original avset vj1-avset, as mentioned in my workaround.
ARM_TENANT_ID: "XXX"
ARM_CLIENT_ID: "ZZZ"
ARM_SUBSCRIPTION_ID: "YYY"
ARM_CLIENT_SECRET: "AAA"
cloudConfig: |
  {
    "tenantId": "XXX",
    "subscriptionId": "YYY",
    "aadClientId": "ZZZ",
    "aadClientSecret": "AAA",
    "resourceGroup": "vj1-rg",
    "location": "westeurope",
    "subnetName": "vj1-subnet",
    "routeTableName": "",
    "securityGroupName": "vj1-sg",
    "vnetName": "vj1-vpc",
    "primaryAvailabilitySetName": "vj1-avset-worker",
    "useInstanceMetadata": true,
    "useManagedIdentityExtension": false,
    "userAssignedIdentityID": ""
  }
@dharapvj Thanks for the info.
I was not using a credentials.yaml; I've now added it to my kubeone command.
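For reference, the invocation with an explicit credentials file looks roughly like this (a sketch assuming the KubeOne 1.x --manifest/--tfjson/--credentials flags; adjust for your version):

# Sketch: run from the Terraform directory so its output is picked up
kubeone apply --manifest kubeone.yaml --tfjson . --credentials credentials.yaml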
Went back to the Basic SKU LB in the Azure Terraform project - I was getting some strange errors during the kubeone execution with the Standard SKU LB - and added the creation of a new AvailabilitySet for the Workers, plus a new NSG inbound rule for the Kube API (kubeone was failing later on during the provisioning because of that).
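The inbound rule for the Kube API looks roughly like this (a sketch; the rule name, priority and NSG/resource-group references are assumptions):

# Sketch: allow the Kubernetes API server port through the cluster NSG.
resource "azurerm_network_security_rule" "kubeapi" {
  name                        = "KubeAPI"
  priority                    = 350
  direction                   = "Inbound"
  access                      = "Allow"
  protocol                    = "Tcp"
  source_port_range           = "*"
  destination_port_range      = "6443"
  source_address_prefix       = "*"
  destination_address_prefix  = "*"
  resource_group_name         = azurerm_resource_group.rg.name
  network_security_group_name = azurerm_network_security_group.sg.name
}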
Also added new variables to the Terraform project to allow the use of a custom vNet and Subnet (the values were hardcoded in the Terraform code, with no need for that).
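Something along these lines for the new variables (names and defaults are purely illustrative):

# Sketch: expose the previously hardcoded vNet/subnet ranges as variables.
variable "vnet_address_space" {
  description = "Address space of the cluster virtual network"
  type        = list(string)
  default     = ["10.0.0.0/16"]
}

variable "subnet_cidr" {
  description = "Address prefix of the cluster subnet"
  type        = string
  default     = "10.0.0.0/24"
}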
Also altered the Terraform output to indicate the new Workers AvailabilitySet in the cloudConfig.
apiVersion: kubeone.io/v1beta1
kind: KubeOneCluster
name: kubeone-k8scluster-azure
versions:
  kubernetes: '1.19.9'
cloudProvider:
  azure: {}
  cloudConfig: |
    {
      "tenantId": "TTTTTTT-ID",
      "subscriptionId": "SSSSSSS-ID",
      "aadClientId": "CCCCCCC-ID",
      "aadClientSecret": "CSECCSEC-ID",
      "resourceGroup": "kubeone-k8scluster-azure-rg",
      "location": "northeurope",
      "subnetName": "kubeone-k8scluster-azure-subnet",
      "routeTableName": "",
      "securityGroupName": "kubeone-k8scluster-azure-nsg",
      "vnetName": "kubeone-k8scluster-azure-vnet",
      "primaryAvailabilitySetName": "kubeone-k8scluster-azure-pool1-avset",
      "useInstanceMetadata": true,
      "useManagedIdentityExtension": false,
      "userAssignedIdentityID": ""
    }
containerRuntime:
  containerd: {}
Let me say that I think kubeone ONLY takes into account the output.tf of the Terraform project; I tried putting some different values in the cloudConfig settings and they seemed to be just ignored...
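If that is the case, the part that actually matters is the Terraform output; a rough sketch of pointing the worker pool at the new availability set there (field names follow the machine-controller Azure spec, and the resource references and exact layout of the example project's output.tf are assumptions):

# Sketch of the relevant piece of output.tf; most fields omitted for brevity.
output "kubeone_workers" {
  value = {
    "${var.cluster_name}-pool1" = {
      replicas = 2
      providerSpec = {
        cloudProviderSpec = {
          resourceGroup     = azurerm_resource_group.rg.name
          location          = var.location
          vnetName          = azurerm_virtual_network.vpc.name
          subnetName        = azurerm_subnet.subnet.name
          securityGroupName = azurerm_network_security_group.sg.name
          # Workers now reference the dedicated availability set.
          availabilitySet   = azurerm_availability_set.workers.name
        }
      }
    }
  }
}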
With the changes, the KubeOne provisioning goes through without major issues, and an Azure LoadBalancer and a Public IP for it are now provisioned when an Ingress Controller is deployed.
ONLY one question remains... what happens if MORE than ONE Ingress Controller is deployed?
Issues go stale after 90d of inactivity.
After a further 30 days, they will turn rotten.
Mark the issue as fresh with /remove-lifecycle stale.
If this issue is safe to close now please do so with /close.
/lifecycle stale
Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close.
/lifecycle rotten
/remove-lifecycle rotten
Issues go stale after 90d of inactivity.
After a further 30 days, they will turn rotten.
Mark the issue as fresh with /remove-lifecycle stale.
If this issue is safe to close now please do so with /close.
/lifecycle stale
This was fixed. /close
@xmudrii: Closing this issue.
What happened: When I install nginx-ingress-controller on a kubeone-based cluster in Azure, the ingress URLs time out.
What is the expected behavior: Ingress-based app access should work.
How to reproduce the issue:
Anything else we need to know? I observed that if I change primaryAvailabilitySetName to this new avset in kubeone.yaml, ingress becomes available. But I do not know the reason for this behavior. I am also not sure about the impact of changing the value of primaryAvailabilitySetName in kubeone.yaml.
Information about the environment:
KubeOne version (kubeone version):
Operating system: linux
Provider you're deploying cluster on: Azure
Operating system you're deploying on: Ubuntu