Document how to use static-ips for workload clusters

omniproc commented 3 years ago

Describe the solution you'd like

VSphereMachineTemplate and VSphereMachine both seem to support static IP addresses. Older issues reported here suggest that a list of IPs could be provided to the VSphereMachineTemplate so that all MachineDeployments draw from those IPs until none is left. However, there is no documentation on how exactly to do this, and with the latest vSphere provider (v0.7.9, running on v1alpha3) I wasn't able to create a working configuration.

When trying to set ipAddrs, clusterctl returns:

VSphereMachineTemplate.infrastructure.cluster.x-k8s.io "capi" is invalid: spec.template.spec.network.devices.ipAddrs: Forbidden: cannot be set in templates

It's unclear how to edit the YAML generated by clusterctl so that machines are deployed using static IPs.

Environment:

Cluster-api-provider-vsphere version: 0.7.9
Kubernetes version: 1.21.1
OS: Ubuntu 20.04.2 LTS

omniproc commented 3 years ago

Hm, ok. So besides the missing documentation I believe this is a bug.

https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/v0.7.9/api/v1alpha3/vspheremachinetemplate_webhook.go#L51-L54

Browsing through older issues, it seems like this feature was already working some time ago, so I'm not sure whether this is a new bug or a missing feature.

omniproc commented 3 years ago

After some further investigation, it seems there is currently no way to specify static IPs for the machines directly in the manifest used to deploy a workload cluster. It looks more like a current feature limitation (see the validation webhook linked above).

So in order to use static ips the only method I was able to find so far is:

Use a VSphereMachineTemplate and set dhcp4 and dhcp6 to false
This causes any VSphereMachine created from this template to wait until its ipAddrs are set
Use kubectl patch to edit the created VSphereMachine. Once it has its ipAddrs, the KubeadmControlPlane controller VM will be deployed in vSphere
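
For illustration, the first step could look roughly like the sketch below. It only prints the network part of a VSphereMachineTemplate; the name "capi" and the network name "lan" are taken from this thread, the function name is made up for the example, and every other field a real template needs (datacenter, VM template, and so on) is left out on purpose.

```shell
#!/usr/bin/env bash
# Sketch only: prints a PARTIAL VSphereMachineTemplate with DHCP disabled.
# render_network_fragment is a hypothetical helper, not part of any tool.
# Fields unrelated to networking are intentionally omitted.
render_network_fragment() {
  name="$1"      # template name, e.g. "capi"
  network="$2"   # vSphere network name, e.g. "lan"
  cat <<EOF
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha3
kind: VSphereMachineTemplate
metadata:
  name: ${name}
spec:
  template:
    spec:
      network:
        devices:
        - networkName: ${network}
          dhcp4: false
          dhcp6: false
EOF
}

render_network_fragment capi lan
```

Note that ipAddrs is deliberately absent here, since setting it in the template is what triggers the "cannot be set in templates" error.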

It doesn't seem to be possible to create the VSphereMachine objects directly and tell the KubeadmControlPlane to use them. Instead you have to provide a VSphereMachineTemplate as infrastructureTemplate in the KubeadmControlPlane, which comes with the limitations mentioned above.

So for now I'm using this to somewhat automate patching the objects to my needs:

# Declare the cluster name
declare cluster=cls01
# Declare clusterip and node ips
declare clusterip="172.16.0.30/32"
declare -a ips=("172.16.0.26/24" "172.16.0.27/24" "172.16.0.28/24" "172.16.0.29/24")

# Select controller VM (assumes a single control plane machine)
cvm=$(kubectl get VSphereMachine -l "cluster.x-k8s.io/cluster-name=$cluster" -o json | jq -r '.items[] | select(.metadata.ownerReferences[] | select(.kind=="KubeadmControlPlane")) | .metadata.name')
uid=$(kubectl get VSphereMachine "$cvm" --template '{{.metadata.uid}}{{"\n"}}')
echo "$cvm"

# Patch the controller VM to have a node ip and the clusterip
kubectl patch VSphereMachine "$cvm" --type=merge -p '{"spec":{"network":{"devices":[{"networkName": "lan", "gateway4": "172.16.0.1", "nameservers":["172.16.0.1"], "ipAddrs": ["'${ips[0]}'","'$clusterip'"]}]}}}'

# Get a list of all worker VMs
readarray -t vms < <(kubectl get VSphereMachine -l cluster.x-k8s.io/cluster-name=$cluster -o json | jq -r --arg uid "$uid" '.items[] | select(.metadata.uid!=$uid) | .metadata.name')

# Patch the worker VMs to have a node ip each
count=1
for i in "${vms[@]}"
do
   kubectl patch VSphereMachine "$i" --type=merge -p '{"spec":{"network":{"devices":[{"networkName": "lan","gateway4": "172.16.0.1", "nameservers":["172.16.0.1"], "ipAddrs": ["'${ips[$count]}'"]}]}}}'
   (( count++ ))
done
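
The two patch payloads in the script differ only in their ipAddrs list, so they could be built by one small helper instead of inline string concatenation. This is just a sketch: build_patch is a hypothetical function name, and it assumes a single network device and a single nameserver, as in the script.

```shell
# build_patch NETWORK GATEWAY NAMESERVER IP [IP...]
# Hypothetical helper: prints a merge-patch document for one network device
# of a VSphereMachine, using the same field names as the script above.
build_patch() {
  network="$1"; gateway="$2"; nameserver="$3"
  shift 3
  addrs=""
  for ip in "$@"; do
    # Append each address as a quoted, comma-separated JSON string.
    addrs="${addrs:+$addrs,}\"$ip\""
  done
  printf '{"spec":{"network":{"devices":[{"networkName":"%s","gateway4":"%s","nameservers":["%s"],"ipAddrs":[%s]}]}}}\n' \
    "$network" "$gateway" "$nameserver" "$addrs"
}

# Example: payload for the controller VM (node ip plus clusterip).
build_patch lan 172.16.0.1 172.16.0.1 172.16.0.26/24 172.16.0.30/32
```

It would then be passed to kubectl the same way as before, e.g. kubectl patch VSphereMachine "$cvm" --type=merge -p "$(build_patch lan 172.16.0.1 172.16.0.1 "${ips[0]}" "$clusterip")".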

I'd still leave this flagged as bug since the VSphereMachineTemplate definition clearly states that ipAddrs is a valid attribute but, as shown in the linked code above, the current VSphere Provider for ClusterAPI seems to just ignore that and throw an error.

nwoodmsft commented 3 years ago

@yastij I am also following this thread with interest and noticed you had assigned the issue to yourself for follow-up. Are you able to confirm or comment on the findings of @omniproc? Is there any method we are missing, or is static IP support for workload clusters not present at this time? Thanks!

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Mark this issue or PR as fresh with /remove-lifecycle stale
Mark this issue or PR as rotten with /lifecycle rotten
Close this issue or PR with /close
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

omniproc commented 2 years ago

/lifecycle frozen

srm09 commented 2 years ago

/help

Now that the bug has been fixed, this can be something that can be picked up. @omniproc something you would wanna take a stab at?

k8s-ci-robot commented 2 years ago

@srm09: This request has been marked as needing help from a contributor.

Guidelines

Please ensure that the issue body includes answers to the following questions:

Why are we solving this issue?
To address this issue, are there any code changes? If there are code changes, what needs to be done in the code and what places can the assignee treat as reference points?
Does this issue have zero to low barrier of entry?
How can the assignee reach out to you for help?

For more details on the requirements of such an issue, please see here and ensure that they are met.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to [this](https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/issues/1215):

>/help
>Now that the bug has been fixed, this can be something that can be picked up. @omniproc something you would wanna take a stab at?

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

srm09 commented 2 years ago

/kind documentation
/remove-kind bug

omniproc commented 2 years ago

@srm09 I can write a docs draft for the quickstart.md on how to assign static IPs if that's what you're asking.

srm09 commented 2 years ago

That would be very helpful.😃😃

omniproc commented 2 years ago

Sure. I'll schedule some time for it this weekend. Will link the PR to this issue when done.

srm09 commented 2 years ago

/unassign @yastij
/assign @omniproc

sathieu commented 2 years ago

Before writing a PR doc, could someone please provide a quick example here? :pray:

EDIT:

https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/issues/1215#issuecomment-1025266432 said bug is fixed, but which bug?
Is there another way to do this than the workaround in https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/issues/1215#issuecomment-881689524?

EDIT2:

Could preKubeadmCommands be used?

omniproc commented 2 years ago

@sathieu I'm not sure, but I doubt it, since preKubeadmCommands are executed after the node is reachable (and thus already has an IP). As far as I know, the management cluster communicates the IPs to vSphere before the preKubeadmCommands are executed on the workload cluster (but as I said, I'm not sure; I would need to test).

You might want to take a look at https://github.com/spectrocloud/cluster-api-provider-vsphere-static-ip or https://github.com/telekom/das-schiff/tree/ipam/ipam, which provide custom controllers for an IPAM solution. It's the better approach and only requires you to deploy a custom controller on your management cluster. Under the hood, pretty much the same happens as shown in the bash snippet above.
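
To give a rough idea of the "take the next IP until none is left" part, here is a minimal sketch in plain shell. It is not how either of the linked projects is implemented; next_free_ip is a made-up helper, and it assumes both the pool and the list of addresses already in use are single-space-separated strings.

```shell
# next_free_ip "POOL" "USED"
# Hypothetical helper: prints the first entry of POOL that is not in USED.
# Returns 1 (and prints nothing) when the pool is exhausted.
next_free_ip() {
  pool="$1"; used="$2"
  for candidate in $pool; do
    case " $used " in
      *" $candidate "*) ;;                           # already assigned, skip
      *) printf '%s\n' "$candidate"; return 0 ;;     # first free address
    esac
  done
  return 1
}

# Example with the addresses from the script above:
next_free_ip "172.16.0.26/24 172.16.0.27/24 172.16.0.28/24" "172.16.0.26/24"
# prints 172.16.0.27/24
```

A controller would additionally have to derive the "used" list from the existing VSphereMachine objects and patch the chosen address into the new one, which is what the script does by hand.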

sathieu commented 2 years ago

Thanks @omniproc, I'll use the hack short term, then set up a DHCP server.
