giantswarm / roadmap

Giant Swarm Product Roadmap
https://github.com/orgs/giantswarm/projects/273

Introduce hardened Flatcar Images for CAPZ #1659

Closed · Rotfuks closed this 1 year ago

Rotfuks commented 1 year ago

Motivation

Currently we use Ubuntu images for our cluster nodes, but those are not specially hardened and thus not as secure as they could be. We have a more secure alternative in the hardened Flatcar images. We therefore need to replace the Ubuntu images with Flatcar ones.

Todo

Open Upstream Issues

Outcome

Technical Hint

Rotfuks commented 1 year ago

Dependent on https://github.com/giantswarm/giantswarm/issues/24566

primeroz commented 1 year ago

Something to keep an eye on: https://github.com/kubernetes-sigs/cluster-api-provider-azure/pull/2890 is adding a template for using Flatcar on CAPZ.

Rotfuks commented 1 year ago

Flatcar now officially in the docs: https://capz.sigs.k8s.io/topics/flatcar.html

primeroz commented 1 year ago

Flatcar now officially in the docs: https://capz.sigs.k8s.io/topics/flatcar.html

I am getting a 404 here now :shrug: :)

For reference, the file still exists here: https://github.com/kinvolk/cluster-api-provider-azure/blob/8dad8f074688f1790b08a185ed0a33a6bcf3fd4b/docs/book/src/topics/flatcar.md

Rotfuks commented 1 year ago

Ah yeah, sorry, that was because of the recent change that points the documentation to the main branch of the CAPZ book instead of the newest release branch. So it will be there again once the new release is done, or once we introduce multi-version documentation in CAPZ upstream :)

primeroz commented 1 year ago

First test: at least the nodes joined :+1:

NAME                                                                             READY  SEVERITY  REASON  SINCE  MESSAGE                              
Cluster/fctest1                                                                  True                     6m25s                                                                                                    
├─ClusterInfrastructure - AzureCluster/fctest1                                   True                     8m53s                                                           
├─ControlPlane - KubeadmControlPlane/fctest1                                     True                     6m25s                                                                                    
│ └─Machine/fctest1-zzrhf                                                        True                     6m26s                                                                                    
│   ├─BootstrapConfig - KubeadmConfig/fctest1-jnd99                              True                     8m49s                                                                                    
│   └─MachineInfrastructure - AzureMachine/fctest1-control-plane-c17c01d5-zzxbh  True                     6m26s                                                                                                    
└─Workers                                                                                                                                                                                          
  ├─MachineDeployment/fctest1-bastion                                            True                     10m                                                                                      
  │ └─Machine/fctest1-bastion-868b7dcb67-tz7rc                                   True                     4s                                                                                                       
  │   ├─BootstrapConfig - KubeadmConfig/fctest1-bastion-973fd873-fkttq           True                     6m22s                                                                                                    
  │   └─MachineInfrastructure - AzureMachine/fctest1-bastion-836b66f0-hf7kl      True                     4s                                                                                                       
  └─MachineDeployment/fctest1-md00                                               True                     19s                                                                                                      
    ├─Machine/fctest1-md00-77c9d6f645-68l7m                                      True                     114s                                                                                                     
    │ ├─BootstrapConfig - KubeadmConfig/fctest1-md00-ad5e9669-qlxdd              True                     6m22s                                                                                                    
    │ └─MachineInfrastructure - AzureMachine/fctest1-md00-bcb876fb-l69vx         True                     114s                                                                                                     
    ├─Machine/fctest1-md00-77c9d6f645-8pn6k                                      True                     4m12s                                                                                                    
    │ ├─BootstrapConfig - KubeadmConfig/fctest1-md00-ad5e9669-v6m92              True                     6m21s                                                                                                    
    │ └─MachineInfrastructure - AzureMachine/fctest1-md00-bcb876fb-r2fc9         True                     4m12s                                                                                                    
    └─Machine/fctest1-md00-77c9d6f645-dcdlx                                      True                     3m25s                                                                                                    
      ├─BootstrapConfig - KubeadmConfig/fctest1-md00-ad5e9669-4jbmz              True                     6m21s                                                                                                    
      └─MachineInfrastructure - AzureMachine/fctest1-md00-bcb876fb-xlm6q         True                     3m25s                                                                                                    
NAME                                   STATUS   ROLES           AGE     VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                             KERNEL-VERSION      CONTAINER-RUNTIME
fctest1-control-plane-c17c01d5-zzxbh   Ready    control-plane   7m19s   v1.24.9   10.0.0.4      <none>        Ubuntu 20.04.5 LTS                                   5.15.0-1029-azure   containerd://1.6.2
fctest1-md00-bcb876fb-l69vx            Ready    <none>          2m46s   v1.24.9   10.0.16.4     <none>        Flatcar Container Linux by Kinvolk 3374.2.1 (Oklo)   5.15.77-flatcar     containerd://1.6.14
fctest1-md00-bcb876fb-r2fc9            Ready    <none>          4m59s   v1.24.9   10.0.16.6     <none>        Flatcar Container Linux by Kinvolk 3374.2.1 (Oklo)   5.15.77-flatcar     containerd://1.6.14
fctest1-md00-bcb876fb-xlm6q            Ready    <none>          4m29s   v1.24.9   10.0.16.5     <none>        Flatcar Container Linux by Kinvolk 3374.2.1 (Oklo)   5.15.77-flatcar     containerd://1.6.14
primeroz commented 1 year ago

Machine Review

primeroz commented 1 year ago

The license for Flatcar is Apache License 2.0, which, to my understanding, leaves us free to modify and redistribute the images as long as we comply with its conditions:

https://github.com/flatcar/flatcar-docs/blob/main/LICENSE

primeroz commented 1 year ago

Cilium looks OK, but 4 tests are failing from the connectivity test suite.

TL;DR: I think the issue is with the test itself.

But I tried with the latest version of cilium-cli and I am still getting the error for some tests. Do we also need Cilium 1.13?


📋 Test Report
❌ 4/29 tests failed (6/230 actions), 2 tests skipped, 1 scenarios skipped:
Test [to-entities-world]:
  ❌ to-entities-world/pod-to-world/http-to-one-one-one-one-0: cilium-test/client-755fb678bd-4r6pg (192.168.2.121) -> one-one-one-one-http (one.one.one.one:80)
  ❌ to-entities-world/pod-to-world/http-to-one-one-one-one-1: cilium-test/client2-5b97d7bc66-nxl76 (192.168.2.210) -> one-one-one-one-http (one.one.one.one:80)
Test [client-egress-l7]:
  ❌ client-egress-l7/pod-to-world/http-to-one-one-one-one-0: cilium-test/client2-5b97d7bc66-nxl76 (192.168.2.210) -> one-one-one-one-http (one.one.one.one:80)
Test [client-egress-l7-named-port]:
  ❌ client-egress-l7-named-port/pod-to-world/http-to-one-one-one-one-0: cilium-test/client2-5b97d7bc66-nxl76 (192.168.2.210) -> one-one-one-one-http (one.one.one.one:80)
Test [to-fqdns]:
  ❌ to-fqdns/pod-to-world/http-to-one-one-one-one-0: cilium-test/client-755fb678bd-4r6pg (192.168.2.121) -> one-one-one-one-http (one.one.one.one:80)
  ❌ to-fqdns/pod-to-world/http-to-one-one-one-one-1: cilium-test/client2-5b97d7bc66-nxl76 (192.168.2.210) -> one-one-one-one-http (one.one.one.one:80)
connectivity test failed: 4 tests failed
[=] Test [to-entities-world]                                                                                                                                                                                       
.                                                                                                                                                                                                                  
  ℹ️  📜 Applying CiliumNetworkPolicy 'client-egress-to-entities-world' to namespace 'cilium-test'..                                                                                                                
  [-] Scenario [to-entities-world/pod-to-world]                                                                                                                                                                    
  [.] Action [to-entities-world/pod-to-world/http-to-one-one-one-one-0: cilium-test/client-755fb678bd-4r6pg (192.168.2.121) -> one-one-one-one-http (one.one.one.one:80)]                                          
  ❌ command "curl -w %{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code} --silent --fail --show-error --connect-timeout 5 --output /dev/null http://one.one.one.one:80" failed: command te
rminated with exit code 28
  ℹ️  curl output:
  curl: (28) Resolving timed out after 5000 milliseconds
:0 -> :0 = 000

DNS issues?

/ # ping one.one.one.one
PING one.one.one.one (1.0.0.1) 56(84) bytes of data.
/ # dig @192.168.1.99 -p 1053 one.one.one.one +short
1.1.1.1
1.0.0.1
/ # dig @192.168.1.181 -p 1053 one.one.one.one +short
1.1.1.1
1.0.0.1
/ # dig @192.168.0.228 -p 1053 one.one.one.one +short
1.1.1.1
1.0.0.1

/ # dig @172.31.0.10 -p 53 one.one.one.one +short
1.0.0.1
1.1.1.1

Running the command myself from the pod works just fine:

/ # while true                                                                                            
> do                                                                                                      
> curl -w "%{local_ip}:%{local_port} -> %{remote_ip}:%{remote_port} = %{response_code}" --silent --fail --show-error --connect-timeout 5 --output /dev/null http://one.one.one.one:80; echo " - $?"
> done                                     
192.168.2.121:40672 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:55990 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40680 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40688 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:55998 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40692 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:56000 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40694 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40700 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40702 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40716 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:56004 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:56018 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40726 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40736 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40740 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:56034 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40754 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40762 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40764 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:56048 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40776 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:56052 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:56058 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:56068 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:56076 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40792 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40802 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40808 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:56080 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40820 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40808 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:56080 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40820 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:56094 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40826 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:56108 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40838 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40846 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:56122 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40852 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:40860 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:56138 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:56154 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:40862 -> 1.1.1.1:80 = 301 - 0
192.168.2.121:56158 -> 1.0.0.1:80 = 301 - 0
192.168.2.121:56170 -> 1.0.0.1:80 = 301 - 0

Same failures on an Ubuntu CAPZ cluster, so this is not related to the Flatcar change.

📋 Test Report
❌ 4/29 tests failed (6/230 actions), 2 tests skipped, 1 scenarios skipped:
Test [to-entities-world]:
  ❌ to-entities-world/pod-to-world/http-to-one-one-one-one-0: cilium-test/client-755fb678bd-wpkfj (192.168.2.51) -> one-one-one-one-http (one.one.one.one:80)
  ❌ to-entities-world/pod-to-world/http-to-one-one-one-one-1: cilium-test/client2-5b97d7bc66-xq6x9 (192.168.2.65) -> one-one-one-one-http (one.one.one.one:80)
Test [client-egress-l7]:
  ❌ client-egress-l7/pod-to-world/http-to-one-one-one-one-1: cilium-test/client2-5b97d7bc66-xq6x9 (192.168.2.65) -> one-one-one-one-http (one.one.one.one:80)
Test [client-egress-l7-named-port]:
  ❌ client-egress-l7-named-port/pod-to-world/http-to-one-one-one-one-1: cilium-test/client2-5b97d7bc66-xq6x9 (192.168.2.65) -> one-one-one-one-http (one.one.one.one:80)
Test [to-fqdns]:
  ❌ to-fqdns/pod-to-world/http-to-one-one-one-one-0: cilium-test/client-755fb678bd-wpkfj (192.168.2.51) -> one-one-one-one-http (one.one.one.one:80)
  ❌ to-fqdns/pod-to-world/http-to-one-one-one-one-1: cilium-test/client2-5b97d7bc66-xq6x9 (192.168.2.65) -> one-one-one-one-http (one.one.one.one:80)
connectivity test failed: 4 tests failed

I will create a follow-up issue to investigate this.

primeroz commented 1 year ago

Control Plane nodes review

primeroz commented 1 year ago

To build images from the Flatcar offer in Azure I had to accept the following license:

License: Flatcar Container Linux is a 100% open source product and licensed under the applicable licenses of its constituent components, as described here: https://kinvolk.io/legal/open-source/
Warranty: Kinvolk provides this software "as is", without warranty or support of any kind. Support subscriptions are available separately from Kinvolk - please contact us for information at https://www.kinvolk.io/contact-us

by running:
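(The exact command did not survive in this thread; a plausible reconstruction with the Azure CLI, using the publisher/offer/SKU from the offer's plan, would be:)

# Hedged reconstruction - accept the marketplace legal terms for the Kinvolk
# Flatcar offer in the current subscription; "stable-gen2" is the plan/SKU.
az vm image terms accept \
  --publisher kinvolk \
  --offer flatcar-container-linux-free \
  --plan stable-gen2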

primeroz commented 1 year ago

TODO

Gallery ToDos

Images and Pipelines ToDos

primeroz commented 1 year ago

After lots of trial and error I think I got the right spec to use our images:

      image:
        computeGallery:
          gallery: gsCAPITest1-5cb24dcf-a2d0-4aba-820f-b52ca78f96e6
          name: capi-flatcar-stable-1.24.9-gen2
          plan:
            offer: flatcar-container-linux-free
            publisher: kinvolk
            sku: stable-gen2
          version: latest

BUT, since Azure keeps a link between our built images and the parent Flatcar one, we are getting this error:

capz-controller-manager-68c6664879-lmzfc manager I0227 15:45:36.440634       1 recorder.go:103] events "msg"="Warning"  "message"="failed to reconcile AzureMachine: failed to reconcile AzureMachine service virtualmachine: failed to create resource fctest1/fctest1-control-plane-cdd30d8e-lq5wk (service: virtualmachine): compute.VirtualMachinesClient#CreateOrUpdate: Failure sending request: StatusCode=400 -- Original Error: Code=\"ResourcePurchaseValidationFailed\" Message=\"User failed validation to purchase resources. Error message: 'You have not accepted the legal terms on this subscription: '6b1f6e4a-6d0e-4aa4-9a5a-fbaca65a23b3' for this plan. Before the subscription can be used, you need to accept the legal terms of the image. To read and accept legal terms, use the Azure CLI commands described at https://go.microsoft.com/fwlink/?linkid=2110637 or the PowerShell commands available at https://go.microsoft.com/fwlink/?linkid=862451. Alternatively, deploying via the Azure portal provides a UI experience for reading and accepting the legal terms. Offer details: publisher='kinvolk' offer = 'flatcar-container-linux-free', sku = 'stable-gen2', Correlation Id: '0b436d96-21c6-4e41-9ed9-daac49507cde'.'\"" "object"={"kind":"AzureMachine","namespace":"org-multi-project","name":"fctest1-control-plane-cdd30d8e-lq5wk","uid":"ae16afdd-7c1f-430d-89d1-37540c38f074","apiVersion":"infrastructure.cluster.x-k8s.io/v1beta1","resourceVersion":"6315646"} "reason"="ReconcileError"

I will accept the terms in the ghost subscription, but this means every customer will also need to do that in every subscription where we want to use those images.

I can't explain how we are using the flatcar4capi images without having accepted the same terms ... ?

From Upstream

Hello. Images in flatcar4capi are built from Flatcar VHDs imported into a SIG, so their advantage is that they don't require plan information. That's the big part of it.

Sample script used by upstream to build the image: https://gist.github.com/primeroz/702e6bec5fcee2986adbefeb633bffb4
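The gist itself isn't reproduced here, but based on the upstream description the approach is roughly: publish the Flatcar VHD as a SIG image version with no purchase plan attached. A sketch with the Azure CLI (resource group, storage account, and names are hypothetical):

# Create an image definition with free-form identifiers and no plan,
# then publish an imported Flatcar VHD as a version of it.
az sig image-definition create \
  --resource-group my-rg --gallery-name myGallery \
  --gallery-image-definition flatcar-stable-gen2 \
  --publisher flatcar --offer flatcar --sku stable-gen2 \
  --os-type Linux --hyper-v-generation V2

az sig image-version create \
  --resource-group my-rg --gallery-name myGallery \
  --gallery-image-definition flatcar-stable-gen2 \
  --gallery-image-version 3374.2.1 \
  --os-vhd-storage-account mystorageaccount \
  --os-vhd-uri https://mystorageaccount.blob.core.windows.net/vhds/flatcar.vhd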

primeroz commented 1 year ago

Apparently it is not true that only latest is available from an image definition; a specific version can be pinned too:

➜ kubectl get azuremachinetemplate fctest1-control-plane-9e46fb4a -o yaml | yq .spec.template.spec.image
computeGallery:
  gallery: gsCAPITest1-5cb24dcf-a2d0-4aba-820f-b52ca78f96e6
  name: capi-flatcar-stable-1.24.10-gen2
  version: 3374.2.3

➜ kubectl get azuremachinetemplate fctest1-md00-4e69b84e-2 -o yaml | yq .spec.template.spec.image       
computeGallery:
  gallery: gsCAPITest1-5cb24dcf-a2d0-4aba-820f-b52ca78f96e6
  name: capi-flatcar-stable-1.24.10-gen2
  version: latest
➜ kubectl --kubeconfig /dev/shm/fctest1.kubeconfig get node -o wide      
NAME                                   STATUS   ROLES           AGE     VERSION    INTERNAL-IP   EXTERNAL-IP   OS-IMAGE                                             KERNEL-VERSION    CONTAINER-RUNTIME
fctest1-control-plane-9e46fb4a-8zrtk   Ready    control-plane   14m     v1.24.10   10.0.0.4      <none>        Flatcar Container Linux by Kinvolk 3374.2.3 (Oklo)   5.15.86-flatcar   containerd://1.6.15
fctest1-md00-4e69b84e-2-68tt6          Ready    <none>          2m26s   v1.24.10   10.0.16.6     <none>        Flatcar Container Linux by Kinvolk 3374.2.4 (Oklo)   5.15.89-flatcar   containerd://1.6.15
fctest1-md00-4e69b84e-2-j9g97          Ready    <none>          6m3s    v1.24.10   10.0.16.7     <none>        Flatcar Container Linux by Kinvolk 3374.2.4 (Oklo)   5.15.89-flatcar   containerd://1.6.15
fctest1-md00-4e69b84e-zcg6l            Ready    <none>          10m     v1.24.10   10.0.16.5     <none>        Flatcar Container Linux by Kinvolk 3374.2.3 (Oklo)   5.15.86-flatcar   containerd://1.6.15
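To check which versions an image definition actually offers, something like this should work (resource group hypothetical):

# List all published versions of the image definition in the SIG.
az sig image-version list \
  --resource-group my-rg \
  --gallery-name gsCAPITest1-5cb24dcf-a2d0-4aba-820f-b52ca78f96e6 \
  --gallery-image-definition capi-flatcar-stable-1.24.10-gen2 \
  -o table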
Rotfuks commented 1 year ago

We can use the following information for the legal statement in the Azure Image Gallery:

Community gallery prefix: giantswarm-
Publisher support email: dev@giantswarm.io
Publisher URL: giantswarm.io
Legal agreement URL: https://www.giantswarm.io/privacy-policy
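For reference, these values map onto the community-gallery flags of az sig create; a sketch, assuming the gallery is created that way (resource group and gallery name hypothetical):

# Create a community-shared gallery carrying the legal statement above.
az sig create \
  --resource-group my-rg \
  --gallery-name giantswarmGallery \
  --permissions Community \
  --public-name-prefix giantswarm- \
  --publisher-email dev@giantswarm.io \
  --publisher-uri giantswarm.io \
  --eula https://www.giantswarm.io/privacy-policy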
primeroz commented 1 year ago

Since the last upgrade I noticed something strange:

build-capz-image-1.24.11-6xb7532313faaf96cac2bcaa780286a09f-pod step-build-image ==> azure-arm.sig-{{user `build_name`}}: + [[ flatcar-gen2 != \f\l\a\t\c\a\r* ]]                                                                      
build-capz-image-1.24.11-6xb7532313faaf96cac2bcaa780286a09f-pod step-build-image ==> azure-arm.sig-{{user `build_name`}}: + sudo bash -c '/usr/share/oem/python/bin/python /usr/share/oem/bin/waagent -force -deprovision+user && sync'        

The name is rendered literally as azure-arm.sig-{{user `build_name`}}. Why is build_name not rendering? Is the actual build_name working in the rest of the Ansible run?

. /home/imagebuilder/packer/azure/scripts/init-sig.sh flatcar-gen2 && packer build -var-file="/home/imagebuilder/packer/config/kubernetes.json"  -var-file="/home/imagebuilder/packer/config/cni.json"  -var-file="/home/imagebuilder/packer/config/containerd.json"  -var-file="/home/imagebuilder/packer/config/wasm-shims.json"  -var-file="/home/imagebuilder/packer/config/ansible-args.json"  -var-file="/home/imagebuilder/packer/config/goss-args.json"  -var-file="/home/imagebuilder/packer/config/common.json"  -var-file="/home/imagebuilder/packer/config/additional_components.json"  -color=true -var-file="/home/imagebuilder/packer/azure/azure-config.json" -var-file="/home/imagebuilder/packer/azure/azure-sig-gen2.json" -var-file="/home/imagebuilder/packer/azure/flatcar-gen2.json" -only="sig-flatcar-gen2" -var-file="/workspace/vars/vars.json"  packer/azure/packer.json

Executing Ansible: ansible-playbook -e packer_build_name="sig-flatcar-gen2"

UPDATE:

Everything is OK; the printing of the name was added in Packer 1.8.6 and is buggy, already fixed in 1.8.7: https://github.com/hashicorp/packer/issues/12281

I checked the whole provisioning and it is working as expected; all the Flatcar bits are properly run.

primeroz commented 1 year ago

Review of Hardening and other tuning

protect-kernel-defaults

Outcome: Enable
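A minimal sketch of how this could be enabled through CAPI, assuming the standard v1beta1 kubeletExtraArgs fields (not our actual config):

apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
spec:
  kubeadmConfigSpec:
    initConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          protect-kernel-defaults: "true"   # kubelet errors instead of changing kernel defaults
    joinConfiguration:
      nodeRegistration:
        kubeletExtraArgs:
          protect-kernel-defaults: "true"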

ARP Settings

net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2

There is no history or reference I could find on why we are setting those values; I will try to reach out to phoenix.

Outcome: TBD

local ports reserved

# Reserved to avoid conflicts with kube-apiserver, which allocates within this range
net.ipv4.ip_local_reserved_ports=30000-32767

Not sure what this conflict is, and I can't find any history for it; I will try to reach out to phoenix.

Outcome: TBD

maxmap

# Increased mmapfs because some applications, like ES, need higher limit to store data properly
vm.max_map_count = 262144

Self-explanatory.

Outcome: Add to worker node pools

ipv6

net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.default.accept_redirects = 0

Since we do not disable IPv6 (CAPI sets net.ipv6.conf.all.disable_ipv6 to 0), we should set those.

Outcome: add, unless we want to disable IPv6?

ipv4

net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.default.log_martians = 1
net.ipv4.tcp_timestamps = 0

They are all reasonable.

Outcome: add

inotify

fs.inotify.max_user_watches = 16384
# Default is 128, doubling for nodes with many pods
# See https://github.com/giantswarm/giantswarm/issues/7711
fs.inotify.max_user_instances = 8192

reasonable

Outcome: add

kernel settings

kernel.kptr_restrict = 2
kernel.sysrq = 0

They both seem reasonable to me.

Outcome: add
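Pulling together everything marked "add" above (leaving out the TBD items; protect-kernel-defaults is a kubelet flag, covered separately above), a sketch of the resulting sysctl drop-in; the file path is hypothetical:

# /etc/sysctl.d/90-hardening.conf (hypothetical path)
vm.max_map_count = 262144                     # worker node pools only
net.ipv6.conf.all.accept_redirects = 0
net.ipv6.conf.default.accept_redirects = 0
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.default.log_martians = 1
net.ipv4.tcp_timestamps = 0
fs.inotify.max_user_watches = 16384
fs.inotify.max_user_instances = 8192
kernel.kptr_restrict = 2
kernel.sysrq = 0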

primeroz commented 1 year ago

Comparing containerd config.toml

primeroz commented 1 year ago

Reservations

in vintage we do

on master nodes

kubeReserved:
  cpu: 350m
  memory: 1280Mi
  ephemeral-storage: 1024Mi
kubeReservedCgroup: /kubereserved.slice
protectKernelDefaults: true
systemReserved:
  cpu: 250m
  memory: 384Mi
systemReservedCgroup: /system.slice

on worker nodes

kubeReserved:
  cpu: 250m
  memory: 768Mi
  ephemeral-storage: 1024Mi
kubeReservedCgroup: /kubereserved.slice
protectKernelDefaults: true
systemReserved:
  cpu: 250m
  memory: 384Mi
systemReservedCgroup: /system.slice

on CAPZ we

I will

primeroz commented 1 year ago

Upgrading from Ubuntu 0.13 to Flatcar currently fails with:

 reason: 'Upgrade "fctest2" failed: cannot patch "fctest2" with kind KubeadmControlPlane:
    admission webhook "validation.kubeadmcontrolplane.controlplane.cluster.x-k8s.io"
    denied the request: KubeadmControlPlane.controlplane.cluster.x-k8s.io "fctest2"
    is invalid: [spec.kubeadmConfigSpec.format: Forbidden: cannot be modified, spec.kubeadmConfigSpec.mounts:
    Forbidden: cannot be modified]'
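For context, these are the fields the webhook refuses to change on an existing KubeadmControlPlane; a sketch of where they live in the spec (not our full config), with Flatcar needing Ignition instead of cloud-init:

spec:
  kubeadmConfigSpec:
    format: ignition   # Forbidden: cannot be modified on an existing KCP
    mounts: []         # Forbidden: cannot be modified on an existing KCP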
primeroz commented 1 year ago

Changing the control-plane name and object does not seem to work.

During rollout it gets stuck with:

org-multi-project  ├─KubeadmControlPlane/fctest2                                False  Deleting                                             45m  
org-multi-project  │ ├─Machine/fctest2-95sxz                                    True                                                        41m  
org-multi-project  │ │ ├─AzureMachine/fctest2-control-plane-c17c01d5-gd6m4      True                                                        41m  
org-multi-project  │ │ └─KubeadmConfig/fctest2-7ps9h                            True                                                        41m  
org-multi-project  │ │   └─Secret/fctest2-7ps9h                                 -                                                           40m  
org-multi-project  │ ├─Machine/fctest2-d8xxd                                    True                                                        44m  
org-multi-project  │ │ ├─AzureMachine/fctest2-control-plane-c17c01d5-qxwn5      True                                                        44m  
org-multi-project  │ │ └─KubeadmConfig/fctest2-zz6vq                            True                                                        44m  
org-multi-project  │ │   └─Secret/fctest2-zz6vq                                 -                                                           44m  
org-multi-project  │ ├─Machine/fctest2-hlh7h                                    True                                                        38m  
org-multi-project  │ │ ├─AzureMachine/fctest2-control-plane-c17c01d5-brfwp      True                                                        38m  
org-multi-project  │ │ └─KubeadmConfig/fctest2-8dzwh                            True                                                        38m  
org-multi-project  │ │   └─Secret/fctest2-8dzwh                                 -                                                           38m  
org-multi-project  │ └─Secret/fctest2-kubeconfig                                -                                                           44m  
org-multi-project  ├─KubeadmControlPlane/fctest2-changed                        False  ScalingUp                                            8m10s
org-multi-project  │ ├─Secret/fctest2-ca                                        -                                                           44m  
org-multi-project  │ ├─Secret/fctest2-etcd                                      -                                                           44m  
org-multi-project  │ ├─Secret/fctest2-proxy                                     -                                                           44m  
org-multi-project  │ └─Secret/fctest2-sa                                        -                                                           44m  
Cluster/fctest2                                                                  False  Warning   ScalingUp  8m25s  Scaling up control plane to 3 replicas (actual 0)                         
├─ClusterInfrastructure - AzureCluster/fctest2                                   True                        44m                                                               
├─ControlPlane - KubeadmControlPlane/fctest2-changed                             False  Warning   ScalingUp  8m25s  Scaling up control plane to 3 replicas (actual 0)                      
│ ├─Machine/fctest2-95sxz                                                        True                        39m                                          
│ │ ├─BootstrapConfig - KubeadmConfig/fctest2-7ps9h                              True                        41m                         
│ │ └─MachineInfrastructure - AzureMachine/fctest2-control-plane-c17c01d5-gd6m4  True                        39m                                                             
│ ├─Machine/fctest2-d8xxd                                                        True                        42m                         
│ │ ├─BootstrapConfig - KubeadmConfig/fctest2-zz6vq                              True                        44m                         
│ │ └─MachineInfrastructure - AzureMachine/fctest2-control-plane-c17c01d5-qxwn5  True                        42m                                                                                   
│ └─Machine/fctest2-hlh7h                                                        True                        37m                                                                                                   
│   ├─BootstrapConfig - KubeadmConfig/fctest2-8dzwh                              True                        39m                                                         
│   └─MachineInfrastructure - AzureMachine/fctest2-control-plane-c17c01d5-brfwp  True                        37m       

I will reach out upstream to see what they think, since most fields can be modified and I can't see why those 2 cannot ( https://github.com/kubernetes-sigs/cluster-api/blob/main/controlplane/kubeadm/api/v1beta1/kubeadm_control_plane_webhook.go#L137 ), but right now we can't update the CP from Ubuntu to Flatcar.

primeroz commented 1 year ago

glippy is now converted to Flatcar:

➜ k get node -o wide
NAME                                  STATUS   ROLES           AGE     VERSION    INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                                             KERNEL-VERSION    CONTAINER-RUNTIME
glippy-control-plane-aae7f116-jqtcd   Ready    control-plane   23m     v1.24.11   10.223.0.132   <none>        Flatcar Container Linux by Kinvolk 3374.2.4 (Oklo)   5.15.89-flatcar   containerd://1.6.18
glippy-control-plane-aae7f116-vpk8s   Ready    control-plane   30m     v1.24.11   10.223.0.137   <none>        Flatcar Container Linux by Kinvolk 3374.2.4 (Oklo)   5.15.89-flatcar   containerd://1.6.18
glippy-control-plane-aae7f116-wclks   Ready    control-plane   16m     v1.24.11   10.223.0.133   <none>        Flatcar Container Linux by Kinvolk 3374.2.4 (Oklo)   5.15.89-flatcar   containerd://1.6.18
glippy-md00-e6ebd75a-9br9p            Ready    <none>          21m     v1.24.11   10.223.0.4     <none>        Flatcar Container Linux by Kinvolk 3374.2.4 (Oklo)   5.15.89-flatcar   containerd://1.6.18
glippy-md00-e6ebd75a-fvjtj            Ready    <none>          31m     v1.24.11   10.223.0.10    <none>        Flatcar Container Linux by Kinvolk 3374.2.4 (Oklo)   5.15.89-flatcar   containerd://1.6.18
glippy-md00-e6ebd75a-lt6zc            Ready    <none>          15m     v1.24.11   10.223.0.7     <none>        Flatcar Container Linux by Kinvolk 3374.2.4 (Oklo)   5.15.89-flatcar   containerd://1.6.18
glippy-md00-e6ebd75a-q28jz            Ready    <none>          4m37s   v1.24.11   10.223.0.8     <none>        Flatcar Container Linux by Kinvolk 3374.2.4 (Oklo)   5.15.89-flatcar   containerd://1.6.18
glippy-md00-e6ebd75a-vbrzq            Ready    <none>          25m     v1.24.11   10.223.0.9     <none>        Flatcar Container Linux by Kinvolk 3374.2.4 (Oklo)   5.15.89-flatcar   containerd://1.6.18
glippy-md00-e6ebd75a-xnnq7            Ready    <none>          9m22s   v1.24.11   10.223.0.6     <none>        Flatcar Container Linux by Kinvolk 3374.2.4 (Oklo)   5.15.89-flatcar   containerd://1.6.18
primeroz commented 1 year ago

This is now done.