Closed: nashford77 closed this issue 2 weeks ago
Q: How are you meant to bootstrap the worker nodes? Is there an example? I'm guessing kubelet args are missing...?
/kind support
Not sure I fully understand the question here.
Are you saying the nodes you created as workers don't have the external IP (which comes from the floating IP)?
Yes, the root issue was that the worker nodes were not bootstrapped correctly with "external" for the cloud-provider key; it was a cloud-init issue on my side. All sorted now.
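For anyone who lands here with the same symptom, a minimal sketch of the worker-side step, assuming a kubeadm-provisioned Ubuntu node whose kubelet unit reads /etc/default/kubelet (the exact cloud-init wiring will differ per setup):

# Run on each worker (e.g. from a cloud-init runcmd) so the kubelet registers itself
# with the node.cloudprovider.kubernetes.io/uninitialized taint for OCCM to pick up.
echo 'KUBELET_EXTRA_ARGS=--cloud-provider=external' | sudo tee /etc/default/kubelet
sudo systemctl restart kubelet

# Verify the flag actually reached the running kubelet
ps -ef | grep '[k]ubelet' | tr ' ' '\n' | grep cloud-provider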
OK, please close this if it's all done, thanks.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".
root@5net-k8s-master-0:~# kubectl get nodes -A -o wide
NAME                STATUS   ROLES                  AGE   VERSION   INTERNAL-IP   EXTERNAL-IP    OS-IMAGE             KERNEL-VERSION       CONTAINER-RUNTIME
5net-k8s-master-0   Ready    control-plane,master   23h   v1.30.1   10.5.1.36     192.168.5.75   Ubuntu 22.04.4 LTS   5.15.0-112-generic   docker://26.1.4
5net-k8s-node-0     Ready    worker                 23h   v1.30.1   10.5.1.121    <none>         Ubuntu 22.04.4 LTS   5.15.0-112-generic   docker://26.1.4
5net-k8s-node-1     Ready    worker                 23h   v1.30.1   10.5.1.55     <none>         Ubuntu 22.04.4 LTS   5.15.0-112-generic   docker://26.1.4
5net-k8s-node-2     Ready    worker                 23h   v1.30.1   10.5.1.45     <none>         Ubuntu 22.04.4 LTS   5.15.0-112-generic   docker://26.1.4
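If it helps with debugging, the same missing EXTERNAL-IP shows up on the node objects themselves; a quick check (node name taken from the output above):

# Lists the address entries on one worker; on a node initialized by OCCM this should
# include an ExternalIP entry alongside the InternalIP.
kubectl get node 5net-k8s-node-0 -o jsonpath='{range .status.addresses[*]}{.type}{"\t"}{.address}{"\n"}{end}'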
I saw this earlier:
root@5net-k8s-master-0:~# kubectl logs -n kube-system -l k8s-app=openstack-cloud-controller-manager
I0608 09:19:59.531119 10 controllermanager.go:319] Starting "service-lb-controller"
I0608 09:19:59.531235 10 node_lifecycle_controller.go:113] Sending events to api server
I0608 09:19:59.531576 10 openstack.go:385] Claiming to support LoadBalancer
I0608 09:19:59.531722 10 controllermanager.go:338] Started "service-lb-controller"
I0608 09:19:59.531863 10 controller.go:231] Starting service controller
I0608 09:19:59.531964 10 shared_informer.go:313] Waiting for caches to sync for service
I0608 09:19:59.631182 10 node_controller.go:425] Initializing node 5net-k8s-master-0 with cloud provider
I0608 09:19:59.632722 10 shared_informer.go:320] Caches are synced for service
I0608 09:20:00.346484 10 node_controller.go:492] Successfully initialized node 5net-k8s-master-0 with cloud provider
I0608 09:20:00.346746 10 event.go:389] "Event occurred" object="5net-k8s-master-0" fieldPath="" kind="Node" apiVersion="v1" type="Normal" reason="Synced" message="Node synced successfully"
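Only the master ever shows up in the node controller log above. One way to see whether the workers are even asking to be initialized is to check for the taint a kubelet adds when it is started with --cloud-provider=external (just a sketch):

# Workers waiting on the cloud provider should carry node.cloudprovider.kubernetes.io/uninitialized;
# workers whose kubelet never ran with --cloud-provider=external won't have it at all.
kubectl get nodes -o custom-columns='NAME:.metadata.name,TAINTS:.spec.taints[*].key'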
I restarted it, thinking that might register the other nodes. No go...
root@5net-k8s-master-0:~# kubectl logs -f -n kube-system -l k8s-app=openstack-cloud-controller-manager
I0609 08:52:08.275473 10 requestheader_controller.go:169] Starting RequestHeaderAuthRequestController
I0609 08:52:08.275485 10 shared_informer.go:313] Waiting for caches to sync for RequestHeaderAuthRequestController
I0609 08:52:08.275567 10 tlsconfig.go:240] "Starting DynamicServingCertificateController"
I0609 08:52:08.275664 10 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::client-ca-file"
I0609 08:52:08.275676 10 shared_informer.go:313] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0609 08:52:08.275684 10 configmap_cafile_content.go:202] "Starting controller" name="client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file"
I0609 08:52:08.275908 10 shared_informer.go:313] Waiting for caches to sync for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0609 08:52:08.375986 10 shared_informer.go:320] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::requestheader-client-ca-file
I0609 08:52:08.376143 10 shared_informer.go:320] Caches are synced for client-ca::kube-system::extension-apiserver-authentication::client-ca-file
I0609 08:52:08.375993 10 shared_informer.go:320] Caches are synced for RequestHeaderAuthRequestController
I0609 08:52:23.547603 10 leaderelection.go:260] successfully acquired lease kube-system/cloud-controller-manager
I0609 08:52:23.550525 10 event.go:389] "Event occurred" object="kube-system/cloud-controller-manager" fieldPath="" kind="Lease" apiVersion="coordination.k8s.io/v1" type="Normal" reason="LeaderElection" message="5net-k8s-master-0_1c08edda-634c-40aa-b580-12852f2a4bc5 became leader"
I0609 08:52:23.554762 10 openstack.go:504] Setting up informers for Cloud
I0609 08:52:23.555992 10 controllermanager.go:319] Starting "cloud-node-lifecycle-controller"
I0609 08:52:23.558714 10 controllermanager.go:338] Started "cloud-node-lifecycle-controller"
I0609 08:52:23.559834 10 controllermanager.go:319] Starting "service-lb-controller"
I0609 08:52:23.561899 10 openstack.go:385] Claiming to support LoadBalancer
I0609 08:52:23.562019 10 controllermanager.go:338] Started "service-lb-controller"
I0609 08:52:23.562063 10 controllermanager.go:319] Starting "node-route-controller"
I0609 08:52:23.564807 10 node_lifecycle_controller.go:113] Sending events to api server
I0609 08:52:23.565221 10 controller.go:231] Starting service controller
I0609 08:52:23.565276 10 shared_informer.go:313] Waiting for caches to sync for service
W0609 08:52:23.649048 10 openstack.go:488] Error initialising Routes support: router-id not set in cloud provider config
W0609 08:52:23.649189 10 core.go:111] --configure-cloud-routes is set, but cloud provider does not support routes. Will not configure cloud provider routes.
W0609 08:52:23.649196 10 controllermanager.go:326] Skipping "node-route-controller"
I0609 08:52:23.649203 10 controllermanager.go:319] Starting "cloud-node-controller"
I0609 08:52:23.651145 10 controllermanager.go:338] Started "cloud-node-controller"
I0609 08:52:23.651467 10 node_controller.go:164] Sending events to api server.
I0609 08:52:23.652264 10 node_controller.go:173] Waiting for informer caches to sync
I0609 08:52:23.665533 10 shared_informer.go:320] Caches are synced for service
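The router-id warning only affects the node-route-controller, which the log shows being skipped. To see which nodes the cloud-node-controller has actually finished initializing, the providerID it sets is a useful signal (sketch):

# providerID is filled in by the cloud-node-controller on successful initialization;
# here it would likely show something like openstack:///<server-uuid> for the master only.
kubectl get nodes -o custom-columns='NAME:.metadata.name,PROVIDER-ID:.spec.providerID'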
The older version registered this on all nodes (I still have one cluster up with it):
(kolla-2023.2) root@slurm-primary-controller:~/ansible/5Net/k8s-bootstrap# kubectl get nodes -A -o wide
NAME                             STATUS   ROLES    AGE   VERSION   INTERNAL-IP   EXTERNAL-IP    OS-IMAGE                        KERNEL-VERSION          CONTAINER-RUNTIME
k8s-5net-ljtcgza6zsbt-master-0   Ready    master   30d   v1.23.3   10.5.1.203    192.168.5.72   Fedora CoreOS 38.20230806.3.0   6.4.7-200.fc38.x86_64   docker://20.10.23
k8s-5net-ljtcgza6zsbt-node-0     Ready    worker   30d   v1.23.3   10.5.1.77     192.168.5.67   Fedora CoreOS 38.20230806.3.0   6.4.7-200.fc38.x86_64   docker://20.10.23
k8s-5net-ljtcgza6zsbt-node-1     Ready    worker   30d   v1.23.3   10.5.1.174    192.168.5.45   Fedora CoreOS 38.20230806.3.0   6.4.7-200.fc38.x86_64   docker://20.10.23
k8s-5net-ljtcgza6zsbt-node-2     Ready    worker   30d   v1.23.3   10.5.1.240    192.168.5.87   Fedora CoreOS 38.20230806.3.0   6.4.7-200.fc38.x86_64   docker://20.10.23
What's missing / different in the new version?
Adding all the diagnostic info I can think of to help.
root@5net-k8s-master-0:~# kubectl get pods -A
NAMESPACE      NAME                                        READY   STATUS    RESTARTS      AGE
default        diagnostic-pod                              2/2     Running   0             78m
kube-flannel   kube-flannel-ds-2mlph                       1/1     Running   0             23h
kube-flannel   kube-flannel-ds-5v6w6                       1/1     Running   0             23h
kube-flannel   kube-flannel-ds-crlks                       1/1     Running   0             23h
kube-flannel   kube-flannel-ds-pg8fb                       1/1     Running   1 (24h ago)   24h
kube-system    coredns-5cf4f94ffd-4px6h                    1/1     Running   0             31m
kube-system    coredns-5cf4f94ffd-j9phr                    1/1     Running   0             31m
kube-system    dnsutils                                    1/1     Running   1 (12m ago)   72m
kube-system    etcd-5net-k8s-master-0                      1/1     Running   1 (24h ago)   24h
kube-system    kube-apiserver-5net-k8s-master-0            1/1     Running   1 (23h ago)   24h
kube-system    kube-controller-manager-5net-k8s-master-0   1/1     Running   1 (24h ago)   24h
kube-system    kube-proxy-5b279                            1/1     Running   0             23h
kube-system    kube-proxy-5l2cc                            1/1     Running   1 (24h ago)   24h
kube-system    kube-proxy-bdz4v                            1/1     Running   0             23h
kube-system    kube-proxy-lfrsz                            1/1     Running   0             23h
kube-system    kube-scheduler-5net-k8s-master-0            1/1     Running   1 (24h ago)   24h
kube-system    openstack-cloud-controller-manager-crnlz    1/1     Running   0             7m15s
root@5net-k8s-master-0:~# kubectl get ds openstack-cloud-controller-manager -n kube-system
NAME                                  DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR                             AGE
openstack-cloud-controller-manager   1         1         1       1            1           node-role.kubernetes.io/control-plane=   23h
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: apps/v1
kind: DaemonSet
metadata:
  annotations:
    deprecated.daemonset.template.generation: "4"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps/v1","kind":"DaemonSet","metadata":{"annotations":{},"labels":{"k8s-app":"openstack-cloud-controller-manager"},"name":"openstack-cloud-controller-manager","namespace":"kube-system"},"spec":{"selector":{"matchLabels":{"k8s-app":"openstack-cloud-controller-manager"}},"template":{"metadata":{"labels":{"k8s-app":"openstack-cloud-controller-manager"}},"spec":{"containers":[{"args":["/bin/openstack-cloud-controller-manager","--v=1","--cluster-name=$(CLUSTER_NAME)","--cloud-config=$(CLOUD_CONFIG)","--cloud-provider=openstack","--use-service-account-credentials=false","--bind-address=127.0.0.1"],"env":[{"name":"CLOUD_CONFIG","value":"/etc/config/cloud.conf"},{"name":"CLUSTER_NAME","value":"kubernetes"}],"image":"registry.k8s.io/provider-os/openstack-cloud-controller-manager:v1.30.0","name":"openstack-cloud-controller-manager","resources":{"requests":{"cpu":"200m"}},"volumeMounts":[{"mountPath":"/etc/kubernetes/pki","name":"k8s-certs","readOnly":true},{"mountPath":"/etc/ssl/certs","name":"ca-certs","readOnly":true},{"mountPath":"/etc/config","name":"cloud-config-volume","readOnly":true}]}],"dnsPolicy":"ClusterFirstWithHostNet","hostNetwork":true,"nodeSelector":{"node-role.kubernetes.io/control-plane":""},"securityContext":{"runAsUser":1001},"serviceAccountName":"cloud-controller-manager","tolerations":[{"effect":"NoSchedule","key":"node.cloudprovider.kubernetes.io/uninitialized","value":"true"},{"effect":"NoSchedule","key":"node-role.kubernetes.io/master"},{"effect":"NoSchedule","key":"node-role.kubernetes.io/control-plane"}],"volumes":[{"hostPath":{"path":"/etc/kubernetes/pki","type":"DirectoryOrCreate"},"name":"k8s-certs"},{"hostPath":{"path":"/etc/ssl/certs","type":"DirectoryOrCreate"},"name":"ca-certs"},{"name":"cloud-config-volume","secret":{"secretName":"cloud-config"}}]}},"updateStrategy":{"type":"RollingUpdate"}}}
  creationTimestamp: "2024-06-08T08:58:54Z"
  generation: 4
  labels:
    k8s-app: openstack-cloud-controller-manager
  name: openstack-cloud-controller-manager
  namespace: kube-system
  resourceVersion: "182284"
  uid: 8006e824-3ea2-44b4-8a0b-335777a86009
spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: openstack-cloud-controller-manager
  template:
    metadata:
      annotations:
        kubectl.kubernetes.io/restartedAt: "2024-06-09T08:52:05Z"
      creationTimestamp: null
      labels:
        k8s-app: openstack-cloud-controller-manager
    spec:
      containers:
Are the tolerations the issue? It should only run on the control-plane nodes, but it should still pull the info for the worker nodes' external IPs?!
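The tolerations and node selector above only decide where OCCM itself runs; whether a worker gets initialized depends on its own kubelet flags (which is what the cloud-init fix above ended up addressing). A sketch for checking that directly on a worker; the ssh user is a placeholder and the IP is node-0's internal address from the output above:

# Placeholder user: confirm the worker kubelet is actually running with --cloud-provider=external
ssh ubuntu@10.5.1.121 "ps -ef | grep '[k]ubelet' | tr ' ' '\n' | grep -- --cloud-provider"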