kubernetes / kubeadm

Aggregator for issues filed against kubeadm
Apache License 2.0

upgrade-1-28-latest and upgrade-addons-before-controlplane-1-28-latest failed #2927

Closed pacoxu closed 1 year ago

pacoxu commented 1 year ago

https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-upgrade-1-28-latest https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-upgrade-addons-before-controlplane-1-28-latest

keeps failing after https://github.com/kubernetes/release/pull/3254.

/assign

ref https://github.com/kubernetes/kubeadm/issues/2925

See https://github.com/kubernetes/kubeadm/issues/2927#issuecomment-1713870411 for the conclusion.

SataQiu commented 1 year ago

We can verify and cherry-pick patches to 1.28.

SataQiu commented 1 year ago

The latest diff result (https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-kubeadm-kinder-upgrade-1-28-latest/1701751770478284800/build-log.txt) shows that the default values are no longer injected, which is what we expect.

I0913 00:25:24.641339    3894 staticpods.go:225] Pod manifest files diff:
@@ -46 +45,0 @@
-      successThreshold: 1
@@ -62 +60,0 @@
-      successThreshold: 1
@@ -64,2 +61,0 @@
-    terminationMessagePath: /dev/termination-log
-    terminationMessagePolicy: File
@@ -71,2 +66,0 @@
-  dnsPolicy: ClusterFirst
-  enableServiceLinks: true
@@ -76,2 +69,0 @@
-  restartPolicy: Always
-  schedulerName: default-scheduler
@@ -81 +72,0 @@
-  terminationGracePeriodSeconds: 30

Now we need to fix the v1.28 branch, and then the CI will be green. Waiting for https://github.com/kubernetes/kubernetes/pull/120605 to be merged.
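To check quickly whether a given static Pod manifest carries the injected defaults, a line-based scan for the fields removed in the diff above is enough. This is only a minimal sketch: the key list is copied from that diff, the helper name `injectedDefaults` is hypothetical, and the scan is plain string matching rather than a real YAML parse.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultKeys lists the fields that appear only on the "-" side of the
// manifest diff quoted in this thread.
var defaultKeys = []string{
	"successThreshold",
	"terminationMessagePath",
	"terminationMessagePolicy",
	"dnsPolicy",
	"enableServiceLinks",
	"restartPolicy",
	"schedulerName",
	"terminationGracePeriodSeconds",
}

// injectedDefaults returns, in first-seen order, the default keys found in a
// manifest. It matches on trimmed line prefixes and does not parse YAML.
func injectedDefaults(manifest string) []string {
	var found []string
	seen := map[string]bool{}
	for _, line := range strings.Split(manifest, "\n") {
		trimmed := strings.TrimSpace(line)
		for _, key := range defaultKeys {
			if strings.HasPrefix(trimmed, key+":") && !seen[key] {
				seen[key] = true
				found = append(found, key)
			}
		}
	}
	return found
}

func main() {
	// Shortened, illustrative manifests; real ones would be read from
	// /etc/kubernetes/manifests/etcd.yaml.
	oldManifest := "spec:\n  dnsPolicy: ClusterFirst\n  hostNetwork: true\n  restartPolicy: Always\n"
	newManifest := "spec:\n  hostNetwork: true\n"

	fmt.Println("old:", injectedDefaults(oldManifest))
	fmt.Println("new:", injectedDefaults(newManifest))
}
```

An empty result for both the old and the new manifest means the two files should not differ because of defaulting alone.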

I think that after we remove the reference to k8s.io/kubernetes/pkg/apis/core/v1 from the v1.28 branch, everything will be back to normal, because the Pod defaulter is registered into the Scheme by: https://github.com/kubernetes/kubernetes/blob/160fe010f32fd1896917fecad680769ad0e40ca0/pkg/apis/core/v1/register.go#L29-L34

func init() {
        // We only register manually written functions here. The registration of the
        // generated functions takes place in the generated files. The separation
        // makes the code compile even when the generated files are missing.
        localSchemeBuilder.Register(addDefaultingFuncs, addConversionFuncs)
}

That's why the default values are injected...
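The mechanism described here, registration as a side effect of `init()`, can be shown in isolation. The sketch below is a simplified stand-in, not the real Kubernetes Scheme API: `PodSpec`, `defaulters` and `applyDefaults` are hypothetical names, and the two defaulted fields are taken from the diff earlier in this thread.

```go
package main

import "fmt"

// PodSpec is a tiny, hypothetical stand-in for the real Pod spec type.
type PodSpec struct {
	RestartPolicy string
	DNSPolicy     string
}

// defaulters plays the role of a scheme's registered defaulting functions.
var defaulters []func(*PodSpec)

// init runs automatically once this package is linked into the binary, so
// merely importing a package with such an init is enough to register defaults.
func init() {
	defaulters = append(defaulters, func(s *PodSpec) {
		if s.RestartPolicy == "" {
			s.RestartPolicy = "Always"
		}
		if s.DNSPolicy == "" {
			s.DNSPolicy = "ClusterFirst"
		}
	})
}

// applyDefaults runs every registered defaulter against the spec.
func applyDefaults(s *PodSpec) {
	for _, d := range defaulters {
		d(s)
	}
}

func main() {
	spec := PodSpec{} // no fields set by the caller
	applyDefaults(&spec)
	fmt.Printf("%+v\n", spec)
}
```

No code in `main` asks for defaulting to be registered; the fields are filled in only because the `init` function ran. Dropping the import that carries such an `init` removes the defaults in the same implicit way.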

neolit123 commented 1 year ago

https://github.com/kubernetes/kubernetes/pull/120605 is open. i will ping some folks on slack to try to merge it faster.

EDIT: https://kubernetes.slack.com/archives/CJH2GBF7Y/p1694575171227399

neolit123 commented 1 year ago

but if currently, the 1.28 kubeadm binary is producing a different manifest during upgrade due to the internal defaulters, that would mean the 1.27->1.28 upgrade must be failing as well? instead, it's currently green.

edit: etcd version is the same: https://github.com/kubernetes/kubernetes/blob/d8e9fb8b7f244536325100e332faefbae01cfd7b/cmd/kubeadm/app/constants/constants.go#L462

edit2: but an etcd upgrade is performed and is successful:

[upgrade/staticpods] Component "etcd" upgraded successfully!

https://storage.googleapis.com/kubernetes-jenkins/logs/ci-kubernetes-e2e-kubeadm-kinder-upgrade-1-27-1-28/1701746737351233536/build-log.txt

chendave commented 1 year ago

some testing on my side confirms this is really go version related. with golang 1.20 the defaults are always generated, so whether or not k8s.io/kubernetes/pkg/apis/core/v1 is imported, the diff is always empty because both manifests are generated with defaults. with golang 1.21 the defaults are not generated by default, and I suspect kind updated its golang version recently, which is why this issue is hit.

> shows that the default values are no longer injected, which is what we expect.

some testing in kinder shows that the new manifest does not have the defaults generated, but the old manifest for each pod does.

neolit123 commented 1 year ago

> but if currently, the 1.28 kubeadm binary is producing a different manifest during upgrade due to the internal defaulters, that would mean the 1.27->1.28 upgrade must be failing as well? instead, it's currently green.

just tested with kinder this workflow and there is a diff:

--- etcd.27.yaml    2023-09-13 18:03:01.839883253 +0300
+++ etcd.28.yaml    2023-09-13 17:55:56.203458105 +0300
@@ -32,7 +32,7 @@
     - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
     - --snapshot-count=10000
     - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
-    image: registry.k8s.io/etcd:3.5.7-0
+    image: registry.k8s.io/etcd:3.5.9-0
     imagePullPolicy: IfNotPresent
     livenessProbe:
       failureThreshold: 8

at the time of testing kubeadm 1.27 still has the 3.5.7 etcd: https://github.com/kubernetes/kubernetes/blob/release-1.27/cmd/kubeadm/app/constants/constants.go#L486

there is a pending backport for 3.5.9: https://github.com/kubernetes/kubernetes/pull/118079

but both the .27 and .28 etcd manifests do not have the defaults!

~/go/src/k8s.io/kubeadm/kinder$ cat etcd.28.yaml 
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://172.17.0.2:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://172.17.0.2:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --experimental-initial-corrupt-check=true
    - --experimental-watch-progress-notify-interval=5s
    - --initial-advertise-peer-urls=https://172.17.0.2:2380
    - --initial-cluster=kinder-upgrade-control-plane-1=https://172.17.0.2:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://172.17.0.2:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://172.17.0.2:2380
    - --name=kinder-upgrade-control-plane-1
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: registry.k8s.io/etcd:3.5.9-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health?exclude=NOSPACE&serializable=true
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health?serializable=false
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priority: 2000001000
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}

unclear to me if https://github.com/kubernetes/kubernetes/pull/120605 will fix anything.

neolit123 commented 1 year ago

> some testing on my side confirms this is really go version related. with golang 1.20 the defaults are always generated, so whether or not k8s.io/kubernetes/pkg/apis/core/v1 is imported, the diff is always empty because both manifests are generated with defaults. with golang 1.21 the defaults are not generated by default, and I suspect kind updated its golang version recently, which is why this issue is hit.
>
> > shows that the default values are no longer injected, which is what we expect.
>
> some testing in kinder shows that the new manifest does not have the defaults generated, but the old manifest for each pod does.

i saw go version diff as well at some point. go 1.21 was OK, go 1.20 generated defaults for the TestFunc in this ticket.

but kubeadm at the 1.28 branch is still built with 1.20:

$ docker exec kinder-regular-control-plane-1 kinder/upgrade/v1.28.1-59+d8e9fb8b7f2445/kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"28+", GitVersion:"v1.28.1-59+d8e9fb8b7f2445", GitCommit:"d8e9fb8b7f244536325100e332faefbae01cfd7b", GitTreeState:"clean", BuildDate:"2023-09-08T19:46:29Z", GoVersion:"go1.20.8", Compiler:"gc", Platform:"linux/amd64"}

and it's not generating defaults as i showed above. so somehow i think the bugs are still at k/k master for: https://testgrid.k8s.io/sig-cluster-lifecycle-kubeadm#kubeadm-kinder-upgrade-1-28-latest (i.e. latest)
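When comparing binaries as above, it helps to pull the build fields out of the `kubeadm version` output mechanically. The following is a minimal sketch that assumes the `Key:"value"` layout shown in the output quoted in this thread; the helper name `buildInfo` is hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
)

// fieldRe matches the GitVersion and GoVersion fields in the printed
// version.Info struct, e.g. GoVersion:"go1.20.8".
var fieldRe = regexp.MustCompile(`(GitVersion|GoVersion):"([^"]*)"`)

// buildInfo extracts GitVersion and GoVersion from `kubeadm version` output.
// A field that is missing from the input is returned as an empty string.
func buildInfo(output string) (gitVersion, goVersion string) {
	for _, m := range fieldRe.FindAllStringSubmatch(output, -1) {
		switch m[1] {
		case "GitVersion":
			gitVersion = m[2]
		case "GoVersion":
			goVersion = m[2]
		}
	}
	return gitVersion, goVersion
}

func main() {
	// Sample output copied from the comment above.
	out := `kubeadm version: &version.Info{Major:"1", Minor:"28+", GitVersion:"v1.28.1-59+d8e9fb8b7f2445", GitCommit:"d8e9fb8b7f244536325100e332faefbae01cfd7b", GitTreeState:"clean", BuildDate:"2023-09-08T19:46:29Z", GoVersion:"go1.20.8", Compiler:"gc", Platform:"linux/amd64"}`

	gitVersion, goVersion := buildInfo(out)
	fmt.Println("kubeadm:", gitVersion)
	fmt.Println("built with:", goVersion)
}
```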

pacoxu commented 1 year ago

https://github.com/kubernetes/release/issues/3076 is in progress.

SataQiu commented 1 year ago

The default values will only be injected when kubeadm init is performed. This is why the v1.27->1.28 upgrade works but v1.28 -> latest upgrade fails.

neolit123 commented 1 year ago

> The default values are injected only when kubeadm init is performed. This is why the v1.27->1.28 upgrade works but v1.28 -> latest upgrade fails.

hmm, how so? aren't both init and upgrade using the same "create manifest" logic?

pacoxu commented 1 year ago

> The default values will be injected when v1.28 kubeadm init is performed.

in Kinder? IIRC, the defaults are not injected when I init my cluster on CentOS directly.

SataQiu commented 1 year ago

> in Kinder? IIRC, the defaults are not injected when I init my cluster on CentOS directly.

You can try the old v1.28 version:

wget https://storage.googleapis.com/k8s-release-dev/ci/v1.28.2-1+a68748c7cd04f2/bin/linux/amd64/kubeadm
chmod +x kubeadm
kubeadm init ...

BTW, https://github.com/kubernetes/kubernetes/pull/120605 is merged.

pacoxu commented 1 year ago

https://testgrid.k8s.io/sig-release-master-informing#kubeadm-kinder-upgrade-addons-before-controlplane-1-28-latest is green now

pacoxu commented 1 year ago

/close

https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-kubeadm-kinder-upgrade-1-28-latest/1702296617098416128 failed for a recent flake

k8s-ci-robot commented 1 year ago

@pacoxu: Closing this issue.

In response to [this](https://github.com/kubernetes/kubeadm/issues/2927#issuecomment-1719430202):

>/close
>
>https://prow.k8s.io/view/gs/kubernetes-jenkins/logs/ci-kubernetes-e2e-kubeadm-kinder-upgrade-1-28-latest/1702296617098416128
>failed for a recent flake

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

neolit123 commented 1 year ago

i don't see defaults after @chendave 's cherry pick merged: https://github.com/kubernetes/kubernetes/commit/728862ded5e9d8fc3db1555499a66c5569ad8db6

it's 728862ded5e9d8, found in the kinder output below:

```
$ kinder build node-image-variant --base-image=kindest/base:v20221102-76f15095 --image=kindest/node:test --with-init-artifacts=v1.28.2-7+728862ded5e9d8 --loglevel=debug && kinder create cluster --name=kinder-regular --image=kindest/node:test --control-plane-nodes=1 --loglevel=debug && kinder do kubeadm-init --name=kinder-regular --loglevel=debug --kubeadm-verbosity=6
$ docker exec kinder-regular-control-plane-1 cat /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://172.17.0.2:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://172.17.0.2:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --experimental-initial-corrupt-check=true
    - --experimental-watch-progress-notify-interval=5s
    - --initial-advertise-peer-urls=https://172.17.0.2:2380
    - --initial-cluster=kinder-regular-control-plane-1=https://172.17.0.2:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://172.17.0.2:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://172.17.0.2:2380
    - --name=kinder-regular-control-plane-1
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: registry.k8s.io/etcd:3.5.9-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health?exclude=NOSPACE&serializable=true
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health?serializable=false
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priority: 2000001000
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}
$ docker exec kinder-regular-control-plane-1 kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"28+", GitVersion:"v1.28.2-7+728862ded5e9d8", GitCommit:"728862ded5e9d8fc3db1555499a66c5569ad8db6", GitTreeState:"clean", BuildDate:"2023-09-14T09:48:17Z", GoVersion:"go1.20.8", Compiler:"gc", Platform:"linux/amd64"}
```

> You can try the old v1.28 version:

but i did not see defaults with the old 1.28 binary either:

```
$ ./kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"28+", GitVersion:"v1.28.2-1+a68748c7cd04f2", GitCommit:"a68748c7cd04f2462352afb05ba31f06fc799595", GitTreeState:"clean", BuildDate:"2023-09-13T09:54:55Z", GoVersion:"go1.20.8", Compiler:"gc", Platform:"linux/amd64"}
$ sudo ./kubeadm init phase etcd local
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
$ sudo cat /etc/kubernetes/manifests/etcd.yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://10.0.2.15:2379
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://10.0.2.15:2379
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --experimental-initial-corrupt-check=true
    - --experimental-watch-progress-notify-interval=5s
    - --initial-advertise-peer-urls=https://10.0.2.15:2380
    - --initial-cluster=lubo-it=https://10.0.2.15:2380
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379,https://10.0.2.15:2379
    - --listen-metrics-urls=http://127.0.0.1:2381
    - --listen-peer-urls=https://10.0.2.15:2380
    - --name=lubo-it
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: registry.k8s.io/etcd:3.5.9-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health?exclude=NOSPACE&serializable=true
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health?serializable=false
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priority: 2000001000
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}
```

SataQiu commented 1 year ago

> but i did not see defaults with the old 1.28 binary either:

@neolit123 @pacoxu You can try to reproduce it by:

docker pull kindest/base:v20221102-76f15095

kinder build node-image-variant --base-image=kindest/base:v20221102-76f15095 --image=kindest/node:test --with-init-artifacts=v1.28.2-1+a68748c7cd04f2 --with-upgrade-artifacts=v1.29.0-alpha.0.802+a68093a3ffb552 --loglevel=debug

kinder create cluster --name=kinder-upgrade --image=kindest/node:test --control-plane-nodes=1 --worker-nodes=1 --loglevel=debug
kinder do kubeadm-init --name=kinder-upgrade --copy-certs=auto --loglevel=debug --kubeadm-verbosity=6
echo "---------------------------------------------"
echo "----Old v1.28 kubeadm generated etcd.yaml----"
echo "---------------------------------------------"
docker exec kinder-upgrade-control-plane-1  cat /etc/kubernetes/manifests/etcd.yaml
kinder delete cluster --name=kinder-upgrade

kinder build node-image-variant --base-image=kindest/base:v20221102-76f15095 --image=kindest/node:test --with-init-artifacts=v1.28.2-7+728862ded5e9d8 --with-upgrade-artifacts=v1.29.0-alpha.0.806+4abf29c5c86349 --loglevel=debug

kinder create cluster --name=kinder-upgrade --image=kindest/node:test --control-plane-nodes=1 --worker-nodes=1 --loglevel=debug
kinder do kubeadm-init --name=kinder-upgrade --copy-certs=auto --loglevel=debug --kubeadm-verbosity=6
echo "---------------------------------------------"
echo "----New v1.28 kubeadm generated etcd.yaml----"
echo "---------------------------------------------"
docker exec kinder-upgrade-control-plane-1  cat /etc/kubernetes/manifests/etcd.yaml
kinder delete cluster --name=kinder-upgrade           

The output is as follows:

...
---------------------------------------------
----Old v1.28 kubeadm generated etcd.yaml----
---------------------------------------------
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://172.17.0.3:2379/
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://172.17.0.3:2379/
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --experimental-initial-corrupt-check=true
    - --experimental-watch-progress-notify-interval=5s
    - --initial-advertise-peer-urls=https://172.17.0.3:2380/
    - --initial-cluster=kinder-upgrade-control-plane-1=https://172.17.0.3:2380/
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379/,https://172.17.0.3:2379/
    - --listen-metrics-urls=http://127.0.0.1:2381/
    - --listen-peer-urls=https://172.17.0.3:2380/
    - --name=kinder-upgrade-control-plane-1
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: registry.k8s.io/etcd:3.5.9-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health?exclude=NOSPACE&serializable=true
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health?serializable=false
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      successThreshold: 1
      timeoutSeconds: 15
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  priority: 2000001000
  priorityClassName: system-node-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  terminationGracePeriodSeconds: 30
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}

...

---------------------------------------------
----New v1.28 kubeadm generated etcd.yaml----
---------------------------------------------
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubeadm.kubernetes.io/etcd.advertise-client-urls: https://172.17.0.4:2379/
  creationTimestamp: null
  labels:
    component: etcd
    tier: control-plane
  name: etcd
  namespace: kube-system
spec:
  containers:
  - command:
    - etcd
    - --advertise-client-urls=https://172.17.0.4:2379/
    - --cert-file=/etc/kubernetes/pki/etcd/server.crt
    - --client-cert-auth=true
    - --data-dir=/var/lib/etcd
    - --experimental-initial-corrupt-check=true
    - --experimental-watch-progress-notify-interval=5s
    - --initial-advertise-peer-urls=https://172.17.0.4:2380/
    - --initial-cluster=kinder-upgrade-control-plane-1=https://172.17.0.4:2380/
    - --key-file=/etc/kubernetes/pki/etcd/server.key
    - --listen-client-urls=https://127.0.0.1:2379/,https://172.17.0.4:2379/
    - --listen-metrics-urls=http://127.0.0.1:2381/
    - --listen-peer-urls=https://172.17.0.4:2380/
    - --name=kinder-upgrade-control-plane-1
    - --peer-cert-file=/etc/kubernetes/pki/etcd/peer.crt
    - --peer-client-cert-auth=true
    - --peer-key-file=/etc/kubernetes/pki/etcd/peer.key
    - --peer-trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    - --snapshot-count=10000
    - --trusted-ca-file=/etc/kubernetes/pki/etcd/ca.crt
    image: registry.k8s.io/etcd:3.5.9-0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /health?exclude=NOSPACE&serializable=true
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    name: etcd
    resources:
      requests:
        cpu: 100m
        memory: 100Mi
    startupProbe:
      failureThreshold: 24
      httpGet:
        host: 127.0.0.1
        path: /health?serializable=false
        port: 2381
        scheme: HTTP
      initialDelaySeconds: 10
      periodSeconds: 10
      timeoutSeconds: 15
    volumeMounts:
    - mountPath: /var/lib/etcd
      name: etcd-data
    - mountPath: /etc/kubernetes/pki/etcd
      name: etcd-certs
  hostNetwork: true
  priority: 2000001000
  priorityClassName: system-node-critical
  securityContext:
    seccompProfile:
      type: RuntimeDefault
  volumes:
  - hostPath:
      path: /etc/kubernetes/pki/etcd
      type: DirectoryOrCreate
    name: etcd-certs
  - hostPath:
      path: /var/lib/etcd
      type: DirectoryOrCreate
    name: etcd-data
status: {}

SataQiu commented 1 year ago

Therefore, if a user initializes the cluster with old v1.28 kubeadm (<1.28.2), they may encounter problems when upgrading to v1.29. However, if we can bump the etcd version number in v1.29, this will not be a problem.

neolit123 commented 1 year ago

strange, in my test here: https://github.com/kubernetes/kubeadm/issues/2927#issuecomment-1719430589

i ran against the older v1.28.2-1+a68748c7cd04f2 and did not get defaults.

--with-init-artifacts=v1.28.2-1+a68748c7cd04f2

you are running the same "old" version but getting a different etcd.yaml.

SataQiu commented 1 year ago

@neolit123 Emm... I don't really understand.

./kubeadm init phase etcd local will NOT get defaults.

But ./kubeadm init --ignore-preflight-errors=Swap,SystemVerification,FileContent--proc-sys-net-bridge-bridge-nf-call-iptables --config=/kind/kubeadm.conf --v=6 --upload-certs will get defaults.

kubeadm version:

# ./kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"28+", GitVersion:"v1.28.2-1+a68748c7cd04f2", GitCommit:"a68748c7cd04f2462352afb05ba31f06fc799595", GitTreeState:"clean", BuildDate:"2023-09-13T09:54:55Z", GoVersion:"go1.20.8", Compiler:"gc", Platform:"linux/amd64"}
root@kinder-upgrade-control-plane-1:/# 

Perhaps related to the execution path?

neolit123 commented 1 year ago

no idea. init is technically calling the same phase.