Closed guettli closed 1 week ago
Testing: Without constant-hostname it works. Related logs:
level: DEBUG
time: "2024-06-21T09:20:21.784Z"
file: host/state_machine.go:73
message: changing provisioning state
controller: hetznerbaremetalhost
controllerGroup: infrastructure.cluster.x-k8s.io
controllerKind: HetznerBareMetalHost
HetznerBareMetalHost:
name: test-bm-uefi
namespace: default
namespace: default
name: test-bm-uefi
reconcileID: 173abccd-a45c-4cc7-a663-18ceca646d2d
HetznerBareMetalHost:
name: test-bm-uefi
namespace: default
HetznerCluster:
name: caph-guettli
namespace: default
Cluster:
name: caph-guettli
namespace: default
HetznerBareMetalMachine:
name: caph-guettli-md-1-wvp4v-st54z
namespace: default
old: provisioning
new: ensure-provisioned
---
level: DEBUG
time: "2024-06-21T09:20:21.784Z"
logger: events
file: recorder/recorder.go:104
message: 'Phase provisioning, reboot via SSH: UserData of cloud-init was created'
type: Normal
object:
kind: HetznerBareMetalHost
namespace: default
name: test-bm-uefi
uid: dd2a648f-d375-4e93-8a50-90e54e781a6f
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
resourceVersion: "116625"
reason: RebootBMServerViaSSHProvisioningStateProvisioning
---
level: DEBUG
time: "2024-06-21T09:20:21.906Z"
file: host/host.go:156
message: conflict error. Retrying
controller: hetznerbaremetalhost
controllerGroup: infrastructure.cluster.x-k8s.io
controllerKind: HetznerBareMetalHost
HetznerBareMetalHost:
name: test-bm-uefi
namespace: default
namespace: default
name: test-bm-uefi
reconcileID: a0f653a1-24fb-4462-aba8-210e9749599a
HetznerBareMetalHost:
name: test-bm-uefi
namespace: default
HetznerCluster:
name: caph-guettli
namespace: default
Cluster:
name: caph-guettli
namespace: default
HetznerBareMetalMachine:
name: caph-guettli-md-1-wvp4v-st54z
namespace: default
err: 'Operation cannot be fulfilled on hetznerbaremetalhosts.infrastructure.cluster.x-k8s.io "test-bm-uefi": the object has been modified; please apply your changes to the latest version and try again'
---
level: DEBUG
time: "2024-06-21T09:22:27.918Z"
file: controllers/csr_controller.go:311
message: No constant baremetal hostname - regex does not match CSR username
controller: certificatesigningrequest
controllerGroup: certificates.k8s.io
controllerKind: CertificateSigningRequest
CertificateSigningRequest:
name: csr-8lvwd
namespace: ""
name: csr-8lvwd
reconcileID: a33a6a29-8d4a-477e-828c-3e9c6ec7450d
regex: ^bm-(\S*)-(\d+)$
csrUserName: system:node:bm-caph-guettli-md-1-wvp4v-st54z
---
level: DEBUG
time: "2024-06-21T09:22:27.934Z"
logger: events
file: recorder/recorder.go:104
message: approved csr
type: Normal
object:
kind: CertificateSigningRequest
name: csr-8lvwd
uid: e7d0b975-ea14-49d0-bdd0-7897f64a9fae
apiVersion: certificates.k8s.io/v1
resourceVersion: "202823"
reason: CSRApproved
---
level: DEBUG
time: "2024-06-21T09:22:34.244Z"
logger: events
file: recorder/recorder.go:104
message: server successfully provisioned
type: Normal
object:
kind: HetznerBareMetalHost
namespace: default
name: test-bm-uefi
uid: dd2a648f-d375-4e93-8a50-90e54e781a6f
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
resourceVersion: "116919"
reason: ServerProvisioned
---
level: DEBUG
time: "2024-06-21T09:22:34.819Z"
file: host/state_machine.go:73
message: changing provisioning state
controller: hetznerbaremetalhost
controllerGroup: infrastructure.cluster.x-k8s.io
controllerKind: HetznerBareMetalHost
HetznerBareMetalHost:
name: test-bm-uefi
namespace: default
namespace: default
name: test-bm-uefi
reconcileID: 96a88d5a-f57b-4a11-a4c6-0b603e46e98f
HetznerBareMetalHost:
name: test-bm-uefi
namespace: default
HetznerCluster:
name: caph-guettli
namespace: default
Cluster:
name: caph-guettli
namespace: default
HetznerBareMetalMachine:
name: caph-guettli-md-1-wvp4v-st54z
namespace: default
old: ensure-provisioned
new: provisioned
---
level: DEBUG
time: "2024-06-21T09:22:34.819Z"
logger: events
file: recorder/recorder.go:104
message: "/var/log/cloud-init-output.log: Cloud-init v. 24.1.3-0ubuntu1~22.04.5 [cut] ..."
type: Normal
object:
kind: HetznerBareMetalHost
namespace: default
name: test-bm-uefi
uid: dd2a648f-d375-4e93-8a50-90e54e781a6f
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
resourceVersion: "116919"
reason: CloudInitOutput
Here the logs with a machine with constant-hostname (I filtered the output a bit so that it has less lines):
the important part "ProviderID not set yet." is highlighted. It appears only once. So waiting 5s seems good.
level: DEBUG
time: "2024-06-21T10:03:30.288Z"
logger: events
message: 'will retry: . StdErr: (failed to dial ssh. Error message: dial tcp 37.27.97.19:22: connect: connection refused. DialErr: failed to dial ssh)'
object:
kind: HetznerBareMetalHost
name: test-bm-uefi
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
reason: SSHToRescueSystemFailed
---
level: DEBUG
time: "2024-06-21T10:03:30.293Z"
message: conflict error. Retrying
HetznerBareMetalHost:
name: test-bm-uefi
name: test-bm-uefi
HetznerBareMetalHost:
name: test-bm-uefi
HetznerCluster:
Cluster:
HetznerBareMetalMachine:
err: 'Operation cannot be fulfilled on hetznerbaremetalhosts.infrastructure.cluster.x-k8s.io "test-bm-uefi": the object has been modified; please apply your changes to the latest version and try again'
---
level: DEBUG
time: "2024-06-21T10:03:30.378Z"
message: conflict error. Retrying
HetznerBareMetalHost:
name: test-bm-uefi
name: test-bm-uefi
HetznerBareMetalHost:
name: test-bm-uefi
HetznerCluster:
Cluster:
HetznerBareMetalMachine:
err: 'Operation cannot be fulfilled on hetznerbaremetalhosts.infrastructure.cluster.x-k8s.io "test-bm-uefi": the object has been modified; please apply your changes to the latest version and try again'
---
level: DEBUG
time: "2024-06-21T10:04:51.952Z"
message: changing provisioning state
HetznerBareMetalHost:
name: test-bm-uefi
name: test-bm-uefi
HetznerBareMetalHost:
name: test-bm-uefi
HetznerCluster:
Cluster:
HetznerBareMetalMachine:
old: provisioning
new: ensure-provisioned
---
level: DEBUG
time: "2024-06-21T10:04:51.953Z"
logger: events
message: 'Phase provisioning, reboot via SSH: UserData of cloud-init was created'
object:
kind: HetznerBareMetalHost
name: test-bm-uefi
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
reason: RebootBMServerViaSSHProvisioningStateProvisioning
---
level: DEBUG
time: "2024-06-21T10:04:52.107Z"
message: conflict error. Retrying
HetznerBareMetalHost:
name: test-bm-uefi
name: test-bm-uefi
HetznerBareMetalHost:
name: test-bm-uefi
HetznerCluster:
Cluster:
HetznerBareMetalMachine:
err: 'Operation cannot be fulfilled on hetznerbaremetalhosts.infrastructure.cluster.x-k8s.io "test-bm-uefi": the object has been modified; please apply your changes to the latest version and try again'
---
*************** HERE ***********
level: INFO
time: "2024-06-21T10:06:56.046Z"
message: 'ProviderID not set yet. The hbmm seems to be in ''ensure-provision''. Retrying. ProviderID: "hcloud://bm-2317525" no HetznerBaremetalMachine by ProviderID found'
CertificateSigningRequest:
name: csr-6pxgn
name: csr-6pxgn
CertificateSigningRequest:
name: csr-6pxgn
---
level: DEBUG
time: "2024-06-21T10:06:57.856Z"
logger: events
message: server successfully provisioned
object:
kind: HetznerBareMetalHost
name: test-bm-uefi
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
reason: ServerProvisioned
---
level: DEBUG
time: "2024-06-21T10:06:58.522Z"
message: changing provisioning state
HetznerBareMetalHost:
name: test-bm-uefi
name: test-bm-uefi
HetznerBareMetalHost:
name: test-bm-uefi
HetznerCluster:
Cluster:
HetznerBareMetalMachine:
old: ensure-provisioned
new: provisioned
---
level: DEBUG
time: "2024-06-21T10:06:58.522Z"
logger: events
message: "/var/log/cloud-init-output.log: Cloud-init v. 24.1.3-0ubuntu1~22.04.5 .. .cut"
object:
kind: HetznerBareMetalHost
name: test-bm-uefi
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
reason: CloudInitOutput
---
level: INFO
time: "2024-06-21T10:07:01.047Z"
message: Found HetznerBareMetalMachine with constant hostname
CertificateSigningRequest:
name: csr-6pxgn
name: csr-6pxgn
csr-username: system:node:bm-caph-guettli-2317525
hetznerBareMetalMachine: caph-guettli-md-1-vn5zw-v7gv6
---
level: INFO
time: "2024-06-21T10:07:01.047Z"
message: found hbmm for system:node:bm-caph-guettli-2317525 (ConstantHostname)
CertificateSigningRequest:
name: csr-6pxgn
name: csr-6pxgn
---
level: DEBUG
time: "2024-06-21T10:07:01.069Z"
logger: events
message: approved csr
object:
kind: CertificateSigningRequest
name: csr-6pxgn
apiVersion: certificates.k8s.io/v1
reason: CSRApproved
What this PR does / why we need it:
In this PR https://github.com/syself/cluster-api-provider-hetzner/pull/1351 we set the ProviderID earlier, but this did not work.
The linked PR was done to avoid error messages in the CSR-controller.
This PR changes the error messages of the CSR-controller. If the CSR exists, but there is no hbmm with the corresponding ProviderID, then this is fine. We wait some seconds and then we try again.
The hbmm is likely in state "ensure-provisioned", and the providerID will be set soon. INFO level logging is fine in this case.