Currently, CAPT tells Smee not to allow a machine to network boot after it has been provisioned. It does this by setting two values in the Hardware object: Hardware.Spec.Metadata.Instance.State = provisioned and Hardware.Spec.Metadata.State = in_use.
CAPT set these fields because Smee used them to gate network booting. Since v0.10.0, Smee no longer does so; it now gates network booting via Hardware.Spec.Interfaces[].Netboot.AllowPXE.
The effect is that when a provisioned machine reboots and its firmware is set up to network boot first, Smee serves it network boot packets, and the machine boots into HookOS and sits there indefinitely.
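To make the mismatch concrete, here is a minimal Go sketch of the two gating rules. The types are simplified, hypothetical stand-ins that only mirror the field paths named in this issue, not the real Tinkerbell API types. oldGateAllows and newGateAllows are illustrative names, not Smee's actual code, and the exact conditions (both state fields required, unset AllowPXE treated as false) are assumptions.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the Hardware fields named in this issue.
type Netboot struct {
	AllowPXE *bool
}

type Interface struct {
	Netboot *Netboot
}

type Instance struct {
	State string
}

type Metadata struct {
	State    string
	Instance *Instance
}

type HardwareSpec struct {
	Metadata   *Metadata
	Interfaces []Interface
}

// oldGateAllows models the pre-v0.10.0 rule as assumed here: netboot is
// refused once both state fields mark the machine as provisioned.
func oldGateAllows(s HardwareSpec) bool {
	m := s.Metadata
	if m == nil || m.Instance == nil {
		return true
	}
	return !(m.State == "in_use" && m.Instance.State == "provisioned")
}

// newGateAllows models the current rule: only the per-interface AllowPXE
// flag matters. Treating an unset flag as false is an assumption.
func newGateAllows(iface Interface) bool {
	return iface.Netboot != nil && iface.Netboot.AllowPXE != nil && *iface.Netboot.AllowPXE
}

func main() {
	allow := true
	// A machine as CAPT leaves it today: state fields set, AllowPXE untouched.
	spec := HardwareSpec{
		Metadata:   &Metadata{State: "in_use", Instance: &Instance{State: "provisioned"}},
		Interfaces: []Interface{{Netboot: &Netboot{AllowPXE: &allow}}},
	}
	fmt.Println("old gate allows netboot:", oldGateAllows(spec))
	fmt.Println("new gate allows netboot:", newGateAllows(spec.Interfaces[0]))
}
```

Under these assumptions the old rule refuses the machine while the new rule still serves it, which matches the behaviour described above.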
Expected Behaviour
A machine provisioned by CAPT should not network boot after a reboot (hardware configured to tell Smee not to netboot a machine).
Current Behaviour
Possible Solution
Update CAPT to set Hardware.Spec.Interfaces[].Netboot.AllowPXE = false after a machine is provisioned.
Steps to Reproduce (for bugs)
Provision a cluster with CAPT
Set a machine's firmware to network boot
Reboot the machine
See that HookOS is loaded and sits indefinitely
Context
Your Environment
Operating System and version (e.g. Linux, Windows, MacOS):
How are you running Tinkerbell? Using Vagrant & VirtualBox, Vagrant & Libvirt, on Packet using Terraform, or give details:
Link to your project or a code example to reproduce issue:
Relevant CAPT code:
https://github.com/tinkerbell/cluster-api-provider-tinkerbell/blob/d828a9e7b165b4a2b0e0975ebd67b1a9f2a83d8c/controllers/machine_reconcile_scope.go#L52
https://github.com/tinkerbell/cluster-api-provider-tinkerbell/blob/d828a9e7b165b4a2b0e0975ebd67b1a9f2a83d8c/controllers/machine_reconcile_scope.go#L53
https://github.com/tinkerbell/cluster-api-provider-tinkerbell/blob/d828a9e7b165b4a2b0e0975ebd67b1a9f2a83d8c/controllers/machine_reconcile_scope.go#L110
https://github.com/tinkerbell/cluster-api-provider-tinkerbell/blob/d828a9e7b165b4a2b0e0975ebd67b1a9f2a83d8c/controllers/machine_reconcile_scope.go#L187
https://github.com/tinkerbell/cluster-api-provider-tinkerbell/blob/d828a9e7b165b4a2b0e0975ebd67b1a9f2a83d8c/controllers/machine_reconcile_scope.go#L723
https://github.com/tinkerbell/cluster-api-provider-tinkerbell/blob/d828a9e7b165b4a2b0e0975ebd67b1a9f2a83d8c/controllers/machine_reconcile_scope.go#L724