tinkerbell / cluster-api-provider-tinkerbell

Cluster API Infrastructure Provider
Apache License 2.0

Possible race condition #271

Open jacobweinstock opened 1 year ago

jacobweinstock commented 1 year ago

Today I observed a possible race condition. The scenario: a machine is sitting in HookOS, with Tink worker running and connected to Tink server. The CAPT CRDs are created, which initiates provisioning of the first control plane node. The code flow reaches a point where it creates a template and a workflow, and only then creates the Rufio power jobs that power the machine off, set the next boot device, and power the machine on (reference). With that order of operations, in this scenario the workflow started before the machine powered off.

Expected Behaviour

The machine should be powered off before a workflow is created. Even better might be the ability to skip the reboot entirely when the machine is already up and running in HookOS with a Tink worker connected and ready to run a workflow.
