tinkerbell / roadmap

Official Tinkerbell Roadmap

Introduce `WorkflowSet` and `HardwareRuleSet` CRDs #40

Open jacobweinstock opened 1 month ago

jacobweinstock commented 1 month ago

Currently, Workflows have to be created with a 1:1 mapping between Hardware and Workflow, and creating them is left entirely up to the user. This has been the case since the beginning, and for large deployments it can be challenging. I propose we build on top of the existing Workflow object and give the Stack the capability to do a 1:many mapping between Hardware and Workflow. This opens up many new possibilities, including integration with auto capabilities.

The idea is that a user defines a WorkflowSet object and the Tink controller (or something else) uses it to create one or more Workflow objects. This significantly improves the user experience around large batch creation of Workflows.

Some of the technical details aren't fully formed yet. You'll see that in the comments below. I will update this issue as the details become more fully formed.

New CRDs

WorkflowSet

For each Hardware object, create a Workflow object if one does not already exist. (Open question: does "already exists" mean an exact match, or only that a Workflow with the same Hardware ref exists?) Use the pause annotation to pause creating Workflow objects.

Tink worker matching: the Hardware object must provide a unique identifier. The namespace/name of the Hardware object is unique but might not be usable as the Tink worker ID. It could be the "first" MAC address, or there could be a field in the Hardware object that defines the unique identifier. This identifier needs to be coordinated with the Tink worker and Smee (Smee sets the ID in kernel parameters).

---
apiVersion: tinkerbell.org/v1alpha1
kind: WorkflowSet
metadata:
  annotations:
    tinkerbell.org/pause: "false"
  name: set1
  namespace: tink
spec:
  HardwareRuleSetRefs:
    - name: ruleset1
      namespace: tink
  TemplateRef:
    name: template1
    namespace: tink
  MaxWorkflows: 5
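
To make the intended controller behavior easier to discuss, here is a rough Python sketch of one reconcile pass. It is not Tink code and rests on assumptions the issue leaves open: `reconcile`, the `WorkflowSet` dataclass, and the generated Workflow names are hypothetical, "existing" is decided by Hardware ref only, and `MaxWorkflows` is read as a cap on the total number of Workflows owned by the set.

```python
from dataclasses import dataclass, field

PAUSE_ANNOTATION = "tinkerbell.org/pause"


@dataclass
class WorkflowSet:
    """Hypothetical in-memory stand-in for the proposed WorkflowSet object."""
    name: str
    template_ref: str
    max_workflows: int
    annotations: dict = field(default_factory=dict)


def reconcile(wfset, matched_hardware, existing):
    """Return the Workflows to create on this pass.

    matched_hardware: Hardware names selected by the referenced rule sets.
    existing: dict of Hardware name -> Workflow name already owned by the set.
    """
    # Assumption: a "true" pause annotation stops all creation.
    if wfset.annotations.get(PAUSE_ANNOTATION, "false").lower() == "true":
        return []

    # Assumption: MaxWorkflows caps the total owned by the set.
    budget = wfset.max_workflows - len(existing)
    to_create = []
    for hw in matched_hardware:
        if budget <= 0:
            break
        # Assumption: a Workflow "exists" if one already references this Hardware.
        if hw in existing:
            continue
        to_create.append({
            "name": f"{wfset.name}-{hw}",  # hypothetical naming scheme
            "hardwareRef": hw,
            "templateRef": wfset.template_ref,
        })
        budget -= 1
    return to_create
```

Calling `reconcile` again with the same inputs plus the Workflows it just returned yields an empty list, which is the idempotent behavior a controller loop would need.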

HardwareRuleSet

The result of matching Hardware against the ruleset is a list of Hardware objects.

---
apiVersion: tinkerbell.org/v1alpha1
kind: HardwareRuleSet
metadata:
  name: ruleset1
  namespace: tink
spec:
  operation: AND # OR
  rules:
    - label: kubernetes.io/arch
      value: amd64
      type: string # int, bool, float
      matchExpression: "=="
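
As a possible reading of how such a ruleset could be evaluated, here is a small Python sketch. It is illustrative only: the issue shows just `==`, so the other operators, the cast table for `type`, and the names `Rule`, `rule_matches`, and `match_hardware` are assumptions, as is treating Hardware as a mapping of name to labels.

```python
import operator
from dataclasses import dataclass


def _to_bool(value):
    """Parse "true"/"false"; anything else is treated as a cast failure."""
    text = str(value).lower()
    if text in ("true", "false"):
        return text == "true"
    raise ValueError(f"not a bool: {value!r}")


# Assumed mapping from the proposal's `type` values to Python casts.
CASTS = {"string": str, "int": int, "float": float, "bool": _to_bool}

# Assumed operator set; only "==" appears in the proposal.
OPS = {"==": operator.eq, "!=": operator.ne, "<": operator.lt, ">": operator.gt}


@dataclass
class Rule:
    label: str
    value: str
    type: str = "string"
    match_expression: str = "=="


def rule_matches(rule, labels):
    """Evaluate one rule against a Hardware object's labels."""
    raw = labels.get(rule.label)
    if raw is None:
        return False  # a missing label never matches
    cast = CASTS[rule.type]
    try:
        return OPS[rule.match_expression](cast(raw), cast(rule.value))
    except ValueError:
        return False  # label value could not be cast to the declared type


def match_hardware(hardware, rules, operation="AND"):
    """Return the names of Hardware whose labels satisfy the rules."""
    combine = all if operation == "AND" else any
    return [
        name
        for name, labels in hardware.items()
        if combine(rule_matches(r, labels) for r in rules)
    ]
```

One design question this surfaces: with `AND` and an empty `rules` list, every Hardware object matches, so the CRD may want to require at least one rule.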

chrisdoherty4 commented 1 month ago

Whoop. Great to see this.

Use the pause annotation to pause creating workflow objects

What motivated the annotation ahead of a field?

HardwareRuleSet

What's the rationale for not embedding this as part of the WorkflowSet?

jacobweinstock commented 1 month ago

Hey @chrisdoherty4

What motivated the annotation ahead of a field?

It's just what I've seen from other controllers, some CAPI controllers for example. I actually haven't dug into the trade-offs around this much. Definitely open to other approaches, like a field. If you have any experience, preference, etc., please do share :)

What's the rationale for not embedding this as part of the WorkflowSet?

The idea is to make HardwareRuleSets reusable across WorkflowSets. For example, there could be a HardwareRuleSet for x86_64 machines, one for machines with a certain type of hardware, and one for machines in a specific datacenter, rack, etc. Multiple WorkflowSets can then reuse these to target machines in different ways. That was the idea, but I'm open to the alternative of embedding, as I know it would be one less CRD and less work on the backend.

jacobweinstock commented 1 month ago

Another thing to possibly add here would be the ability to specify some kind of anti-affinity rules. That way, if we want 5 machines and want each of them in its own failure domain, we could get that. For example, a rack or datacenter anti-affinity. I'll be thinking about how this might look and about adding it.
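
One simple way to picture anti-affinity is a greedy pass that takes at most one machine per failure domain. The Python sketch below is only a thought experiment, not a proposed API: `pick_spread` and the idea of reading the failure domain from a Hardware label are assumptions.

```python
def pick_spread(hardware, domain_label, count):
    """Greedily pick up to `count` machines, at most one per failure domain.

    hardware: dict of Hardware name -> labels (assumed representation).
    domain_label: label holding the failure domain, e.g. a rack label.
    """
    chosen = []
    used_domains = set()
    for name, labels in hardware.items():
        if len(chosen) >= count:
            break
        domain = labels.get(domain_label)
        # Skip machines with no domain label or whose domain is already used.
        if domain is None or domain in used_domains:
            continue
        chosen.append(name)
        used_domains.add(domain)
    return chosen
```

If fewer distinct domains exist than machines requested, this returns a shorter list. Whether the controller should then create fewer Workflows or none at all is a policy decision the API would need to spell out.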