crossplane-contrib / function-sequencer

Crossplane composition function to define sequencing rules delaying the creation of resources until other resources are ready.
Apache License 2.0
8 stars 3 forks source link

Multiple Match for sequence identifier #24

Open dukanto opened 3 weeks ago

dukanto commented 3 weeks ago

What problem are you facing?

Function Sequencer works great when all composition-resources name are known, since you know the names of the resources that will be created and, therefore, sort them in desired order. However, when building dynamic resources (i.e. in a loop in go-templating function), names are not static, and can't be added to sequencer, resulting in those resources not being taken into account in the function.

As a use case, imagine that you create an EKS cluster and N nodegroups (as many as the user desires). The logic order should be:

Since Nodegroups have to be created after Cluster is finished

How could this Function help solve your problem?

Some sort of "wildcard" could be added to the list, and treat all those resources as one entity in order to decide wether to continue or not. That is:

 - step: sequence-creation
    functionRef:
      name: function-sequencer
    input:
      apiVersion: sequencer.fn.crossplane.io/v1beta1
      kind: Input
      rules:
      - sequence:
        - cluster-role
        - eks-cluster
        - nodegroup-* # will treat all that begins with nodegroup-

So, if our composition creates 3 different nodegroups, function will wait for nodegroup-0, nodegroup-1 & nodegroup-2 to be all ready to continue to next step.

If this approach makes sense, I can provide a PR with an implementation for this :)

eric-carlsson commented 3 weeks ago

I have similar use cases and think the general idea is sound, but I also see some potential issues with the "wildcard approach". For example:

A (perhaps naive) approach would be to use the provided resource name as a standard regex pattern. That would allow expressing both simple and complex sequence relationships. The original example could then be expressed like:

 - step: sequence-creation
    functionRef:
      name: function-sequencer
    input:
      apiVersion: sequencer.fn.crossplane.io/v1beta1
      kind: Input
      rules:
      - sequence:
        - cluster-role
        - eks-cluster
        - "^nodegroup-\\d+$"

Which is all well defined regex. Alternatively we could accept an object instead of a string:

 - step: sequence-creation
    functionRef:
      name: function-sequencer
    input:
      apiVersion: sequencer.fn.crossplane.io/v1beta1
      kind: Input
      rules:
      - sequence:
        - cluster-role
        - eks-cluster
        - type: pattern
          pattern: "nodegroup-\\d+"

Which would let us maintain backwards compatibility. Such an approach could also let us specify an explicit strategy which would solve the lack of clarity mentioned in the last bullet point.

   - step: sequence-creation
      functionRef:
        name: function-sequencer
      input:
        apiVersion: sequencer.fn.crossplane.io/v1beta1
        kind: Input
        rules:
        - sequence:
          - type: pattern
            pattern: "^cluster-role-\\d+$"
          - type: pattern
            pattern: "^eks-cluster-\\d+$"
          - type: pattern
            pattern: "^eks-cluster-\\d+-nodegroup-\\d+$"
            strategy: matchAll # Explicitly require all resources matched earlier in the sequence to have been created. Alternatively this could be "matchOne" etc.

Bad example, but hopefully the idea is conveyed.

dukanto commented 3 weeks ago

Hi, @eric-carlsson You are right. I proposed the wildcard as a quick way to explain my point. I think that having a full regex approach will be far more clear.

  • Furthermore, how do we handle the presence of multiple wildcard characters? For example, consider a composition similar to the example above that creates multiple clusters. Do we allow something like this?
    - step: sequence-creation
      functionRef:
        name: function-sequencer
      input:
        apiVersion: sequencer.fn.crossplane.io/v1beta1
        kind: Input
        rules:
        - sequence:
          - cluster-role-*
          - eks-cluster-*
          - eks-cluster-*-nodegroup-*
  • What does it mean to have a sequence that contains multiple resources with wildcards? In the example above, does eks-cluster-* wait on all resources that match cluster-role-* or only one of them? Intuitively I would expect the earlier, but this should be explicit.

I think that every single pattern in sequence must wait on all resources that match on it. So every eks-role- should wait until all cluster-role- are ready.

Which is all well defined regex. Alternatively we could accept an object instead of a string:

 - step: sequence-creation
    functionRef:
      name: function-sequencer
    input:
      apiVersion: sequencer.fn.crossplane.io/v1beta1
      kind: Input
      rules:
      - sequence:
        - cluster-role
        - eks-cluster
        - type: pattern
          pattern: "nodegroup-\\d+"

Which would let us maintain backwards compatibility. Such an approach could also let us specify an explicit strategy which would solve the lack of clarity mentioned in the last bullet point.

   - step: sequence-creation
      functionRef:
        name: function-sequencer
      input:
        apiVersion: sequencer.fn.crossplane.io/v1beta1
        kind: Input
        rules:
        - sequence:
          - type: pattern
            pattern: "^cluster-role-\\d+$"
          - type: pattern
            pattern: "^eks-cluster-\\d+$"
          - type: pattern
            pattern: "^eks-cluster-\\d+-nodegroup-\\d+$"
            strategy: matchAll # Explicitly require all resources matched earlier in the sequence to have been created. Alternatively this could be "matchOne" etc.

Bad example, but hopefully the idea is conveyed.

In this approach I see a bit of a problem, I think that your approach is more extensible, of course, but it'd see it as something more generic like:

   - step: sequence-creation
      functionRef:
        name: function-sequencer
      input:
        apiVersion: sequencer.fn.crossplane.io/v1beta1
        kind: Input
        rules:
        - sequence:
          - type: pattern
             key: "^cluster-role-\\d+$"
          - type: exactMatch    # this would be default and can be omitted.
             key: "this-should-be-exact-match" 
          - type: pattern
            key: "^eks-cluster-\\d+$"

The thing I see here is if we want to keep some consistency, it'll be ideal to have some kind of uniformity in the elements of the sequence, but that would force users to migrate.

Here, I don't know which approach is best, but I'll be happy to work on any of them if everyone reachs an agreement on it. Perhaps @turkenh can provide better insight on this :)

eric-carlsson commented 3 weeks ago

I think that every single pattern in sequence must wait on all resources that match on it. So every eks-role- should wait until all cluster-role- are ready.

I don't think that's always the case. For example, when deploying an application or a piece of infrastructure with redundancy I often don't care if all instances are ready before sending them traffic, I just care that any of them are ready. I can agree that waiting on all matched resources to be ready is a sane default, but I also think we should allow users to change that behavior.

The thing I see here is if we want to keep some consistency, it'll be ideal to have some kind of uniformity in the elements of the sequence, but that would force users to migrate.

Perhaps an in-between solution is to promote the API to v2 with these changes and maintaining the v1 API with its current feature set.

callum-stakater commented 1 week ago

this is a needed feature to make this function usable and if added not just usable but extremely useful, as there seems to be a big shift underway from static function-patch-and-transform type compositions toward more dynamic approaches like the as mentioned go-templating or kcl etc and in current state this isnt usable

my opinion on the above discussion would be to keep it simple, allow regex expressions to match all and treat them like they were statically listed (eg wait for all matches), its then up to the user to name their resources in an efficient way if certain resources of a specific type dont need to be created in specific order - eg using a numbering scheme in resource names/ids for example and regex to match all resources of a numbered wave seems simple and effective