argoproj / argo-workflows

Workflow Engine for Kubernetes
https://argo-workflows.readthedocs.io/
Apache License 2.0
14.89k stars 3.17k forks source link

semaphore with dynamic key+number #11890

Open tooptoop4 opened 12 months ago

tooptoop4 commented 12 months ago

Summary

Currently mutex allows a dynamic key (example incoming tablename from an argo event). But only allows max 1 wf to consume it. Currently semaphore requires a static key name (pre-defined in configmap) but allows n number of wfs to consume it.

This new proposal is about having the flexibility from both worlds. ie a semaphore that allows dynamic key (not pre-defined in configmap) and ability to specify n wfs to consume it.

I imagine on a sensor it could look like:

  synchronization:
    semaphore:
      name: "{{=...parse_tbl from sensor message}}"
      maxInParallel: 3

also allow semaphore to be applied conditionally based on an expression? ie if somearg='x' then semaphore of 'y' applied else don't apply any semaphore. alternatively can it fallback to a mutex if there is no semaphore with that name defined in the wfcontroller/configmap ?

Use Cases

To dynamically rate limit the platform, avoid new tables being resource hogs

agilgur5 commented 11 months ago

Hmm I think this proposal has a bit of an issue -- maxInParallel is defined on one Workflow in the example, but it can actually be defined on any Workflow.

The ConfigMap for semaphores acts as a single source of truth between all Workflows that use the same semaphore.

I suppose this could be rewritten where only the number is taken from a ConfigMap (and not the key), something like customKey or overrideKey, but that is perhaps a bit awkward/unintuitive. Might be good to think of a more ergonomic API for this

tooptoop4 commented 11 months ago

maxInParallel in this case could even be specific to the workflow for my usecase. don't mind having the wfprefix name appended to the semaphore name, the key thing being that a new key name (ie table value could appear in sensor expression) but me not having to statically define that name beforehand

agilgur5 commented 11 months ago

maxInParallel in this case could even be specific to the workflow for my usecase

Hmm that is a slightly different use-case. That straddles in between the current parallelism feature and semaphores.

not having to statically define that name beforehand

But yea I understand the use-case and do think it's not altogether that unique and therefore broadly useful. There are some small gaps (like this one and multi-semaphore) in the current synchronization API that some use-cases aren't quite covered by.

The ergonomics of the API and how that compares to existing APIs would be the main challenge now. If there are multiple Workflows, then the ConfigMap I still think makes the most sense and would be most familiar. Perhaps valueFromKey while key can be overwritten. I don't really like any of the options I proposed so far, so if you have any ideas on that, please do list them!

agilgur5 commented 4 months ago

Dynamic keys were also suggested in https://github.com/argoproj/argo-workflows/pull/7302#issuecomment-982917593