serverlessworkflow / specification

Contains the official specification for the Serverless Workflow Domain Specific Language. It provides detailed guidelines and standards for defining, executing, and managing workflows in serverless environments, ensuring consistency and interoperability across implementations.
http://serverlessworkflow.io
Apache License 2.0
740 stars 146 forks source link

Invalid `do:switch` #894

Closed JBBianchi closed 4 months ago

JBBianchi commented 4 months ago

What seems off: After #875, and because of, https://github.com/serverlessworkflow/specification/discussions/884#discussioncomment-9662801, #882 was merged, making do a single task entry. This breaks the possibility to use do:switch as there is nothing to switch to. It forces the user to use do:execute:sequentially:switch.

What you expected to be: We might need to rollback to the original idea of #875, which is to have do as map[string, task], but not only for top level.

Anything else we need to know?: As mentioned by @matthias-pichler-warrify in #875, making do a map[string, task], the execute.sequentially directive doesn't really make sense anymore. We might want to refactor execute at the same time (or keep that for another issue).

fjtirado commented 4 months ago

Can we do this for sequentiality

do:
   -task1:
   -task2: 

and this for concurrency

do:
   -task1:
   -task2: 
onParallel:
   compete: false

If we agree on the syntax, I think that will be doable schema wise (by putting do: array as required and onParallel: object as optional of a new schema construct)

JBBianchi commented 4 months ago

Discussion from #875 related to this issue:

@JBBianchi https://github.com/serverlessworkflow/specification/issues/875#issuecomment-2158650129

I think we overlooked the case of switch with the adopted solution. It's relegated to a second zone citizen, only available in execute:

do:
  execute:
    sequentially:
      - processOrder:
          switch:
          # ...

https://github.com/serverlessworkflow/specification/blob/main/dsl-reference.md#switch

Shouldn't we go back to do being a map[string, task][] instead of a single task ?

@cdavernas https://github.com/serverlessworkflow/specification/issues/875#issuecomment-2158657771

@JBBianchi Yes, indeed. Do as a mapping array is what we should opt for:

* It resolves consistency of the keyword `do`: it's alway a mapping array of tasks

* It allows for top-level switch statements

* It's more consistent with what a workflow is, to most people: a successions of multiple tasks, not a single top level one, which is then... a task

@matthias-pichler-warrify https://github.com/serverlessworkflow/specification/issues/875#issuecomment-2158697288

That is true. And do as an array is absolutely fine by me, however as @cdavernas pointed out: what's the purpose of sequentially then?

Which given the discussion around concurrently and the execute keyword in general, might mean:

* drop sequentially (not needed as we have arrays everywhere

* only keep `concurrently` (or replace with `branch` to make it a verb for consistency)

@cdavernas https://github.com/serverlessworkflow/specification/issues/875#issuecomment-2158732278

multitask /ˈmʌltɪtɑːsk/ verb

  1. (of a person) deal with more than one task at the same time. "I managed my time efficiently and multitasked"

It is a verb, though certainly not the best term. But there is only a little amount of alternatives, such as the (horrible) parallelize

The idea for branch also came from AWS StepFunctions (they run in parallel there)

Interesting. Might be worth digging.

@JBBianchi https://github.com/serverlessworkflow/specification/issues/875#issuecomment-2158762394

Do you mean:

do:
  execute:
    branch:
      - task1:
      - task2:
      - task3:    

or

do:
  branch:
    - task1:
    - task2:
    - task3: 

(an alternative to branch could also be fork)

In the first case, we keep the "controversial" execute but have an extra level to put config like compete: true. In the second one, we get rid of the controversial keyword but we might need an extra level of depth.

@fjtirado https://github.com/serverlessworkflow/specification/issues/875#issuecomment-2158878743

@JBBianchi Im trying to undestand the issue with swith. I understood that rather than writing

do:
  execute:
    sequentially:
      - processOrder:
          switch:
            - case1:
                when: .orderType == "electronic"
                then: processElectronicOrder
            - case2:
                when: .orderType == "physical"
                then: processPhysicalOrder
            - default:
                then: handleUnknownOrderType
      - processElectronicOrder:
          execute:
            sequentially:
              - validatePayment: {}
              - fulfillOrder: {}
          then: exit
      - processPhysicalOrder:
          execute:
            sequentially:
              - checkInventory: {}
              - packItems: {}
              - scheduleShipping: {}
          then: exit
      - handleUnknownOrderType:
          execute:
            sequentially:
              - logWarning: {}
              - notifyAdmin: {}

we should be able to write

do:
     - processOrder:
          switch:
            - case1:
                when: .orderType == "electronic"
                then: processElectronicOrder
            - case2:
                when: .orderType == "physical"
                then: processPhysicalOrder
            - default:
                then: handleUnknownOrderType
      - processElectronicOrder:
          execute:
            sequentially:
              - validatePayment: {}
              - fulfillOrder: {}
          then: exit
      - processPhysicalOrder:
          execute:
            sequentially:
              - checkInventory: {}
              - packItems: {}
              - scheduleShipping: {}
          then: exit
      - handleUnknownOrderType:
          execute:
            sequentially:
              - logWarning: {}
              - notifyAdmin: {}

isnt it? But it we remove execute, as option 2 in this discussion, it would be

do:
  sequentially:
     - processOrder:
          switch:
            - case1:
                when: .orderType == "electronic"
                then: processElectronicOrder
            - case2:
                when: .orderType == "physical"
                then: processPhysicalOrder
            - default:
                then: handleUnknownOrderType
      - processElectronicOrder:
            sequentially:
              - validatePayment: {}
              - fulfillOrder: {}
          then: exit
      - processPhysicalOrder:
            sequentially:
              - checkInventory: {}
              - packItems: {}
              - scheduleShipping: {}
          then: exit
      - handleUnknownOrderType:
           sequentially:
              - logWarning: {}
              - notifyAdmin: {}

Am I missing anything? (probably I am, my apologies in advance)

@JBBianchi https://github.com/serverlessworkflow/specification/issues/875#issuecomment-2158952423

@JBBianchi Im trying to undestand the issue with swith. I understood that rather than writing

do:
  execute:
    sequentially:
      - processOrder:
          switch:
            - case1:
                when: .orderType == "electronic"
                then: processElectronicOrder
            - case2:
                when: .orderType == "physical"
                then: processPhysicalOrder
            - default:
                then: handleUnknownOrderType
      - processElectronicOrder:
          execute:
            sequentially:
              - validatePayment: {}
              - fulfillOrder: {}
          then: exit
      - processPhysicalOrder:
          execute:
            sequentially:
              - checkInventory: {}
              - packItems: {}
              - scheduleShipping: {}
          then: exit
      - handleUnknownOrderType:
          execute:
            sequentially:
              - logWarning: {}
              - notifyAdmin: {}

we should be able to write

do:
     - processOrder:
          switch:
            - case1:
                when: .orderType == "electronic"
                then: processElectronicOrder
            - case2:
                when: .orderType == "physical"
                then: processPhysicalOrder
            - default:
                then: handleUnknownOrderType
      - processElectronicOrder:
          execute:
            sequentially:
              - validatePayment: {}
              - fulfillOrder: {}
          then: exit
      - processPhysicalOrder:
          execute:
            sequentially:
              - checkInventory: {}
              - packItems: {}
              - scheduleShipping: {}
          then: exit
      - handleUnknownOrderType:
          execute:
            sequentially:
              - logWarning: {}
              - notifyAdmin: {}

isnt it?

That's my concern indeed.

But it we remove execute, as option 2 in this discussion, it would be

do:
  sequentially:
     - processOrder:
          switch:
            - case1:
                when: .orderType == "electronic"
                then: processElectronicOrder
            - case2:
                when: .orderType == "physical"
                then: processPhysicalOrder
            - default:
                then: handleUnknownOrderType
      - processElectronicOrder:
          execute:
            sequentially:
              - validatePayment: {}
              - fulfillOrder: {}
          then: exit
      - processPhysicalOrder:
          execute:
            sequentially:
              - checkInventory: {}
              - packItems: {}
              - scheduleShipping: {}
          then: exit
      - handleUnknownOrderType:
          execute:
            sequentially:
              - logWarning: {}
              - notifyAdmin: {}

Am I missing anything? (probably I am, my apologies in advance)

The later example doesn't seem to match any late suggestions made in this issue. I didn't mention removing execute per se as it's another topic. But if I understand what @matthias-pichler-warrify and @cdavernas were talking about just after I raised my concern, is that sequentially is the default behavior of a do, therefore execute.sequentially doesn't really have a reason to exists. Which leaves us with concurrently, which could be replaced by branch(, multitask, parallelize or fork). What's unclear is if we keep execute.branch (or alternative) or we just get rid of execute altogether an opt for branch (or alternative) straight on - which then, rejoin the point 2 of #889 and makes the discussion obsolete. In the later case though, we'd probably need an extra level of depth for some params like racing (or threads count for instance).

e.g.:

do:
  - processOrder:
      switch:
        - case1:
            when: .orderType == "electronic"
            then: processElectronicOrder
        - case2:
            when: .orderType == "physical"
            then: processPhysicalOrder
        - default:
            then: handleUnknownOrderType
  - processElectronicOrder:
      do:
        - validatePayment: {}
        - fulfillOrder: {}
        then: exit
  - processPhysicalOrder:
      execute:
        compete: true
        branch:  # `multitask`, `parallelize` or `fork`
          - checkInventory: {}
          - packItems: {}
          - scheduleShipping: {}
            then: exit
  - handleUnknownOrderType: #...

or

do:
  - processOrder:
      switch:
        - case1:
            when: .orderType == "electronic"
            then: processElectronicOrder
        - case2:
            when: .orderType == "physical"
            then: processPhysicalOrder
        - default:
            then: handleUnknownOrderType
  - processElectronicOrder:
      do:
        - validatePayment: {}
        - fulfillOrder: {}
        then: exit
  - processPhysicalOrder:
      branch:
        compete: true
        on:  # `multitask`, `parallelize` or `fork`
          - checkInventory: {}
          - packItems: {}
          - scheduleShipping: {}
            then: exit
  - handleUnknownOrderType: #...

@cdavernas @matthias-pichler-warrify am I getting it right ?

@fjtirado https://github.com/serverlessworkflow/specification/issues/875#issuecomment-2160115152

@JBBianchi There was a typo in my last example, I edited it. My point was that this is kind of related with the solution for do: execute:, if we simplify how multitask is written, its not really an issue that switch is a task which only makes sense within a multitask (which is why we are proposing to make do: multitask again) Also, doing do: always a multitask (so if you want a single task you just write one named task, as far as this also applies to loops, retries and catch) and removing sequentially (because its implicit in do:), leaving concurrently as an special case of multitask will also work.

@cdavernas https://github.com/serverlessworkflow/specification/issues/875#issuecomment-2160130745

its not really an issue that switch is a task which only makes sense within a multitask

I strongly disagree: it is a problem, and a big one. The fact that do could have different meaning is blocking for you in terms of consistency, but not the fact that the switch task - and the switch task only - cannot be used at top level? Furthermore, as said above, a single task per workflow goes against the definition of a workflow. It also is a problem when it comes to visualization.

(which is why we are proposing to make do: multitask again)

That is not what we are proposing. We are proposing:

1. To make `do` a (sequential) array everywhere

2. To get rid of `execute/sequentially`

3. Create a new task to "parallelize" tasks

@fjtirado https://github.com/serverlessworkflow/specification/issues/875#issuecomment-2160144367

@cdavernas I think we are are saying the same thing, so there is not need to strongly disagree ;) When I said that we are proposing to do a multitask I mean do: being an array (which I said is fine as far as do: have the same syntax everywhere, not only in main flow) And removing execute/sequentially is part of the do:execute: redundancy discussion.

@cdavernas https://github.com/serverlessworkflow/specification/issues/875#issuecomment-2160148819

I think we are are saying the same thing, so there is not need to strongly disagree ;)

Hehe, sorry, seems I misunderstood you indeed 🤣

And removing execute/sequentially is part of the do:execute: redundancy discussion.

Indeed.

matthias-pichler commented 4 months ago

@JBBianchi with this example:

do:
  - processOrder:
      switch:
        - case1:
            when: .orderType == "electronic"
            then: processElectronicOrder
        - case2:
            when: .orderType == "physical"
            then: processPhysicalOrder
        - default:
            then: handleUnknownOrderType
  - processElectronicOrder:
      do:
        - validatePayment: {}
        - fulfillOrder: {}
        then: exit
  - processPhysicalOrder:
      branch:
        compete: true
        on:  # `multitask`, `parallelize` or `fork`
          - checkInventory: {}
          - packItems: {}
          - scheduleShipping: {}
            then: exit
  - handleUnknownOrderType: #...

I think you are mixing some proposals together because now here:

 - processElectronicOrder:
      do:
        - validatePayment: {}
        - fulfillOrder: {}
        then: exit

do appears as a task which was only discusses in the sense of renaming execute to do.

But actually it makes sense here as a substitute for the lost execute:sequentially

matthias-pichler commented 4 months ago

But it seems that on this one we are already very close to a consensus:

  1. To make do a (sequential) array everywhere

that includes the main flow, for and catch. Is there still another way to then define sequential actions? Because it leaves us with no alternative for try and before & after in extensions. So either need to make these arrays as well or have some execute/sequentially alternative ... @JBBianchi proposed do itself:

do:
  - processOrder:
      switch:
        - case1:
            when: .orderType == "electronic"
            then: processElectronicOrder
        - case2:
            when: .orderType == "physical"
            then: processPhysicalOrder
        - default:
            then: handleUnknownOrderType
  - processElectronicOrder:
      do: # do instead of execute:sequentially
        - validatePayment: {}
        - fulfillOrder: {}
      then: exit
  - processPhysicalOrder: #...
  - handleUnknownOrderType: #...
  1. To get rid of execute/sequentially

that makes absolute sense to me because in my head "doing a set of tasks in order" should be the default mode for a workflow anyway. We just need to decide on how to substitute all its different use-cases, namely in try, before, after and nested tasks.

  1. Create a new task to "parallelize" tasks

that also makes sense and we already have a couple of good ideas how to structure and name it (fork, branch, multitask, parallelize) 👍

cdavernas commented 4 months ago

that includes the main flow, for and catch. Is there still another way to then define sequential actions? Because it leaves us with no alternative for try and before & after in extensions. So either need to make these arrays as well or have some execute/sequentially alternative

I think it should also apply to try, before and after. IMHO there's no reason a user should be not be able to use top level switch in those cases.

matthias-pichler commented 4 months ago

that includes the main flow, for and catch. Is there still another way to then define sequential actions? Because it leaves us with no alternative for try and before & after in extensions. So either need to make these arrays as well or have some execute/sequentially alternative

I think it should also apply to try, before and after. IMHO there's no reason a user should be not be able to use top level switch in those cases.

agreed. But what should we do about nested sequential tasks? like @JBBianchi tried to show here:

 - processElectronicOrder:
      do:
        - validatePayment: {}
        - fulfillOrder: {}
      then: exit
cdavernas commented 4 months ago

agreed. But what should we do about nested sequential tasks?

Exactly what @JBBianchi proposed I guess: basically use do to define a sequential composite task.

So, to summarize for sake of brievety, here's my understanding of what's proposed:

  1. Make do a (sequential) task array everywhere
  2. Make try, before and after a (sequential) task array
  3. Create a new do task, used to define a nested (sequential) task array
  4. Create a new fork/branch/multitask/parallelize task, used to define a parallel (and possibly competing) task array
fjtirado commented 4 months ago

@cdavernas A variant for 4) will be to use do: both for sequential and concurrency, as proposed here https://github.com/serverlessworkflow/specification/issues/894#issuecomment-2160183420.

cdavernas commented 4 months ago

@cdavernas A variant for 4) will be to use do: both for sequential and concurrency, as proposed here https://github.com/serverlessworkflow/specification/issues/894#issuecomment-2160183420.

Nice idea! I'm however not a fan of the suggestion, for multiple reasons:

matthias-pichler commented 4 months ago

@cdavernas A variant for 4) will be to use do: both for sequential and concurrency, as proposed here #894 (comment).

Nice idea! I'm however not a fan of the suggestion, for multiple reasons:

  • It is made out of two words
  • It is not a verb, and does not respect imperative tense
  • As @matthias-pichler-warrify mentionned on another issue, I would avoid having qualified properties at top level (ex: document/use/evaluate/do/onParallel <= the later feels off, and inconsistent)

I am also more a fan of a dedicated task type for parallel execution since

fjtirado commented 4 months ago

I came from an OO world, so let me try to justfy why I feel adding a property to same object where do: is defined is a good alternative to some new keyword that reflect the concept of do concurrently. I think we agree both do: and concurrently: (Im using the keyword for conveninece, we can use a different name) are array of task. do: is an array of task that is executed sequentially concurrently: is an array of task that is executed concurrently (concurrency might be configured) Therefore it is not crazy to consider concurrently: an extension of do: and rather than redering it as a different keyword, render both as the same object with different internal state. In Json schema, that means defining a new object that contains do: as array of task and optional properties that setups how this array of task is run. Sequenatially if there is not any extra property. And concurrently if we write the addional property. That approach allow us, eventually, to add new properties that qualifies do: (the array of tasks) that are both shared by sequential and concurrent execution (we feel there are none, but we might be wrong) The optional property onParallel: (we already have properties that are not vebs, like with, as...) is the place to add properties that are specific for concurrent execution If we find out another way to execute the task (for example onReverse:) we just add the new optional property.

fjtirado commented 4 months ago

Anyway, I guess we are taking this approach

concurrently (or any other word):
   compete:
   #any additional property like compete
   branch (or some other word to write the array): 
      - task 1: 
     - task 2: 

That also works for me (my previous message was just for the record ;))

cdavernas commented 4 months ago

I'm myself an OO dev, I understand your (very good and valid) point of view.

My preference still remains the same. I do not like having the workflow becoming some kind of a do task with additional properties. I do not like having a non verb top level property. I also feel, as said by @matthias-pichler-warrify, that the proposed solution is much more intuitive. It is also much more consistent: feature get its own container class.

@JBBianchi @matthias-pichler-warrify @ricardozanini surely we can reach a consensus?

matthias-pichler commented 4 months ago

I came from an OO world, so let me try to justfy why I feel adding a property to same object where do: is defined is a good alternative to some new keyword that reflect the concept of do concurrently. I think we agree both do: and concurrently: (Im using the keyword for conveninece, we can use a different name) are array of task. do: is an array of task that is executed sequentially concurrently: is an array of task that is executed concurrently (concurrency might be configured) Therefore it is not crazy to consider concurrently: an extension of do: and rather than redering it as a different keyword, render both as the same object with different internal state. In Json schema, that means defining a new object that contains do: as array of task and optional properties that setups how this array of task is run. Sequenatially if there is not any extra property. And concurrently if we write the addional property. That approach allow us, eventually, to add new properties that qualifies do: (the array of tasks) that are both shared by sequential and concurrent execution (we feel there are none, but we might be wrong) The optional property onParallel: (we already have properties that are not vebs, like with, as...) is the place to add properties that are specific for concurrent execution If we find out another way to execute the task (for example onReverse:) we just add the new optional property.

Thanks for your clarification. Very good points 👍 Although my Java days lie a couple of years in the past, I feel like there is an equally justified but opposite view on this. In Java the fundamental ordered collection is a List and the fundamental unordered collection a Set (although it does come with additional constraints)

classDiagram
  class Iterable
  <<interface>> Iterable

  class Collection
  <<interface>> Collection

  class List
  <<interface>> List

  class Set
  <<interface>> Set

  Iterable <|-- Collection
  Collection <|-- List
  Collection <|-- Set

so I feel like it depends on interpretation where one places do and concurrently in this hierarchy. I personally see them as fundamentally ordered and unordered and therefore they are siblings but not direct parents/children.

I'm myself an OO dev, I understand your (very good and valid) point of view.

My preference still remains the same. I do not like having the workflow becoming some kind of a do task with additional properties. I do not like having a non verb top level property. I also feel, as said by @matthias-pichler-warrify, that the proposed solution is much more intuitive. It is also much more consistent: feature get its own container class.

@JBBianchi @matthias-pichler-warrify @ricardozanini surely we can reach a consensus?

So yeah I would prefer a dedicated keyword.

fjtirado commented 4 months ago

@matthias-pichler-warrify I agree we can do with a dedicated keyword (so in essence we have a keyword for sequential list of task and another keyword for parallel list of task) As a not related discussion, just for our sharing understanding, more than ordered and unordered, what distinguish a List (which is indeed fundamentally ordered) from a Set (note that SortedSet is still a Set and sorting implies an order) in Java is that a List can be accessed by position (or in other words, any item in the list, at a certain moment, has a well known position identified by an index) while there is not such concept of position in a Set (and also that any element in the Set is unique), even when the Set is ordered. To be honest, I prefer the STL approach, which distinguish between sequence containers (Java List) and associative containers (Java Set and Map). PS: I would deny I have written this, but I think Python nailed it https://substackcdn.com/image/fetch/w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43ebfa48-5aaa-4733-baf7-ce42e54b8218_3804x2964.png

fjtirado commented 4 months ago

@matthias-pichler-warrify I made some research regarding json schema in order to see if it was feasible to write documents like

document:
  dsl: 1.0.0-alpha1
  namespace: default
  name: composite-sequential
  version: 1.0.0
do:
  set:
    colors: ${ .colors + ["red"] }

or


document:
  dsl: 1.0.0-alpha1
  namespace: default
  name: composite-sequential
  version: 1.0.0
do:
  sequentially:
  - setRed:
      set:
        colors: ${ .colors + ["red"] }
  - setGreen:
      set:
        colors: ${ .colors + ["green"] }
  - setBlue:
      set:
        colors: ${ .colors + ["blue"] }

So, removing "do" from top level, "do" as a task (just rename do as execute to experiment), add a reference to the"task" object as it is currently in the schema from the top. It seems to work.

I'm bringing up here because although this is not going to be the new schema, I believe do: (and the keyword for concurrent), in the modified form we are discussing, should be a task. And the top level should refer to either any task or just the do/concurrent task.

image

Basically, its removing do and adding task

oneOf:
- "$ref": "#/$defs/task"
required:
- document

I include the whole schema I used for reference

matthias-pichler commented 4 months ago

@fjtirado I never claimed it wasn't possible. With your modification of moving do into the composite task you could also write it as

type: object
required:
  - document
properties:
  document: {}
  use: {}
  ...
allOf:
  - $ref: "#/$defs/task"

which given the semantics of JSON schema would be the equivalent but more idiomatic way of doing it. But I thought it was thoroughly discussed that we did not want to make the top level workflow a task itself, because with your schema the following is valid:

document:
  dsl: 1.0.0-alpha1
  namespace: default
  name: composite-sequential
  version: 1.0.0
call: http
with:
  method: get
  endpoint: '...'

which defines a call task at the top level. Furthermore it does not answer the question what should happen to the other occurrences of do such as in catch or for

fjtirado commented 4 months ago

@matthias-pichler-warrify I understood that it was not possible, sorry Since in order to reuse do/concurrently (in the form we are discussing now, not the one in the current schema that I used in the example) I think the easiest way is to declared them as task (so you can reference them in try/loop/...) and from the top level just allow do/concurrently So you do not use do: explicitly, you just specify that try/loop/catch accepts a task (and do/concurrently is defined in the task)

ricardozanini commented 4 months ago

agreed. But what should we do about nested sequential tasks?

Exactly what @JBBianchi proposed I guess: basically use do to define a sequential composite task.

So, to summarize for sake of brievety, here's my understanding of what's proposed:

  1. Make do a (sequential) task array everywhere
  2. Make try, before and after a (sequential) task array
  3. Create a new do task, used to define a nested (sequential) task array
  4. Create a new fork/branch/multitask/parallelize task, used to define a parallel (and possibly competing) task array

Nice summary, I think we already reached a consensus then.

@JBBianchi are you opening the PR since you opened the issue?

@matthias-pichler-warrify @fjtirado I understand that you guys are discussing the implementation in the schema, but the idea has been settled, correct?

matthias-pichler commented 4 months ago

agreed. But what should we do about nested sequential tasks?

Exactly what @JBBianchi proposed I guess: basically use do to define a sequential composite task. So, to summarize for sake of brievety, here's my understanding of what's proposed:

  1. Make do a (sequential) task array everywhere
  2. Make try, before and after a (sequential) task array
  3. Create a new do task, used to define a nested (sequential) task array
  4. Create a new fork/branch/multitask/parallelize task, used to define a parallel (and possibly competing) task array

Nice summary, I think we already reached a consensus then.

@JBBianchi are you opening the PR since you opened the issue?

@matthias-pichler-warrify @fjtirado I understand that you guys are discussing the implementation in the schema, but the idea has been settled, correct?

currently writing something on this ... just a sec

fjtirado commented 4 months ago

@ricardozanini Yes, Im discussing how to implement the proposal with two keywords, do: and other word for concurrently in the schema

matthias-pichler commented 4 months ago

@fjtirado In your understanding should it be possible to have a top level concurrently

document: {}
concurrently:
  - task1: {}
  - task2: {}

vs

document: {}
do:
  - task1:
      concurrently:
        - task2: {}
        - task3: {}
fjtirado commented 4 months ago

@matthias-pichler-warrify Yes, I guess is reasonable to accept concurrently at top level, but if we want to prevent that, we can just refer the do task at top level Regarding try/catch/loop, I will just reference the task object there (so user can write single task or multi task with do or concurrently)

cdavernas commented 4 months ago

Yes, I guess is reasonable to accept concurrently at top level, but if we want to prevent that, we can just refer the do task at top level

I don't think it's acceptable, for the reasons explained above.

Regarding try/catch/loop, I will just reference the task object there (so user can write single task or multi task with do or concurrently)

Doing so would forbid the user to use a switch with those constructs.

matthias-pichler commented 4 months ago

@fjtirado

for the implementation in the schema it is relatively straightforward (IMO)

$schema: ""
type: object
required:
  - document
properties: 
  document: {}
  do:
    type: array
    items:
      type: object
      minProperties: 1
      maxProperties: 1
      additionalProperties:
        $ref: "#/$defs/task"
$defs:
  task:
    type: object
    properties: {}
    oneOf: [...]
  forTask:
    type: object
    required:
      - for
      - do
    properties:
      for: {}
      do:
        type: array
        items:
          type: object
          minProperties: 1
          maxProperties: 1
          additionalProperties:
            $ref: "#/$defs/task"
  tryTask:
    type: object
    required:
      - try
    properties:
      try:
        type: array
        items:
          type: object
          minProperties: 1
          maxProperties: 1
          additionalProperties:
            $ref: "#/$defs/task"
  # similar for catch
  branchTask:
    type: object
    required:
      - branch
    properties:
      branch:
        type: array
        items:
          type: object
          minProperties: 1
          maxProperties: 1
          additionalProperties:
            $ref: "#/$defs/task"
  doTask:
    type: object
    required:
      - do
    properties:
      do:
        type: array
        items:
          type: object
          minProperties: 1
          maxProperties: 1
          additionalProperties:
            $ref: "#/$defs/task"

this schema would allow for the following definition

document: {}
do:
  - branchTask:
      branch:
        - doTask:
            do:
              - task1: {}
              - task2: {}
        - task3: {}
  - switchTask:
      switch:
        - case1:
            when: "..."
            then: "forTask"
        - case2:
            when: "..."
            then: "tryTask"
  - forTask:
      for: {}
      do:
        - task4: {}
        - task5: {}
  - tryTask:
      try:
        - task6: {}
        - task7: {}
      catch:
        do:
          - task8: {}
          - task9: {}
matthias-pichler commented 4 months ago

@matthias-pichler-warrify Yes, I guess is reasonable to accept concurrently at top level, but if we want to prevent that, we can just refer the do task at top level

I would not add concurrently at the top. And do should not be a single task.

Regarding try/catch/loop, I will just reference the task object there (so user can write single task or multi task with do or concurrently)

These should IMO not refer to a task object but to a task array. Just as the top level do

cdavernas commented 4 months ago

100% agree with @matthias-pichler-warrify

fjtirado commented 4 months ago

@matthias-pichler-warrify I like that schema ;)

Just for claryfing what I have in mind, I was thinking

document: {}
do:
  - branchTask:
      branch:
        - task1: {}
        - task2: {}
  - switchTask:
      switch:
        - case1:
            when: "..."
            then: "forTask"
        - case2:
            when: "..."
            then: "tryTask"
  - forTask:
      for: {}
      do:
        - task3: {}
        - task4: {}
  - tryTask:
      try:
        do:
           - task5: {}
           - task6: {}
      catch:
         call: http

I was trying to maintain Charles original idea of not naming single task in try/catch/loop, but Im perfectly fine with your proposal.

JBBianchi commented 4 months ago

I think you last suggestion is indeed what seems the clearest @matthias-pichler-warrify

ricardozanini commented 4 months ago

@matthias-pichler-warrify can you open a PR?

matthias-pichler commented 4 months ago

@fjtirado if your goal is maximum schema reuse one could do

$schema: ""
type: object
required:
  - document
properties: 
  document: {}
  do:
    $ref: "#/$defs/taskList"
$defs:
  task:
    type: object
    properties: {}
    oneOf: [...]
  taskList:
    type: array
    items:
      type: object
      minProperties: 1
      maxProperties: 1
      additionalProperties:
        $ref: "#/$defs/task"
  forTask:
    type: object
    required:
      - for
      - do
    properties:
      for: {}
      do:
        $ref: "#/$defs/taskList"
  tryTask:
    type: object
    required:
      - try
    properties:
      try:
        $ref: "#/$defs/taskList"
  # similar for catch
  branchTask:
    type: object
    required:
      - branch
    properties:
      branch:
        $ref: "#/$defs/taskList"
  doTask:
    type: object
    required:
      - do
    properties:
      do:
        $ref: "#/$defs/taskList"
matthias-pichler commented 4 months ago

@matthias-pichler-warrify can you open a PR?

👍 sure thing, are we moving forward with my last suggestion though? there was a bit of back and forth on my last PR 😅

fjtirado commented 4 months ago

@matthias-pichler-warrify My goal was to allow not forcing naming of single task (I think that goal was achievable while keeping the schema simple), but, again, I have not issue with naming all tasks (if you want single task, you just write an array of one). Im more concerned that, in order to write a fully concurrent workflow, user has to define a single sequential task first (thats why I believe using do/concurrently as top level makes more sense than just allowing do)

matthias-pichler commented 4 months ago

@matthias-pichler-warrify My goal was to allow not forcing naming of single task (I think that goal was achievable while keeping the schema simple), but, again, I have not issue with naming all tasks (if you want single task, you just write an array of one).

I understand, the schema to do this properly would be:

$schema: ""
type: object
required:
  - document
properties: 
  document: {}
  do:
    oneOf:
      - $ref: "#/$defs/task"
      - $ref: "#/$defs/taskList"
$defs:
  task:
    type: object
    properties: {}
    oneOf: [...]
  taskList:
    type: array
    items:
      type: object
      minProperties: 1
      maxProperties: 1
      additionalProperties:
        $ref: "#/$defs/task"
  forTask:
    type: object
    required:
      - for
      - do
    properties:
      for: {}
      do:
        oneOf:
          - $ref: "#/$defs/task"
          - $ref: "#/$defs/taskList"
  tryTask:
    type: object
    required:
      - try
    properties:
      try:
        oneOf:
          - $ref: "#/$defs/task"
          - $ref: "#/$defs/taskList"
  # similar for catch
  branchTask:
    type: object
    required:
      - branch
    properties:
      branch:
        $ref: "#/$defs/taskList"
  doTask:
    type: object
    required:
      - do
    properties:
      do:
        $ref: "#/$defs/taskList"

we could also implement that and I'd not really have an issue with that but I think it's just weird that although all do follow the same schema that could have two different options (array or single task) in the same workflow. Plus I believe this could cause some trouble for people working in OO languages.

fjtirado commented 4 months ago

@matthias-pichler-warrify I was thinking in

$schema: ""
type: object
required:
  - document
oneOf:
      - $ref: "#/$defs/doTask"
      - $ref: "#/$defs/branchTask" 
properties: 
  document: {}
 $defs:
  task:
    type: object
    properties: {}
    oneOf: [...]
  taskList:
    type: array
    items:
      type: object
      minProperties: 1
      maxProperties: 1
      additionalProperties:
        $ref: "#/$defs/task"
  forTask:
    type: object
    required:
      - for
    properties:
      for: {}
      allOf:
         - $ref: "#/$defs/task"
  tryTask:
    type: object
    required:
      - try
    properties:
      try: $ref: "#/$defs/task"
  # similar for catch
  branchTask:
    type: object
    required:
      - branch
    properties:
      branch:
        $ref: "#/$defs/taskList"
  doTask:
    type: object
    required:
      - do
    properties:
      do:
        $ref: "#/$defs/taskList"

And then you write


document: {}
do:
  - branchTask:
      branch:
        - task1: {}
        - task2: {}
  - switchTask:
      switch:
        - case1:
            when: "..."
            then: "forTask"
        - case2:
            when: "..."
            then: "tryTask"
  - forTask:
      for: {}
      do:
        - task3: {}
        - task4: {}
  - tryTask:
      try:
        do:
           - task5: {}
           - task6: {}
      catch:
         call: http
matthias-pichler commented 4 months ago

@matthias-pichler-warrify I was thinking in

$schema: ""
type: object
required:
  - document
properties: 
  document: {}
   oneOf:
      - $ref: "#/$defs/doTask"
      - $ref: "#/$defs/branchTask"
$defs:
  task:
    type: object
    properties: {}
    oneOf: [...]
  taskList:
    type: array
    items:
      type: object
      minProperties: 1
      maxProperties: 1
      additionalProperties:
        $ref: "#/$defs/task"
  forTask:
    type: object
    required:
      - for
      - do
    properties:
      for: {}
      do:
          - $ref: "#/$defs/task"
  tryTask:
    type: object
    required:
      - try
    properties:
      try:
         $ref: "#/$defs/task"
  # similar for catch
  branchTask:
    type: object
    required:
      - branch
    properties:
      branch:
        $ref: "#/$defs/taskList"
  doTask:
    type: object
    required:
      - do
    properties:
      do:
        $ref: "#/$defs/taskList"

And then you write


document: {}
do:
  - branchTask:
      branch:
        - task1: {}
        - task2: {}
  - switchTask:
      switch:
        - case1:
            when: "..."
            then: "forTask"
        - case2:
            when: "..."
            then: "tryTask"
  - forTask:
      for: {}
      do:
        - task3: {}
        - task4: {}
  - tryTask:
      try:
        do:
           - task5: {}
           - task6: {}
      catch:
         call: http

except it would be

for:
  do:
    do:
      - taks1: {}
      - taks2: {}

with your schema, you'd need:

   forTask:
     type: object
     required:
       - for
       - do
     properties:
       for: {}
       do:
          oneOf:
           - $ref: "#/$defs/task"
           - $ref: "#/$defs/taskList"

and this part is incorrect:

$schema: ""
type: object
required:
  - document
properties: 
  document: {}
   oneOf:
      - $ref: "#/$defs/doTask"
      - $ref: "#/$defs/branchTask"

it should be

$schema: ""
type: object
required:
  - document
oneOf:
      - $ref: "#/$defs/doTask"
      - $ref: "#/$defs/branchTask"
properties: 
  document: {}
fjtirado commented 4 months ago

@matthias-pichler-warrify I removed the do with a latter edition (we should discuss this on slack, which is more agile, not here)

with that you can write

- forTask:
      for: {}
      call: http
matthias-pichler commented 4 months ago

@matthias-pichler-warrify I removed the do with a latter edition (we should discuss this on slack, which is more agile, not here)

with that you can write

- forTask:
      for: {}
      call: http

how do I now nest loops?

matthias-pichler commented 4 months ago

@fjtirado I believe we are going in circles ... all this was discussed before

fjtirado commented 4 months ago

@matthias-pichler-warrify Since forTask is a task and forTask accept any task, you can nest loops (basically it is recursive definition)

fjtirado commented 4 months ago

@matthias-pichler-warrify I believe we need a quick chat on slack or zulip, How can I talk with you more fluently?

matthias-pichler commented 4 months ago

@matthias-pichler-warrify Since forTask is a task and forTask accept any task, you can nest loops.

it would need to be:

- forTask:
   for: {}
   do:
     - forTask2:
         for: {}
         call: http

no I need a do task just to nest loops?!

matthias-pichler commented 4 months ago

@matthias-pichler-warrify I believe we need a quick chat on slack or zulip, How can I talk with you more fluently?

Slack but how do I join the workspace?

JBBianchi commented 4 months ago

@matthias-pichler-warrify I believe we need a quick chat on slack or zulip, How can I talk with you more fluently?

Slack but how do I join the workspace?

It's the CNCF slack workspace, channel #serverless-workflow (https://communityinviter.com/apps/cloud-native/cncf I think ?)

fjtirado commented 4 months ago

Ok, after discussing it in slack. We rule out embedding task definition in for/try becuase of possible colision between task properties (although it was clarified that nested loops are allowed in that proposal ;)) and because of the swith problem (which we agree is anyway a semantic issue with arrays) As a consequence of that, there is no room for unnamed single task any longer. Therefore, lets implement https://github.com/serverlessworkflow/specification/issues/894#issuecomment-2161051207. We also discuss the possibility of having a slightly modified version of branch at top level, @matthias-pichler-warrify will give that a thought and, if needed, open a new issue after completing the PR associated with this one.

matthias-pichler commented 4 months ago

then I will get started with the PR 👍

ricardozanini commented 4 months ago

Thank you @matthias-pichler-warrify @fjtirado 🙏

matthias-pichler commented 4 months ago

we actually still need to pick a name from:

Create a new fork/branch/multitask/parallelize task, used to define a parallel (and possibly competing) task array

JBBianchi commented 4 months ago

we actually still need to pick a name from:

Create a new fork/branch/multitask/parallelize task, used to define a parallel (and possibly competing) task array

I have a little preference for fork or branch but I don't have a strong opinion about it.

fjtirado commented 4 months ago

@JBBianchi Same here