redpanda-data / connect

Fancy stream processing made operationally mundane
https://docs.redpanda.com/redpanda-connect/about/

Alternative workflow processor behavior when order is used #755

Open advancedlogic opened 3 years ago

advancedlogic commented 3 years ago

Currently the workflow processor runs branches either by inferring the DAG automatically or by following an explicit order. With order, each inner list is a tier whose branches run in parallel, and the tiers run one after another, so in [ [ A, B ], [ C ] ], A and B are processed simultaneously and then C is executed. When the graph becomes complex, it's hard to keep track of what runs in parallel, and it would be easier to describe the order as a graph with nodes and edges. Consider the example in the documentation:


      /--> B -------------|--> D
     /                   /
A --|          /--> E --|
     \--> C --|          \
               \----------|--> F

described as: [ [ A ], [ B, C ], [ E ], [ D, F ] ]

could become a list of edges:

[ [ A, B ], [ A, C ], [ B, D ], [ C, E ], [ E, D ], [ C, F ], [ E, F ] ]
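
To illustrate how an edge list could map back onto the existing tiered order, here is a minimal Python sketch. It is not part of Benthos; the function name edges_to_order is hypothetical, and the edges are taken as drawn in the diagram above (including E to D). Each node is placed in the first tier after all of its parents have been placed:

```python
from collections import defaultdict


def edges_to_order(edges):
    """Group nodes into tiers; a node lands one tier after its deepest parent.

    Hypothetical helper for illustration only, not a Benthos API.
    """
    parents = defaultdict(set)
    children = defaultdict(set)
    nodes = []  # every node seen, in first-seen order
    for src, dst in edges:
        for name in (src, dst):
            if name not in nodes:
                nodes.append(name)
        parents[dst].add(src)
        children[src].add(dst)

    # Number of parents still waiting to be placed, per node.
    remaining = {name: len(parents[name]) for name in nodes}
    tier = [name for name in nodes if remaining[name] == 0]
    order = []
    placed = 0
    while tier:
        order.append(sorted(tier))
        placed += len(tier)
        next_tier = []
        for name in tier:
            for child in children[name]:
                remaining[child] -= 1
                if remaining[child] == 0:
                    next_tier.append(child)
        tier = next_tier

    if placed != len(nodes):
        raise ValueError("edge list contains a cycle")
    return order


example_edges = [
    ("A", "B"), ("A", "C"), ("B", "D"), ("C", "E"),
    ("E", "D"), ("C", "F"), ("E", "F"),
]
print(edges_to_order(example_edges))
```

For these edges the result matches the documented order [ [ A ], [ B, C ], [ E ], [ D, F ] ], which suggests the edge form could be translated into the current behaviour rather than replacing it.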

jem-davies commented 5 months ago

https://github.com/benthosdev/benthos/issues/2599 - possible overlap with this issue

As part of the PR that proposes a solution to issue 2599, the way the DAG is represented in the Benthos config has been changed to:

          B:
            dependency_list: ["A"]
            processors:
              ...

So each node in the DAG has a list of its direct dependencies.
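
For comparison with the edge list proposed at the top of this issue, a per-node dependency list carries the same information. A rough Python sketch follows, assuming the config has already been parsed into a plain dict of node name to dependency list. Only B depending on A comes from the snippet above; the other entries are filled in from the diagram as an assumption, and deps_to_edges is a hypothetical helper, not a Benthos function:

```python
def deps_to_edges(dependency_lists):
    """Turn {node: [direct dependencies]} into sorted (dependency, node) edges.

    Hypothetical helper for illustration only, not a Benthos API.
    """
    edges = []
    for node, deps in dependency_lists.items():
        for dep in deps:
            if dep not in dependency_lists:
                raise KeyError(f"{node} depends on unknown node {dep}")
            edges.append((dep, node))
    return sorted(edges)


# Assumed dependency lists for the diagram in this issue.
example_deps = {
    "A": [],
    "B": ["A"],
    "C": ["A"],
    "D": ["B", "E"],
    "E": ["C"],
    "F": ["C", "E"],
}
print(deps_to_edges(example_deps))
```

Each pair reads as (dependency, node), so ("A", "B") means B waits for A.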