Pipeline Forking - Githubissues

massadop commented 3 years ago

Hi Joel, Thank you very much for the fix. TDP works fine now. I would be very interested in the "Forking (parallel task)" item mentioned at the end of the TODO list. I am not completely sure whether this "forking" means adding a new branch to the pipeline, so that one input can pass its data to more than one stage or pipe and, at the end, produce a different output for each branch. Brgds

JoelFilho commented 3 years ago

Hello, the idea of forking is exactly that: feeding the output of a pipeline into multiple others.

However, this kind of operation incurs some complications:

• The DSL becomes really complicated, especially when joining back
• How do we synchronize the input into multiple pipelines? (Which brings some overhead into the library, if we need to release them simultaneously)
• Putting parallel pipelines inside a pipeline? That's not a pipeline anymore, it's a flow graph! There are more complete tools that deal with such a thing, like Intel TBB. Notice how much more complex it is to define a graph there, when compared to a pipeline.
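To give a rough idea of the bookkeeping that joining back implies, here is a minimal single-threaded sketch of a join step. It is neither TDP nor TBB code: `Join`, `push_left` and `push_right` are hypothetical names made up for illustration, and a real implementation would also need synchronization, since each branch would push from its own thread.

```cpp
#include <functional>
#include <queue>
#include <utility>

// Hypothetical join step (not part of TDP or TBB): pairs the i-th result of
// one branch with the i-th result of the other, whichever arrives first.
// Single-threaded on purpose; real branches would require locking here.
class Join {
public:
    explicit Join(std::function<void(int, int)> sink) : sink_(std::move(sink)) {}

    void push_left(int v)  { left_.push(v);  flush(); }
    void push_right(int v) { right_.push(v); flush(); }

private:
    // Emit pairs for as long as both branches have a pending result.
    void flush() {
        while (!left_.empty() && !right_.empty()) {
            const int l = left_.front();
            const int r = right_.front();
            left_.pop();
            right_.pop();
            sink_(l, r);
        }
    }

    std::queue<int> left_;
    std::queue<int> right_;
    std::function<void(int, int)> sink_;
};
```

Even this toy version has to buffer results from the faster branch until the slower one catches up, which is state a plain linear pipeline never needs.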

So, the things listed as "Possible features" on the TODO are just ideas of future features, after the 1.0 release of TDP, and there are still no plans for those features.

If you need to fork pipelines without needing to rejoin them, and still want to use TDP, it's fairly easy to compose a series of pipelines with your own dispatcher, adding it as a consumer to the main pipeline, e.g.:

auto square_pipe = tdp::input<int> >> square >> tdp::consumer{print};
auto increment_pipe = tdp::input<int> >> increment >> tdp::consumer{print};

auto forker = [&](int in){
    square_pipe.input(in);
    increment_pipe.input(in);
};

auto my_pipe = tdp::input<int, int> >> add >> tdp::consumer{forker};

(Note that the structures in TDP are single-producer, single-consumer. In this example the forker is already the producer for both branches, so you cannot also call square_pipe.input() and increment_pipe.input() from the scope where they are declared. To prevent this kind of problem, you would instead wrap everything in a self-contained class or use the ownership wrappers.)
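As a rough illustration of the "self-contained class" idea, the sketch below hides the two branches behind a single input. It does not use TDP: `Branch` is a hypothetical synchronous stand-in for a TDP pipeline, and `ForkedPipeline` is an assumed name, so treat it as a picture of the encapsulation rather than of the library's API.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-in for a TDP pipeline: one input, one stage, one consumer.
// It runs synchronously; a real TDP pipeline runs its stages on separate threads.
class Branch {
public:
    Branch(std::function<int(int)> stage, std::function<void(int)> consumer)
        : stage_(std::move(stage)), consumer_(std::move(consumer)) {}

    void input(int v) { consumer_(stage_(v)); }

private:
    std::function<int(int)> stage_;
    std::function<void(int)> consumer_;
};

// Self-contained fork: the branches are private members, so the only way to
// feed them is through input(), which keeps a single producer per branch.
class ForkedPipeline {
public:
    explicit ForkedPipeline(std::function<void(int)> sink)
        : square_branch_([](int x) { return x * x; }, sink),
          increment_branch_([](int x) { return x + 1; }, sink) {}

    // Mirrors my_pipe above: add the two inputs, then dispatch the sum
    // to both branches.
    void input(int a, int b) {
        const int sum = a + b;
        square_branch_.input(sum);
        increment_branch_.input(sum);
    }

private:
    Branch square_branch_;
    Branch increment_branch_;
};
```

Because the branches are not reachable from outside the class, no other code can become a second producer for them by accident.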

massadop commented 3 years ago

Hi,

Thank you very much for the details. I shall have a look at your proposal. Brgds

(In reply to Joel Filho's comment of 9 Jun 2021, 20:10 +0200.)

JoelFilho / TDP

Pipeline Forking #12