Open xunxky opened 3 years ago
Hi, a little feedback. I have implemented this (at least for our needs, and incompatible with the bonobo "library") in both sync and async variants. It is only a straightforward chain without any branching (although it could actually be nested). The more I look into this, the more I believe the basic approach of assuming some "graph" is too academic.
Please have a look at GStreamer, where they use sources and sinks to redirect data flow. In some instances (e.g. grouping, counting, ...) the sinks need to know when the last element has been sent so that their adjacent source can emit the computed result.
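To illustrate the end-of-stream point with plain Python generators, here is a rough sketch (not our actual code, and not bonobo's or GStreamer's API; `source` and `count_sink` are made-up names). It assumes the stage is a generator that consumes its whole input, so the exhaustion of the upstream iterator serves as the "last element has been sent" signal:

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def source(words: Iterable[str]) -> Iterator[str]:
    """Hypothetical source: yields items one at a time."""
    yield from words


def count_sink(items: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Hypothetical counting stage: consumes everything, then emits."""
    counts: Counter = Counter()
    for item in items:
        counts[item] += 1
    # The loop above only ends once upstream is exhausted. That is the
    # point where the computed result can safely be emitted downstream.
    yield from sorted(counts.items())


result = list(count_sink(source(["a", "b", "a"])))
print(result)  # [('a', 2), ('b', 1)]
```

This only works when "end of data" means "end of the whole stream". As soon as the result has to be emitted per group or per document, the stage needs an explicit marker inside the stream.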
Thanks for the generator-based approach of bonobo. However, we hit several limitations with bonobo, most of which I could circumvent. But the most recent one seems to me like an important feature that I have not yet seen implemented natively in any ETL.
digraph G {
  subgraph cluster {
    node [style=filled];
    "add date" -> "add id";
    label = "loop until split yields EOD";
    color=blue
  }
  "document split" -> "add date"
  "add id" -> "document unsplit"
}
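One possible reading of this graph, sketched with plain generators. Everything here is an assumption made for illustration: the `EOD` sentinel, the function names (taken from the node labels), and the shape of the documents. It is not a bonobo feature. The split stage emits the parts of one document followed by an end-of-document marker, the inner stages pass the marker through untouched, and the unsplit stage reassembles a document each time it sees the marker:

```python
from itertools import count
from typing import Any, Dict, Iterable, Iterator

EOD = object()  # hypothetical end-of-document marker


def document_split(docs: Iterable[Dict[str, Any]]) -> Iterator[Any]:
    """Emit each line of a document as a row, then EOD."""
    for doc in docs:
        for line in doc["lines"]:
            yield {"text": line}
        yield EOD


def add_date(rows: Iterable[Any], date: str) -> Iterator[Any]:
    """Add a date to every row; pass the EOD marker through unchanged."""
    for row in rows:
        yield row if row is EOD else {**row, "date": date}


def add_id(rows: Iterable[Any]) -> Iterator[Any]:
    """Add a running id to every row; pass the EOD marker through unchanged."""
    ids = count(1)
    for row in rows:
        yield row if row is EOD else {**row, "id": next(ids)}


def document_unsplit(rows: Iterable[Any]) -> Iterator[Dict[str, Any]]:
    """Collect rows until EOD, then emit the reassembled document."""
    parts = []
    for row in rows:
        if row is EOD:
            yield {"lines": parts}
            parts = []
        else:
            parts.append(row)


docs = [{"lines": ["a", "b"]}, {"lines": ["c"]}]
pipeline = document_unsplit(add_id(add_date(document_split(docs), "2020-01-01")))
out = list(pipeline)
print(len(out))  # 2
```

The drawback of this sketch is that every inner stage has to know about the marker and forward it, which is exactly the kind of bookkeeping I would expect the framework to handle natively.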