seznam / euphoria

Euphoria is an open source Java API for creating unified big-data processing flows. It provides an engine independent programming model which can express both batch and stream transformations.
Apache License 2.0

euphoria-core: multi-executor execution #44

Open xitep opened 7 years ago

xitep commented 7 years ago

It would be desirable to run a program consisting of multiple flows on different executors. Example:

PExecutor sparkExec = new SparkExecutor(mem, cores);
PExecutor inmemExec = new InMemExecutor(mem, cores);

Flow flow1 = Flow.create("spark-flow");
Flow flow2 = Flow.create("inmem-flow");

sparkExec.registerFlow(flow1);
inmemExec.registerFlow(flow2);

// allocate YARN containers for Spark runtime and execute "flow1"
sparkExec.execute();

// allocate another YARN container for in-mem runtime
inmemExec.execute();

// kill all YARN containers
sparkExec.shutdown();
inmemExec.shutdown();

The idea is that Euphoria executors should be independent of the execution engine launchers and able to allocate their own YARN containers, within which each executor would run its own execution engine.

je-ik commented 7 years ago

Do we want to incorporate a fixed dependency on YARN? I think whether a flow runs on YARN or on any other system depends on the executor's settings; i.e., you can run a Euphoria flow on Flink or Spark without YARN if you configure the executor to run in standalone mode. It would of course be desirable to be able to run multiple flows with multiple executors from a single driver program.
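
To make the two points above concrete, here is a minimal sketch of a single driver that runs two flows on two executors, with the deployment mode (YARN or standalone) treated as an executor setting rather than a fixed dependency. Every type here (`DeployMode`, `SketchFlow`, `SketchExecutor`, `MultiExecutorDriver`) is hypothetical and only mirrors the `registerFlow` / `execute` / `shutdown` lifecycle proposed in this issue; none of it is the euphoria-core API, and resource allocation is simulated by recording events.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver: one program, two flows, two executors.
class MultiExecutorDriver {
  static List<String> runAll() {
    List<String> events = new ArrayList<>();
    // Deployment mode is just a setting of each executor (assumption).
    SketchExecutor sparkExec = new SketchExecutor("spark", DeployMode.YARN, events);
    SketchExecutor inmemExec = new SketchExecutor("inmem", DeployMode.STANDALONE, events);

    sparkExec.registerFlow(new SketchFlow("spark-flow"));
    inmemExec.registerFlow(new SketchFlow("inmem-flow"));

    try {
      sparkExec.execute(); // allocates the runtime, then runs its flows
      inmemExec.execute();
    } finally {
      // Release resources even if one of the executions fails.
      sparkExec.shutdown();
      inmemExec.shutdown();
    }
    return events;
  }

  public static void main(String[] args) {
    runAll().forEach(System.out::println);
  }
}

// Hypothetical: where the executor's runtime is launched.
enum DeployMode { STANDALONE, YARN }

// Hypothetical stand-in for a flow; only carries a name.
final class SketchFlow {
  final String name;
  SketchFlow(String name) { this.name = name; }
}

// Hypothetical executor that owns the lifecycle of its own runtime.
final class SketchExecutor {
  private final String engine;
  private final DeployMode mode;
  private final List<String> events;
  private final List<SketchFlow> flows = new ArrayList<>();
  private boolean running;

  SketchExecutor(String engine, DeployMode mode, List<String> events) {
    this.engine = engine;
    this.mode = mode;
    this.events = events;
  }

  void registerFlow(SketchFlow flow) { flows.add(flow); }

  void execute() {
    if (!running) {
      // A real executor would request YARN containers or start a local runtime here.
      events.add("allocate " + engine + " (" + mode + ")");
      running = true;
    }
    for (SketchFlow flow : flows) {
      events.add("run " + flow.name + " on " + engine);
    }
  }

  void shutdown() {
    if (running) {
      events.add("release " + engine);
      running = false;
    }
  }
}
```

In this sketch the driver never refers to YARN directly; switching an executor to standalone mode changes only its constructor argument, which is the kind of decoupling discussed above.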