Integrate Apache REEF into Peel

peelframework / peel

Peel is a framework that helps you to define, execute, analyze, and share experiments for distributed systems and algorithms.

http://peel-framework.org

Apache License 2.0

Integrate Apache REEF into Peel #76

Closed sdudoladov closed 8 years ago

sdudoladov commented 8 years ago

Apache REEF aims to be a standard library for developing distributed systems. Running experiments with REEF is on the critical path of at least two DIMA projects. So we may want to integrate it into Peel to ease experimenting.

aalexandrov commented 8 years ago

Whoever takes on this issue can take a look at existing System bean implementations (e.g. Spark, Flink, or HDFS2) as a reference.

A commit which has more or less everything necessary to add a system is the one which adds support for Yarn.

I would be happy to give further hints if needed.

carabolic commented 8 years ago

I'll try to create the initial prototype for REEF support.

@zerg-junior I might need some help identifying the necessary configuration parameters. Can you give me some pointers for where to find how to configure REEF for a cluster environment?

sdudoladov commented 8 years ago

pointers for where to find how to configure REEF for a cluster environment

REEF needs a resource manager such as YARN or Mesos to run in a cluster. Supporting just YARN should be enough for our purposes.

This tutorial describes how to use REEF with YARN (look at slides 37-38 in the tutorial presentation). REEF provides HelloREEFYarn to test this installation.

aalexandrov commented 8 years ago

Peel already supports Yarn 2.4.1 (bean name yarn-2.4.1) and Yarn 2.7.1 (bean name yarn-2.7.1).

aalexandrov commented 8 years ago

@zerg-junior it seems that you just want to submit a Java job to Yarn which has the required REEF dependencies included in a fat jar. In this case you can already do it without further modifications in Peel.

aalexandrov commented 8 years ago

@zerg-junior Forget what I said. Based on the slides you point to, the best way is to have two separate beans.

A REEF class extending System (available via reef-0.13.0) that
- handles the download and installation of the REEF binaries, and
- depends on yarn-2.7.1.
A REEFExperiment class extending Experiment that
- submits a Yarn job as shown on Slide 38 (top), and
- potentially also collects logs as shown on Slide 38 (bottom).

AFAICT there is no need to configure any entries in the System#configuration method for the REEF system.

sdudoladov commented 8 years ago

@carabolic I'll look into this issue, but I'll need some help with Peel.

carabolic commented 8 years ago

I'll create the skeleton, that is extending System and Experiment.

sdudoladov commented 8 years ago

I'll create the skeleton, that is extending System and Experiment.

I am actually doing this right now. Let's meet early next week to split the work.

carabolic commented 8 years ago

Here you go: I've created a branch 76-REEF; see especially the last commit https://github.com/carabolic/peel/commit/c3b757bc4c4c9fab01f5309b1287de466d337248 .

aalexandrov commented 8 years ago

I've left some comments in the skeleton. I really think that you need to implement only one method based on Slide 38 and can leave everything else empty. You can also check the LogCollection trait that can be mixed into your implementation in case you want to collect some REEF nodes.

verbit commented 8 years ago

I am currently working on the Yarn system in order to enable the collection of the user log files (currently only yarn-specific log files are collected and copied into the results folder). When I'm done, the results folder should also contain the user log files (as retrieved via yarn logs -applicationId <application ID>).
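For illustration, the retrieval step mentioned above could look roughly like the following shell sketch. Only `yarn logs -applicationId` comes from the comment; the function names, the `DRY_RUN` switch, and the results-folder layout are assumptions made for this example and are not Peel's actual implementation. Note that `yarn logs` generally only returns output when log aggregation is enabled on the cluster.

```shell
# Sketch: fetch the aggregated user logs of one YARN application.
# The results folder layout and file name are hypothetical.

# Print the command that retrieves the logs for the given application ID.
yarn_logs_cmd() {
  local app_id="$1"
  printf 'yarn logs -applicationId %s' "$app_id"
}

# Store the logs under <results_dir>/<app_id>.log.
# With DRY_RUN=1 the command is only printed, not executed.
collect_user_logs() {
  local app_id="$1" results_dir="$2"
  local out="$results_dir/$app_id.log"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$(yarn_logs_cmd "$app_id") > $out"
    return 0
  fi
  mkdir -p "$results_dir"
  yarn logs -applicationId "$app_id" > "$out"
}
```

A call such as `collect_user_logs <application ID> results/` would then leave one log file per application in the results folder.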

aalexandrov commented 8 years ago

@verbit I guess this assumes that you are submitting Yarn applications with

./yarn jar MAIN_CLASS args

Do you have an Experiment bean for that?

carabolic commented 8 years ago

OK. I've implemented the missing methods for REEFExperiment in my last commit as explained in the BOSS Tutorial on slide 38. The assumption is that the REEF job is somewhere in the Yarn classpath (yarn classpath), the apps/ directory in the Peel $BUNDLE_BIN, or the lib/ folder in the Peel $BUNDLE_BIN.

But it turns out the new recommended way to start REEF on Yarn would be to use yarn jar [REEF_JOB] [params]* as pointed out in their documentation. Therefore I think adding YarnExperiment to Peel would be a better way to run REEF experiments with Peel.
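As a rough sketch of what such a submission amounts to on the command line, the snippet below wraps `yarn jar [REEF_JOB] [params]*` in two small shell functions. The function names, the `DRY_RUN` switch, and the example jar path are hypothetical and only serve the illustration; this is not the YarnExperiment implementation.

```shell
# Sketch: submit a REEF job packaged as a fat jar via `yarn jar`.
# The jar path and parameters are hypothetical placeholders.

# Build the submission command line: yarn jar <jar> [params]*
reef_submit_cmd() {
  local jar="$1"
  shift
  printf 'yarn jar %s' "$jar"
  local arg
  for arg in "$@"; do
    printf ' %s' "$arg"
  done
}

# Submit the job; with DRY_RUN=1 only print the command that would run.
submit_reef_job() {
  local jar="$1"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    reef_submit_cmd "$@"
    echo
    return 0
  fi
  # Fail early if the fat jar is missing.
  [ -f "$jar" ] || { echo "jar not found: $jar" >&2; return 1; }
  shift
  yarn jar "$jar" "$@"
}
```

For example, `DRY_RUN=1 submit_reef_job apps/my-reef-job.jar arg1` prints the command without contacting the cluster, which is handy for checking the argument order first.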

aalexandrov commented 8 years ago

@carabolic you can omit the lib folder, this is only related to Peel. But I agree, if we can handle this through a YarnExperiment I think the effort should go there. This implies that the apps folder will contain REEF applications as fat jars.

carabolic commented 8 years ago

With a working implementation for YarnExperiment for #79 I don't see any reason to leave this open or have a REEFExperiment at all.

It might be valuable to have a REEF system though, such that we could use the configuration to specify the resource manager (e.g. Yarn, Mesos) per experiment.

But for the simple use-case of running a REEF job on Yarn #79 is sufficient as shown in my example.

aalexandrov commented 8 years ago

@carabolic agreed. I don't think that the system will help a lot as parts of the configuration like the experiment bean will have to be RM-dependent anyway. I think we can close this as a wontfix as the required functionality will be basically provided by #79.