Closed sdudoladov closed 8 years ago
Whoever takes on this issue can look at the existing System bean implementations (e.g. Spark, Flink, or HDFS2) as a reference.
A commit which has more or less everything necessary to add a system is the one which adds support for Yarn.
I would be happy to give further hints if needed.
I'll try to create the initial prototype for REEF support.
@zerg-junior I might need some help identifying the necessary configuration parameters. Can you give me some pointers for where to find how to configure REEF for a cluster environment?
pointers for where to find how to configure REEF for a cluster environment
REEF needs a resource manager such as YARN or Mesos to run in a cluster. Supporting just YARN should be enough for our purposes.
This tutorial describes how to use REEF with YARN (look at slides 37-38 in the tutorial presentation). REEF provides HelloREEFYarn to test this installation.
Peel already supports Yarn 2.4.1 (bean name yarn-2.4.1) and Yarn 2.7.1 (bean name yarn-2.7.1).
@zerg-junior it seems that you just want to submit a Java job to Yarn which has the required REEF dependencies included in a fat jar. In this case you can already do it without further modifications in Peel.
@zerg-junior Forget what I said. Based on the slides you point to, the best way is to have two separate beans.
REEF class extending System (available via reef-0.13.0) that [...] yarn-2.7.1.
REEFExperiment class extending Experiment that [...]
AFAICT there is no need to configure any entries in the System#configuration method for the REEF system.
@carabolic I'll look into this issue, but I'll need some help with Peel.
I'll create the skeleton, that is extending System and Experiment.
I'll create the skeleton, that is extending System and Experiment.
I am actually doing this right now. Let's meet early next week to split the work.
Here you go: I've created a branch 76-REEF; see especially the last commit https://github.com/carabolic/peel/commit/c3b757bc4c4c9fab01f5309b1287de466d337248 .
I've left some comments in the skeleton. I really think that you need to implement only one method based on Slide 38 and can leave everything else empty. You can also check the LogCollection trait that can be mixed in your implementation in case you want to collect some REEF nodes.
I am currently working on the Yarn system in order to enable the collection of the user log files (currently only yarn-specific log files are collected and copied into the results folder).
When I'm done, the results folder should also contain the user log files (as retrieved via yarn logs -applicationId <application ID>).
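For illustration, here is a minimal sketch of how the log-retrieval command mentioned above could be assembled before it is handed to a process launcher. The class name YarnLogsCommand and its method are hypothetical and are not part of Peel or Yarn; the sketch only assumes that a yarn binary is on the PATH and that application IDs follow the usual application_<cluster timestamp>_<sequence number> form.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Peel): builds the argument vector for
// `yarn logs -applicationId <application ID>`.
final class YarnLogsCommand {
    private YarnLogsCommand() {}

    static List<String> build(String applicationId) {
        // Reject anything that does not look like a Yarn application ID,
        // e.g. application_1449220589084_0001.
        if (applicationId == null || !applicationId.matches("application_\\d+_\\d+")) {
            throw new IllegalArgumentException("not a Yarn application ID: " + applicationId);
        }
        List<String> argv = new ArrayList<>();
        argv.add("yarn"); // assumption: `yarn` is resolvable on the PATH
        argv.add("logs");
        argv.add("-applicationId");
        argv.add(applicationId);
        return argv;
    }
}
```

The resulting list can be passed to java.lang.ProcessBuilder, with the standard output redirected to a file in the results folder.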
@verbit I guess this assumes that you are submitting Yarn applications with
./yarn jar MAIN_CLASS args
Do you have an Experiment bean for that?
OK. I've implemented the missing methods for REEFExperiment in my last commit as explained in the BOSS Tutorial on slide 38. The assumption is that the REEF job is somewhere in the Yarn classpath (yarn classpath), the apps/ directory in the Peel $BUNDLE_BIN, or the lib/ folder in the Peel $BUNDLE_BIN.
But it turns out the new recommended way to start REEF on Yarn would be to use yarn jar [REEF_JOB] [params]* as pointed out in their documentation. Therefore I think adding YarnExperiment to Peel would be a better way to run REEF experiments with Peel.
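As a rough sketch of what such a submission boils down to, the following helper assembles the argument vector for yarn jar [REEF_JOB] [params]*. The class YarnJarCommand is hypothetical and does not reflect the actual YarnExperiment in Peel; the jar path in the usage note is likewise made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not Peel's YarnExperiment): builds the argument
// vector for `yarn jar [REEF_JOB] [params]*`.
final class YarnJarCommand {
    private YarnJarCommand() {}

    static List<String> build(String yarnBinary, String jobJar, List<String> params) {
        if (jobJar == null || jobJar.isEmpty()) {
            throw new IllegalArgumentException("jobJar must not be empty");
        }
        List<String> argv = new ArrayList<>();
        argv.add(yarnBinary); // e.g. "yarn" or an absolute path to the binary
        argv.add("jar");
        argv.add(jobJar);     // fat jar containing the REEF job and its dependencies
        argv.addAll(params);  // job parameters, passed through unchanged
        return argv;
    }
}
```

For example, YarnJarCommand.build("yarn", "apps/hello-reef.jar", params) yields a list that can be given to java.lang.ProcessBuilder. Passing the arguments as a list rather than as one concatenated string avoids shell quoting problems with parameters that contain spaces.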
@carabolic you can omit the lib folder, this is only related to Peel. But I agree, if we can handle this through a YarnExperiment I think the effort should go there. This implies that the apps folder will contain REEF applications as fat jars.
With a working implementation for YarnExperiment for #79 I don't see any reason to leave this open or have a REEFExperiment at all.
It might be valuable to have a REEF system though, such that we could use the configuration to specify the resource manager (e.g. Yarn, Mesos) per experiment.
But for the simple use-case of running a REEF job on Yarn #79 is sufficient as shown in my example.
@carabolic agreed. I don't think that the system will help a lot as parts of the configuration like the experiment bean will have to be RM-dependent anyway. I think we can close this as a wontfix as the required functionality will be basically provided by #79.
Apache REEF aims to be a standard library for developing distributed systems. Running experiments with REEF is on the critical path of at least two DIMA projects. So we may want to integrate it into Peel to ease experimenting.