GiraffaFS / giraffa

Giraffa FileSystem (Slack: giraffa-fs.slack.com)
https://giraffa.ci.cloudbees.com
Apache License 2.0

Support YARN #91

Closed by shvachko 9 years ago

shvachko commented 9 years ago

Original issue 91 created by shvachko on 2014-10-11T00:18:57.000Z:

We need to create our own yarn-daemon.sh and yarn scripts that call the original ones with the proper Giraffa classpath and configuration directory. This will also require importing the yarn configurations into the Giraffa config directory. Then with Issue 90 completed we can (hopefully) run Yarn jobs on Giraffa.

shvachko commented 9 years ago

Comment #1 originally posted by shvachko on 2014-10-17T07:46:05.000Z:

Copied the Yarn configs into the Giraffa configuration directory and created three scripts:

- giraffa-env.sh for specifying Hadoop Home, HBase Home, and the logging level
- yarn-giraffa-daemon.sh for launching yarn-daemon using the Giraffa classpath and configs
- yarn-giraffa for launching the yarn script using the Giraffa classpath and configs
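For readers without the patch at hand, a wrapper in the spirit of yarn-giraffa-daemon.sh might look roughly like the sketch below. This is not the committed script: the variable names GIRAFFA_HOME, GIRAFFA_CONF_DIR, and GIRAFFA_DRY_RUN, the lib/ layout, and the default paths are all assumptions made for illustration. It also assumes that the stock Hadoop 2.x scripts honor HADOOP_CLASSPATH and accept `--config <dir>`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a yarn-giraffa-daemon.sh style wrapper (not the actual patch).
# GIRAFFA_HOME, GIRAFFA_CONF_DIR, GIRAFFA_DRY_RUN and the lib/ layout are assumed names.

# Defaults; per the comment above, such settings would normally come from giraffa-env.sh.
: "${GIRAFFA_HOME:=/opt/giraffa}"
: "${GIRAFFA_CONF_DIR:=$GIRAFFA_HOME/conf}"
: "${HADOOP_HOME:=/opt/hadoop}"

# Print the command line that delegates to the stock yarn-daemon.sh.
giraffa_yarn_daemon_cmd() {
  printf '%s --config %s' "$HADOOP_HOME/sbin/yarn-daemon.sh" "$GIRAFFA_CONF_DIR"
  printf ' %s' "$@"
  printf '\n'
}

# Prepend the Giraffa jars to the classpath and point Yarn at the Giraffa configs.
export HADOOP_CLASSPATH="$GIRAFFA_HOME/lib/*${HADOOP_CLASSPATH:+:$HADOOP_CLASSPATH}"
export YARN_CONF_DIR="$GIRAFFA_CONF_DIR"

if [ "${GIRAFFA_DRY_RUN:-1}" = "1" ]; then
  # Dry run (the default in this sketch): only show what would be executed.
  giraffa_yarn_daemon_cmd "$@"
else
  exec "$HADOOP_HOME/sbin/yarn-daemon.sh" --config "$GIRAFFA_CONF_DIR" "$@"
fi
```

With this sketch, `GIRAFFA_DRY_RUN=0 ./yarn-giraffa-daemon.sh start resourcemanager` would hand off to the original yarn-daemon.sh using the Giraffa configuration directory; leaving the dry-run default in place only prints the delegated command.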

Also put default Yarn and MapReduce configurations into yarn-site.xml and mapred-site.xml so that users can run MapReduce jobs on Yarn right out of the box.

NOTE: Property "mapreduce.terasort.simplepartitioner" is set to "true" in mapred-site.xml so that users can run Terasort from the mapreduce examples jar. Otherwise, TotalOrderPartitioner is used, which requires DistributedCache support, and DistributedCache in turn requires support for URIs with fragments in order to create symlinks.

NOTE: Running Yarn jobs requires, in addition to this issue, the completion of Issue 90 and Issue 92.

shvachko commented 9 years ago

Comment #2 originally posted by shvachko on 2014-10-17T07:47:23.000Z:

Sorry, patch attached here. Also, I don't like the names yarn-giraffa and yarn-giraffa-daemon.sh but I couldn't think of anything better. Any suggestions?

shvachko commented 9 years ago

Comment #3 originally posted by shvachko on 2014-10-17T08:00:38.000Z:

<empty>

shvachko commented 9 years ago

Comment #4 originally posted by shvachko on 2014-10-17T08:16:39.000Z:

I can confirm that teragen, terasort, and teravalidate (from the hadoop-mapreduce-examples jar) all complete successfully on a standalone Giraffa cluster with Yarn, after applying the patches for Issue 90, Issue 91, and Issue 92.
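For anyone who wants to repeat this check, the sequence could be driven roughly as in the sketch below. It is only an illustration: the jar location (EXAMPLES_JAR), the row count (ROWS), the /tera paths, and the RUN switch are placeholders rather than the values used in the run above, and it assumes yarn-giraffa is on the PATH and forwards its arguments to the stock yarn script.

```shell
#!/usr/bin/env bash
# Hypothetical smoke test; EXAMPLES_JAR, ROWS, RUN and the /tera paths are placeholders.
: "${EXAMPLES_JAR:=hadoop-mapreduce-examples.jar}"
: "${ROWS:=1000}"
# RUN defaults to "echo" so the commands are only printed; set RUN= (empty) to execute them.
: "${RUN=echo}"

tera_smoke_test() {
  # Generate input, sort it, then validate the sorted output; stop at the first failure.
  $RUN yarn-giraffa jar "$EXAMPLES_JAR" teragen "$ROWS" /tera/in &&
  $RUN yarn-giraffa jar "$EXAMPLES_JAR" terasort /tera/in /tera/out &&
  $RUN yarn-giraffa jar "$EXAMPLES_JAR" teravalidate /tera/out /tera/report
}

tera_smoke_test
```

By default the script only prints the three commands, which makes it easy to review paths before running against a real cluster.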

shvachko commented 9 years ago

Comment #5 originally posted by shvachko on 2014-10-31T20:44:17.000Z:

Patch rebased with trunk.

shvachko commented 9 years ago

Comment #6 originally posted by shvachko on 2014-10-31T21:48:28.000Z:

+1; let's get this in. As part of this issue, would you also mind updating the documentation to explain how to set up YARN on Giraffa?

shvachko commented 9 years ago

Comment #7 originally posted by shvachko on 2014-10-31T22:32:40.000Z:

Committed to trunk. Documentation updated. See https://code.google.com/a/apache-extras.org/p/giraffa/wiki/HowTo#YARN_Setup.