swsnu / swppspr2015

Repository for discussing common issues that are not project-specific, SNU SWPP Spring 2015
https://sites.google.com/site/snuswppspr2015/

REEF Troubleshooting: $HADOOP_HOME #13

Open jsjason opened 9 years ago

jsjason commented 9 years ago

Issue: YARN is running, but when a REEF job is submitted the following error appears:

2014-07-14 11:17:40,940 FINE hadoop.util.Shell.checkHadoopHome main | Failed to detect a valid hadoop home directory
java.io.IOException: HADOOP_HOME or hadoop.home.dir are not set.
    at org.apache.hadoop.util.Shell.checkHadoopHome(Shell.java:225)
[... stack trace omitted ...]

Solution: This issue appears because REEF needs to read the environment variable $HADOOP_HOME within an SSH session. You can run the following to check if it is set:

$ ssh localhost 'echo $HADOOP_HOME'

(Notice the single quotes. Don’t use double quotes, as this will expand $HADOOP_HOME in the local session, which is not what we want.) If running the command does not print your Hadoop home directory then you’ll need to do some extra configuration for SSH sessions.
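To see why the quoting matters without needing a working SSH setup, here is a small sketch that imitates the situation locally. It assumes that a child shell started with an empty environment is a fair stand-in for the shell on the other end of the SSH connection; the `remote` helper and the path `/opt/local-hadoop` are made up for illustration.

```shell
# Stand-in for "ssh localhost": run a command string in a child shell
# that starts with an empty environment (hypothetical helper).
remote() { env -i /bin/sh -c "$1"; }

# Hypothetical value that exists only in the local session.
HADOOP_HOME=/opt/local-hadoop

# Single quotes: the string is passed as-is and expanded by the "remote" shell,
# where the variable is not set.
single=$(remote 'echo $HADOOP_HOME')

# Double quotes: the local shell expands the variable before the string is sent.
double=$(remote "echo $HADOOP_HOME")

echo "single quotes: '$single'"
echo "double quotes: '$double'"
```

With double quotes you always see the local value, even when the other side has nothing set, so that form cannot tell you whether the SSH session is configured correctly.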

Each OS distribution handles this in its own way: Mac OS X will read from ~/.bashrc, while Ubuntu will read from ~/.ssh/environment. Make the necessary changes for your OS (we cover Ubuntu below) and make sure the above command works. Then, make sure to restart YARN. This is necessary because REEF reads the Hadoop home from YARN, which reads the environment variable only once, at startup.

Extended Solution: Here we cover the setup for Ubuntu, which caused the most headaches for students.

Ubuntu users need to add the line

PermitUserEnvironment yes

to the file /etc/ssh/sshd_config (PermitUserEnvironment is not enabled by default), so that sshd picks up user-defined environment variables. The file is writable only by root, so you will need to put sudo in front of your text editor command to get permission to modify it. E.g.

$ sudo vi /etc/ssh/sshd_config

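It doesn't matter where in the file you add the line; you can just add it at the end.

Next, put the environment variables in ~/.ssh/environment (create the file if it does not exist), adjusting the paths to match your own installation:

JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-amd64
HADOOP_HOME=/home/username/Projects/hadoop-2.4.0
YARN_HOME=/home/username/Projects/hadoop-2.4.0
REEF_HOME=/home/username/Projects/REEF

Then restart the SSH daemon so that the changes take effect:

$ sudo service ssh restart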
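If you want to double-check the two files before moving on, something like the following may help. It is only a sketch: `check_ssh_env` is a name invented here, and the patterns assume the simple layout shown in this post (an uncommented PermitUserEnvironment line, and plain NAME=value lines without `export`).

```shell
# check_ssh_env SSHD_CONFIG ENV_FILE   (hypothetical helper, not part of REEF or Hadoop)
# Returns 0 only if both pieces of the Ubuntu setup appear to be in place.
check_ssh_env() {
    sshd_config=$1
    env_file=$2
    status=0

    # Look for an uncommented "PermitUserEnvironment yes" line.
    if ! grep -Eiq '^[[:space:]]*PermitUserEnvironment[[:space:]]+yes([[:space:]]|$)' "$sshd_config" 2>/dev/null; then
        echo "missing: PermitUserEnvironment yes in $sshd_config"
        status=1
    fi

    # Look for a non-empty HADOOP_HOME=... line at the start of a line.
    if ! grep -q '^HADOOP_HOME=.' "$env_file" 2>/dev/null; then
        echo "missing: HADOOP_HOME=... in $env_file"
        status=1
    fi

    return $status
}
```

You could call it as `check_ssh_env /etc/ssh/sshd_config ~/.ssh/environment && echo "looks good"`. It only inspects the files, so the real test is still the `ssh localhost 'echo $HADOOP_HOME'` command from above.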

Finally, don’t forget to restart YARN!
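For reference, a restart could look roughly like the sketch below. It assumes the standard Hadoop 2.x layout, where stop-yarn.sh and start-yarn.sh live under the sbin directory of your Hadoop installation; `restart_yarn` is just a convenience name made up for this post, so adapt it if you start YARN some other way.

```shell
# restart_yarn [HADOOP_DIR]   (hypothetical helper)
# Stops and then starts YARN using the scripts under HADOOP_DIR/sbin.
# Falls back to $HADOOP_HOME when no directory is given.
restart_yarn() {
    hadoop_dir=${1:-$HADOOP_HOME}
    if [ -z "$hadoop_dir" ]; then
        echo "HADOOP_HOME is not set and no directory was given" >&2
        return 1
    fi

    # A failed stop (e.g. YARN was not running) should not prevent the start.
    "$hadoop_dir/sbin/stop-yarn.sh" || echo "stop-yarn.sh reported an error; starting anyway" >&2
    "$hadoop_dir/sbin/start-yarn.sh"
}
```

Usage would be `restart_yarn` (with HADOOP_HOME set in your current shell) or `restart_yarn /home/username/Projects/hadoop-2.4.0`. After the restart, submit the REEF job again to confirm that the error is gone.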

jsjason commented 9 years ago

written by @dafrista and me.