hemenkapadia / vagrant-hadoop-cluster

Deploying hadoop in a virtualized cluster in simple steps

Install Apache Hadoop on the cluster. #7

Closed hemenkapadia closed 9 years ago

hemenkapadia commented 9 years ago

Install Hadoop on the cluster and verify it with a complete MapReduce job.

We will install version 2.6.1: as a later point release it should be more stable than 2.7.1, which is the first stable release in the 2.7.x line.

hemenkapadia commented 9 years ago

Completed Hadoop configuration in commit 9e301253a6ae3b8ac34856abdd65d921fe28476e.

Some aspects still to be taken care of:

  1. Ensure that the bin and sbin directories of Hadoop are on the PATH for all users.
  2. The method by which we are setting JAVA_HOME (via the profile files) is not working: the profile files come into play only for interactive bash logins, and do not take effect when Hadoop communicates internally over non-interactive SSH. The solution is to use .bashrc instead; see the sketch after this list.
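
For example, a minimal sketch of the .bashrc approach (the JDK path below is an assumption for illustration; use whatever path the provisioning scripts actually install Java to):

# /home/hadoop/.bashrc -- bash sources this for non-interactive shells
# started over SSH, so Hadoop's internal invocations pick it up.
# Place near the top, before any "return if non-interactive" guard.
# NOTE: the JDK path is assumed, not taken from this repo.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export PATH="${PATH}:${JAVA_HOME}/bin"
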
hemenkapadia commented 9 years ago

There is a possible error in the way I am setting the configuration in the above commit. Try placing the conf directory at the /opt/hadoop-&lt;version&gt;/conf level instead of /opt/hadoop-&lt;version&gt;/etc/hadoop/conf.

This is based on reading the shell script below, since I am getting constant errors with start-dfs.sh:

hadoop@master:/opt/hadoop-2.6.1/libexec$ cat hadoop-config.sh
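
The relevant logic is roughly the following (paraphrased from Hadoop 2.x's libexec/hadoop-config.sh, not verbatim): the script prefers a top-level conf directory when one exists, which is why moving the files there should work.

# Paraphrased from Hadoop 2.x libexec/hadoop-config.sh:
# prefer $HADOOP_PREFIX/conf if it contains hadoop-env.sh,
# otherwise fall back to $HADOOP_PREFIX/etc/hadoop.
if [ -e "${HADOOP_PREFIX}/conf/hadoop-env.sh" ]; then
  DEFAULT_CONF_DIR="conf"
else
  DEFAULT_CONF_DIR="etc/hadoop"
fi
export HADOOP_CONF_DIR="${HADOOP_CONF_DIR:-${HADOOP_PREFIX}/${DEFAULT_CONF_DIR}}"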

hemenkapadia commented 9 years ago

On issuing hdfs namenode -format, in the flurry of output we have this line:

15/10/20 23:18:27 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.

which seems to indicate that more configuration is needed; the NameNode storage directory is defaulting to a location under /tmp.
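
A sketch of the hdfs-site.xml properties that move storage out of /tmp (the /home/hadoop/hdfs paths are illustrative assumptions, not values committed in this repo):

<!-- hdfs-site.xml: persistent storage locations (example paths assumed) -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///home/hadoop/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///home/hadoop/hdfs/datanode</value>
</property>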

hemenkapadia commented 9 years ago

start-dfs.sh brings up HDFS properly.

The NameNode web UI at http://192.168.48.10:50070/dfshealth.html#tab-overview is a good place to check it.
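
A command-line check that confirms the same (standard HDFS admin command, run as the hadoop user on the master):

# Lists live DataNodes and cluster capacity, confirming HDFS is up
hdfs dfsadmin -report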

hemenkapadia commented 9 years ago

YARN is not coming up due to the following error:

2015-10-21 11:45:06,053 INFO  [main] resourcemanager.RMNMInfo (RMNMInfo.java:<init>(63)) - Registered RMNMInfo MBean
2015-10-21 11:45:06,057 INFO  [main] util.HostsFileReader (HostsFileReader.java:refresh(129)) - Refreshing hosts (include/exclude) list
2015-10-21 11:45:06,063 INFO  [main] conf.Configuration (Configuration.java:getConfResourceAsInputStream(2311)) - capacity-scheduler.xml not found
2015-10-21 11:45:06,316 INFO  [main] service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler failed in state INITED; cause: java.lang.IllegalStateException: Queue configuration missing child queue names for root
java.lang.IllegalStateException: Queue configuration missing child queue names for root
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:560)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:465)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:297)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:326)

A possible solution is to have the capacity-scheduler.xml file in the conf directory. For now I am moving it manually from etc/hadoop to conf.
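
Something along these lines, with the paths used elsewhere in this thread:

# Copy the default CapacityScheduler config into the active conf directory
# (a copy keeps the original in etc/hadoop; the move above is equivalent)
cp /opt/hadoop-2.6.1/etc/hadoop/capacity-scheduler.xml /opt/hadoop-2.6.1/conf/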

hemenkapadia commented 9 years ago

Cool, that did the trick. A quick way to check that all daemons are up and running:

hadoop@master:/opt/hadoop-2.6.1/sbin$ jps | grep -v Jps
5231 ResourceManager
4479 SecondaryNameNode
4291 NameNode
hadoop@master:/opt/hadoop-2.6.1/sbin$ ssh hadoop1.local jps | grep -v Jps
2189 DataNode
2709 NodeManager
hadoop@master:/opt/hadoop-2.6.1/sbin$ ssh hadoop2.local jps | grep -v Jps
2581 NodeManager
2194 DataNode
hadoop@master:/opt/hadoop-2.6.1/sbin$ ssh hadoop3.local jps | grep -v Jps
2589 NodeManager
2181 DataNode
hadoop@master:/opt/hadoop-2.6.1/sbin$

Alternatively, browse to the links below:

  * HDFS - http://192.168.48.10:50070/dfshealth.html#tab-overview
  * YARN - http://192.168.48.10:8088/cluster/cluster

hemenkapadia commented 9 years ago

Now to run a MapReduce job, I gave this command:

yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 32 10000

but encountered the following error:

Container launch failed for container_1445454060656_0002_01_000004 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)

The post below may have the answer: https://dataheads.wordpress.com/2013/11/21/hadoop-2-setup-on-64-bit-ubuntu-12-04-part-1/
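
For reference, the standard Hadoop 2.x fix is to declare the shuffle auxiliary service in yarn-site.xml on every NodeManager; a sketch of the stock settings (not yet confirmed on this cluster):

<!-- yarn-site.xml: enable the MapReduce shuffle aux service on NodeManagers -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>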

hemenkapadia commented 9 years ago

Hit this really silly issue: http://unix.stackexchange.com/questions/237823/why-is-my-shell-variable-concatenation-not-working-on-ubuntu-bash.

I need to run dos2unix each time before I provision. Also, find out whether Brackets (the editor) is mangling the line endings; if not, then it is definitely git, so fix its autocrlf setting...
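
A sketch of the usual git-side fix (which core.autocrlf value is right depends on where the files are edited; "input" suits files consumed by Linux guests):

# Keep LF line endings in the working tree so the provisioning shell
# scripts run unmodified on the Linux guests
git config --global core.autocrlf input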

hemenkapadia commented 9 years ago

By changing the HDFS storage location to be under /home/hadoop instead of /tmp, the issue of corrupt HDFS after a restart of the cluster seems to have been resolved.

Get the PATH fixed (see the sketch below) and we should be good to close this issue.
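
Per item 1 in the earlier checklist, something like this in each user's .bashrc should do it (install prefix as used throughout this thread; placed near the top of the file, per the earlier JAVA_HOME note):

# Put Hadoop's bin and sbin directories on the PATH
export HADOOP_PREFIX=/opt/hadoop-2.6.1
export PATH="${PATH}:${HADOOP_PREFIX}/bin:${HADOOP_PREFIX}/sbin"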