Completed Hadoop configuration in commit 9e301253a6ae3b8ac34856abdd65d921fe28476e.
Some aspects still need to be taken care of.
There is a possible error in the way I am setting the configuration in the above commit. Try with the conf directory at the /opt/hadoop-version/conf level instead of /opt/hadoop-version/etc/hadoop/conf.
This is based on reading the shell script below, since I keep getting a constant error from start-dfs.sh:
hadoop@master:/opt/hadoop-2.6.1/libexec$ cat hadoop-config.sh
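From what I can tell, hadoop-config.sh prefers $HADOOP_PREFIX/conf when it exists and only falls back to etc/hadoop otherwise, so pinning the directory explicitly should remove the ambiguity. A minimal sketch (HADOOP_CONF_DIR is the stock Hadoop environment variable; the path is this install's):

export HADOOP_CONF_DIR=/opt/hadoop-2.6.1/conf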
On issuing hdfs namenode -format, in the flurry of output we have this line:
15/10/20 23:18:27 INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted.
which seems to indicate that more configuration is needed, since storage has defaulted to a directory under /tmp (more on this below).
start-dfs.sh brings up HDFS properly.
http://192.168.48.10:50070/dfshealth.html#tab-overview is the place to check it.
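The same can be checked from the shell with the standard HDFS admin command; a quick sketch (the grep pattern matches the report's summary line):

hdfs dfsadmin -report | grep -i 'live datanodes'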
YARN is not coming up, due to the following error:
2015-10-21 11:45:06,053 INFO [main] resourcemanager.RMNMInfo (RMNMInfo.java:<init>(63)) - Registered RMNMInfo MBean
2015-10-21 11:45:06,057 INFO [main] util.HostsFileReader (HostsFileReader.java:refresh(129)) - Refreshing hosts (include/exclude) list
2015-10-21 11:45:06,063 INFO [main] conf.Configuration (Configuration.java:getConfResourceAsInputStream(2311)) - capacity-scheduler.xml not found
2015-10-21 11:45:06,316 INFO [main] service.AbstractService (AbstractService.java:noteFailure(272)) - Service org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler failed in state INITED; cause: java.lang.IllegalStateException: Queue configuration missing child queue names for root
java.lang.IllegalStateException: Queue configuration missing child queue names for root
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:560)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:465)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:297)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:326)
A possible solution is to have the capacity-scheduler.xml file in the conf directory. For now I am moving it manually from etc/hadoop to conf.
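Concretely, the manual move amounts to something like this on the master, where the ResourceManager (and hence the CapacityScheduler) runs; paths are from this setup:

cp /opt/hadoop-2.6.1/etc/hadoop/capacity-scheduler.xml /opt/hadoop-2.6.1/conf/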
Cool, that did the trick. A quick way to check that all daemons are up and running:
hadoop@master:/opt/hadoop-2.6.1/sbin$ jps | grep -v Jps
5231 ResourceManager
4479 SecondaryNameNode
4291 NameNode
hadoop@master:/opt/hadoop-2.6.1/sbin$ ssh hadoop1.local jps | grep -v Jps
2189 DataNode
2709 NodeManager
hadoop@master:/opt/hadoop-2.6.1/sbin$ ssh hadoop2.local jps | grep -v Jps
2581 NodeManager
2194 DataNode
hadoop@master:/opt/hadoop-2.6.1/sbin$ ssh hadoop3.local jps | grep -v Jps
2589 NodeManager
2181 DataNode
hadoop@master:/opt/hadoop-2.6.1/sbin$
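The same check can be scripted across the workers; a small sketch assuming the passwordless ssh this cluster already uses:

for h in hadoop1.local hadoop2.local hadoop3.local; do
  echo "== $h =="
  ssh $h jps | grep -v Jps
done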
Alternatively, browse to the links below:
HDFS - http://192.168.48.10:50070/dfshealth.html#tab-overview
YARN - http://192.168.48.10:8088/cluster/cluster
Now, to run a MapReduce job, I gave this command:
yarn jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar pi 32 10000
but encountered the following error:
Container launch failed for container_1445454060656_0002_01_000004 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
The post below may have the answer: https://dataheads.wordpress.com/2013/11/21/hadoop-2-setup-on-64-bit-ubuntu-12-04-part-1/
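For the record, the usual fix for this error is to declare the shuffle auxiliary service in yarn-site.xml on every NodeManager. A sketch with the stock Hadoop 2.x property names (these go inside the <configuration> element; the handler class entry is the standard one):

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>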
Hit this really silly issue: http://unix.stackexchange.com/questions/237823/why-is-my-shell-variable-concatenation-not-working-on-ubuntu-bash.
I need to run dos2unix each time before I provision. Also, find out whether Brackets is messing up the line endings. If not, then it is definitely git, so fix its autocrlf setting.
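One way to fix this at the source instead of re-running dos2unix every time (standard git settings; the .gitattributes rule pins shell scripts to LF regardless of anyone's autocrlf):

git config core.autocrlf input
echo '*.sh text eol=lf' >> .gitattributes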
By changing the HDFS storage location to be under /home/hadoop instead of /tmp, the issue of corrupt HDFS after restarting the cluster seems to be resolved.
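For reference, a sketch of the relevant hdfs-site.xml properties; the property names are the standard Hadoop 2.x ones, while the exact subdirectories under /home/hadoop are my assumption:

<!-- inside <configuration> in hdfs-site.xml -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>/home/hadoop/dfs/name</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>/home/hadoop/dfs/data</value>
</property>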
Once the PATH is fixed, we should be good to close this issue.
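The PATH fix itself should just be a couple of exports in the hadoop user's profile; a sketch using this cluster's install directory:

export HADOOP_PREFIX=/opt/hadoop-2.6.1
export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin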
Install Hadoop on the cluster and test it out with a complete MapReduce job.
We will install version 2.6.1, as it should be a more stable point release compared to 2.7.1, which is the first stable release in the 2.7.x line.