milandesai closed this issue 9 years ago
Posting the bugs I have found so far:
I will create a pull request momentarily, but I am still testing, so there may be other issues.
Ls works, but mkdirs and other commands do not. Investigating.
The problem was that after moving the HBase configurations to hbase-site.xml, GiraffaConfiguration needed to be updated to include HBase resources. That also necessitated moving GiraffaConfiguration to the hbase package. Now I am getting a NullPointerException on create because the leaseManager is null (false alarm: I forgot to update the giraffa jar in hbase).
Client commands and MapReduce over Yarn work for me with the latest commit. This issue is ready for review. Please note that configuration.xsl, capacity-scheduler.xml, container-executor.cfg, yarn-env.sh, and mapred-env.sh were all copied exactly as is from Hadoop 2.5.1, so please excuse the unnecessary changes and whitespace errors in these; I want to keep these files as exact copies so users can easily compare if they modify any values.
@zero45, @shvachko, @octo47 - if you could all check out the branch and make sure the standalone works for you, that would be great. Here are the instructions (after this is pushed, I'll also update the Wiki):
- Set fs.defaultFS to hdfs://localhost:9000 [etc/hadoop/core-site.xml]
- Set hadoop.tmp.dir to a permanent location [etc/hadoop/core-site.xml]
- Set hbase.rootdir to hdfs://localhost:9000/hbase [conf/hbase-site.xml]
- Set hbase.tmp.dir to a permanent location [conf/hbase-site.xml]
- Set hbase.cluster.distributed to true [conf/hbase-site.xml]
- Edit the JAVA_HOME line in conf/hbase-env.sh and set the variable to /usr/libexec/java_home. This must be done even if you already have a global environment for JAVA_HOME configured.
- bin/giraffa namenode -format
- bin/start-giraffa.sh. You should have the following processes running: NameNode, DataNode, HQuorumPeer, HMaster, HRegionServer.
- bin/giraffa format
- bin/giraffa fs …
- bin/yarn-giraffa-daemon.sh start resourcemanager
- bin/yarn-giraffa-daemon.sh start nodemanager. You should have the following processes running: NameNode, DataNode, HQuorumPeer, HMaster, HRegionServer, ResourceManager, NodeManager.
- bin/yarn-daemon jar …
bin/yarn-giraffa jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar teragen 10000000 /teragen
bin/yarn-giraffa jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar terasort /teragen /terasort
bin/yarn-giraffa jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar teravalidate /terasort /teravalidate
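The process lists in the instructions above can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the JDK's jps tool is on the PATH and prints one "pid Name" pair per line. The function name check_daemons is hypothetical and is not part of Giraffa.

```shell
#!/bin/sh
# check_daemons (hypothetical helper, not part of Giraffa):
# reads jps-style output ("<pid> <Name>") on stdin and reports every
# expected daemon name, given as arguments, that is not in the list.
check_daemons() {
  running=$(awk '{print $2}')   # keep only the process names from stdin
  missing=""
  for name in "$@"; do
    # -x matches whole lines and -F disables regex interpretation, so
    # "NameNode" does not match "SecondaryNameNode".
    if ! printf '%s\n' "$running" | grep -qxF "$name"; then
      missing="$missing $name"
    fi
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "all expected daemons are running"
}

# Usage after bin/start-giraffa.sh:
#   jps | check_daemons NameNode DataNode HQuorumPeer HMaster HRegionServer
# Usage after starting the Yarn daemons as well:
#   jps | check_daemons NameNode DataNode HQuorumPeer HMaster HRegionServer \
#       ResourceManager NodeManager
```

The function returns a non-zero status when something is missing, so it can gate the teragen, terasort and teravalidate runs in a script.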
@shvachko, the pull request here is hard to decipher because it fixes multiple different issues that should probably have their own Jiras, and there are also some style issues that were fixed. So I'm proposing that we keep this Jira an investigative task and open four new issues:
Those four will make the standalone build work. Then later we can commit some "cleanup" issues:
Milan, this is great progress! I like the patch and especially that it makes terasort pass:
- allowed.system.users? It is empty anyways.
- yarn-env.sh. We should remove those unless it is an exact copy of something from yarn.
Thanks Konstantin. The files capacity-scheduler.xml, configuration.xsl, container-executor.cfg, mapred-env.sh, and yarn-env.sh were copied exactly as is from the Hadoop 2.5.1 configuration directory. That way we are consistent, and any changes the user makes to these files can easily be compared with their counterparts in Hadoop. There are some other whitespace or unnecessary changes that I can remove though.
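The comparison with the Hadoop counterparts can be scripted. This is a minimal sketch, not something shipped with Giraffa: the function name compare_conf is hypothetical, and the Giraffa configuration directory passed as the first argument (conf in the usage line) is an assumption about the local layout.

```shell
#!/bin/sh
# compare_conf (hypothetical helper): byte-compare each named file in a
# Giraffa configuration directory with the file of the same name in a
# Hadoop configuration directory.
#   $1 = Giraffa conf dir (assumed layout), $2 = Hadoop conf dir,
#   remaining arguments = file names to compare.
compare_conf() {
  giraffa_conf=$1
  hadoop_conf=$2
  shift 2
  status=0
  for f in "$@"; do
    # cmp -s is silent and exits 0 only when both files exist and are identical.
    if cmp -s "$giraffa_conf/$f" "$hadoop_conf/$f"; then
      echo "same: $f"
    else
      echo "differs or missing: $f"
      status=1
    fi
  done
  return $status
}

# Usage (the directory "conf" is an assumption; adjust to your checkout):
#   compare_conf conf "$HADOOP_HOME/etc/hadoop" \
#       capacity-scheduler.xml configuration.xsl container-executor.cfg \
#       mapred-env.sh yarn-env.sh
```

For a file reported as different, running diff on the two copies shows the values the user changed.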
Non-essential changes have been removed, except for the files listed above which remain exact copies.
WooHoo
Committed to trunk as 868b7230fb74361a6c00e2c3265b06bf98d80b9f
We need to test both a standalone install and a distributed install; this issue tracks the former. I have hit some configuration and HBase related problems while attempting to format and run Giraffa; will post them and their fixes soon.