Closed kering-wang closed 11 years ago
Hello kering,
To use Splout with CDH4 you need the MR2 distribution, as there is an inherent incompatibility between the MR1 and MR2 distributions. Did you try the MR2 distribution? In Maven Central there are two files, mr1.tar.gz and mr2.tar.gz: http://search.maven.org/#search%7Cga%7C1%7Ca%3A%22splout-distribution%22
Thanks for answering my question. I confirm that I downloaded mr2.tar.gz and used it. I still have no way to solve the problem; please help me, thanks.
I suspect the exception is related to $HADOOP_HOME. In my Hadoop environment there is no hadoop-common, only hadoop-hdfs and hadoop-0.20-mapreduce, but if I use hadoop-0.20-mapreduce the DNode cannot start, so I am puzzled. Please help; I am waiting for your comments.
My Hadoop version is Hadoop 2.0.0-cdh4.3.0. In my Hadoop path there are these folders:
hadoop/ hadoop-0.20-mapreduce/ hadoop-hdfs/ hadoop-mapreduce/ hadoop-yarn/
My Hive version is hive-0.10.0-cdh4.3.0 and my Splout version is splout-hadoop-0.2.5 with MR2, i.e. splout-distribution-0.2.5-mr2.tar.gz. So I either use:
export HADOOP_COMMON_HOME=/usr/lib/hadoop
export HADOOP_HDFS_HOME=/usr/lib/hadoop
export HADOOP_MAPRED_HOME=/usr/lib/hadoop
or
export HADOOP_COMMON_HOME=/usr/lib/hadoop
export HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce
Using the first way I can start all QNodes and DNodes, but when integrating with Hive there is an exception:

Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected
    at org.apache.hcatalog.mapreduce.HCatBaseInputFormat.getSplits(HCatBaseInputFormat.java:101)
    at com.datasalt.pangool.tuplemr.mapred.lib.input.HCatTupleInputFormat.getSplits(HCatTupleInputFormat.java:167)
    at com.splout.db.hadoop.SchemaSampler.sample(SchemaSampler.java:60)
    at com.splout.db.hadoop.TableBuilder.addCustomInputFormatFile(TableBuilder.java:237)
    at com.splout.db.hadoop.TableBuilder.addHiveTable(TableBuilder.java:199)
    at com.splout.db.hadoop.TableBuilder.addHiveTable(TableBuilder.java:185)
    at com.splout.db.hadoop.SimpleGeneratorCMD.run(SimpleGeneratorCMD.java:237)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at com.splout.db.hadoop.SimpleGeneratorCMD.main(SimpleGeneratorCMD.java:261)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.datasalt.pangool.PangoolDriver$ProgramDescription.invoke(PangoolDriver.java:55)
    at com.datasalt.pangool.PangoolDriver.driver(PangoolDriver.java:128)
    at com.splout.db.hadoop.Driver.main(Driver.java:49)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
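For context on this error: in MRv1, org.apache.hadoop.mapreduce.JobContext is a class, while in MRv2 it is an interface, which is exactly what the message says. A hedged sketch of how to tell which variant a given Hadoop jar carries: run javap against the jar on your classpath and inspect the first line of its declaration. The javap output below is simulated, not taken from a real run; in practice you would pipe `javap -classpath <your-mapreduce-jar> org.apache.hadoop.mapreduce.JobContext` into the same check.

```shell
# classify_jobcontext: decide MRv1 vs MRv2 from a javap declaration line.
# Illustrative helper (name made up for this sketch); the argument is a
# simulated javap output line, not output from a real Hadoop install.
classify_jobcontext() {
  case "$1" in
    *interface*) echo "MRv2-style (interface)" ;;
    *class*)     echo "MRv1-style (class)" ;;
    *)           echo "unknown" ;;
  esac
}

classify_jobcontext "public interface org.apache.hadoop.mapreduce.JobContext"
# -> MRv2-style (interface)
```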
If I use the second way, I cannot start the DNode. Can anyone help? What is wrong? Thanks.
Hi kering,
I'm going to have a look at your problem. Which version of MapReduce are you using with CDH4, MRv1 or MRv2?
Thanks, Iván
2013/9/5 kering-wang notifications@github.com
[image: screenshot from 2013-09-05 16 49 22]https://f.cloud.github.com/assets/5236356/1086855/a699b034-1608-11e3-84cc-d6a462194b20.PNG
Iván de Prado CEO & Co-founder www.datasalt.com
Hi Kering,
It would be much better if questions like this were sent to the users mailing list (https://groups.google.com/forum/#!forum/sploutdb-users). That way we can share the answer with everyone.
I have replicated your environment and I have a solution for you. I will describe the steps. There are two problems:
1) The classpath is not defined properly
2) The HCatalog library is not compiled for MRv2
We will have to work on a patch for that, but meanwhile the problems can be solved with the following steps:
1) Change the script splout-service.sh
Attached you have a patched version of the script. Replace the old one with it, and use the following environment variables:
export HADOOP_COMMON_HOME=/usr/lib/hadoop
export HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs
export HADOOP_MAPRED_HOME=/usr/lib/hadoop-0.20-mapreduce
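A small hedged sketch of a pre-flight check for this step: verify the three home directories actually exist before starting the service, so a typo fails fast instead of surfacing as a DNode startup error. The function name is made up for this sketch; the paths in the commented example call are the ones from this thread.

```shell
# check_homes: print the first missing directory and fail, or confirm
# all given Hadoop homes exist. Purely illustrative, not part of Splout.
check_homes() {
  for d in "$@"; do
    if [ ! -d "$d" ]; then
      echo "missing: $d"
      return 1
    fi
  done
  echo "hadoop homes ok"
}

# Example (paths from this thread):
# check_homes /usr/lib/hadoop /usr/lib/hadoop-hdfs /usr/lib/hadoop-0.20-mapreduce
check_homes /
# -> hadoop homes ok
```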
2) Include Hive in the classpath.
In order for HCatalog to be able to access the Hive data, I had to add Hive to the general Hadoop classpath. I did that by adding the following line at the start of the file /usr/lib/hadoop/libexec/hadoop-config.sh:
HADOOP_CLASSPATH=/etc/hive/conf:/usr/lib/hive/lib
Now, if you execute the command "hadoop classpath" you should see something like:
/etc/hadoop/conf:/usr/lib/hadoop/lib/*:/usr/lib/hadoop/.//*:/etc/hive/conf:/usr/lib/hive/lib:/usr/lib/hadoop-hdfs/./:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/.//*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/.//*:/usr/lib/hadoop-0.20-mapreduce/./:/usr/lib/hadoop-0.20-mapreduce/lib/*:/usr/lib/hadoop-0.20-mapreduce/.//*
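A quick hedged way to confirm the Hive entries made it in: split the colon-separated output of "hadoop classpath" into lines and count the entries that mention hive. A shortened sample string stands in for the real command output here; in practice you would pass `$(hadoop classpath)` instead.

```shell
# count_hive_entries: how many classpath entries mention "hive".
# Illustrative helper; the argument below is a shortened stand-in for
# real `hadoop classpath` output, not a real run.
count_hive_entries() {
  echo "$1" | tr ':' '\n' | grep -c hive
}

count_hive_entries "/etc/hadoop/conf:/etc/hive/conf:/usr/lib/hive/lib:/usr/lib/hadoop-hdfs/./"
# -> 2
```

If this prints 0, the HADOOP_CLASSPATH line in hadoop-config.sh did not take effect.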
3) Replace the HCatalog jar inside splout-hadoop-0.2.5-hadoop-mr2.jar.
The jar file splout-hadoop-0.2.5-hadoop-mr2.jar internally contains an HCatalog version that was not compiled against the new MRv2 API. The way to solve it is to replace it with the version from the Cloudera distribution. You can find it at /usr/lib/hcatalog/share/hcatalog/hcatalog-core-0.5.0-cdh4.3.1.jar. If you don't find it, you will have to install the "hcatalog" package first.
Follow these steps from the Splout home directory to replace the file:
mkdir 1
cd 1
unzip ../splout-hadoop-0.2.5-hadoop-mr2.jar
rm lib/hcatalog-core-0.5.0-incubating.jar
cp /usr/lib/hcatalog/share/hcatalog/hcatalog-core-0.5.0-cdh4.3.1.jar lib
zip -r splout-tunned.jar *
mv splout-tunned.jar ..
cd ..
And now you can use splout-tunned.jar instead of splout-hadoop-0.2.5-hadoop-mr2.jar for running your job.
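After repacking, it may be worth confirming the swap took: list the new jar's contents and check that the Cloudera HCatalog jar is present and the incubating one is gone. The listing string below is simulated; in practice you would feed the output of `unzip -l splout-tunned.jar` into the same checks.

```shell
# verify_swap: succeed only if the listing contains the cdh4 HCatalog jar
# and no longer contains the incubating one. Illustrative helper; the
# listing text is a stand-in for real `unzip -l` output.
verify_swap() {
  echo "$1" | grep -q 'hcatalog-core-0.5.0-cdh4.3.1.jar' \
    && ! echo "$1" | grep -q 'hcatalog-core-0.5.0-incubating.jar' \
    && echo "swap ok"
}

verify_swap "lib/hcatalog-core-0.5.0-cdh4.3.1.jar
lib/some-other.jar"
# -> swap ok
```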
Try it and tell us if it worked.
Regards, Iván
Sorry, I attached the wrong script. Use the attached instead.
Iván
Thanks for Iván's instructions; by following them I solved the problem I had met.
My Hadoop version is Hadoop 2.0.0-cdh4.3.0 and my Splout version is splout-distribution-0.2.5. When I integrate Hive with Splout, an exception occurs: "Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected". Can anyone help me?