Can you check Yarn's log to see why the Application Master can't be launched, which seems to be the root cause of your exception?
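(For reference, if log aggregation is enabled on the cluster, the aggregated container logs can be fetched with the yarn CLI, e.g.: yarn logs -applicationId application_1439169262151_0037)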
Hi @yzhou2001, the error message I provided is already from the yarn logs, and the first error message is:
Application application_1439169262151_0037 failed 2 times due to AM Container for appattempt_1439169262151_0037_000002 exited with exitCode: 127 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
    at org.apache.hadoop.util.Shell.run(Shell.java:418)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
The rest of the message is just like what I posted above.
Allen,
You say hadoop-yarn v2.3.0 is included in your Spark v1.4.0 shaded jar.
What version of hadoop-yarn is included in the spark-sql-on-hbase-1.0.0.jar you removed from $SPARK_CLASSPATH? Is it also v2.3.0?
And can you post your spark program (driver) to help us reproduce the problem?
Thanks, Stan
I used this command to package the jar, but I'm not sure whether it's correct:
mvn clean package -Phbase,hadoop-2.3 -DskipTests
So how do I package the jar against a specific Hadoop version?
And my driver program is below:
val tableName = "XXX"; val conf = new SparkConf().setAppName("HBase_Query_with_RDD"); val sc = new SparkContext(conf); val hbaseConf = HBaseConfiguration.create(); hbaseConf.set("hbase.zookeeper.quorum","server-a1") hbaseConf.set("hbase.zookeeper.property.clientPort","2181") hbaseConf.set("mapreduce.framework.name", "yarn") hbaseConf.set("yarn.resourcemanager.address", "server-a1:8032") hbaseConf.set("yarn.resourcemanager.scheduler.address", "server-a1:8030") hbaseConf.set("yarn.resourcemanager.resource-tracker.address", "server-a1:8031"); hbaseConf.set("yarn.resourcemanager.admin.address", "server-a1:8033") hbaseConf.set(TableInputFormat.INPUT_TABLE, tableName) var table = new HTable(hbaseConf, tableName) val hbaseRDD = sc.newAPIHadoopRDD(hbaseConf, classOf[TableInputFormat], classOf[ImmutableBytesWritable], classOf[Result]) println("contain result: " + hbaseRDD.count()) table.close() sc.stop()
And I'm including the yarn log below for more detail; I hope it's helpful.
2015-08-13 10:25:28,434 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl: appattempt_1439169262151_0057_000002 State change from FINAL_SAVING to FAILED
2015-08-13 10:25:28,434 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Updating application application_1439169262151_0057 with final state: FAILED
2015-08-13 10:25:28,434 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1439169262151_0057 State change from ACCEPTED to FINAL_SAVING
2015-08-13 10:25:28,434 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler: Application appattempt_1439169262151_0057_000002 is done. finalState=FAILED
2015-08-13 10:25:28,435 INFO org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: Storing info for app: application_1439169262151_0057
2015-08-13 10:25:28,435 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo: Application application_1439169262151_0057 requests cleared
2015-08-13 10:25:28,435 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.LeafQueue: Application removed - appId: application_1439169262151_0057 user: user1 queue: default #user-pending-applications: 0 #user-active-applications: 0 #queue-pending-applications: 0 #queue-active-applications: 0
2015-08-13 10:25:28,435 INFO org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Application removed - appId: application_1439169262151_0057 user: user1 leaf-queue of parent: root #applications: 0
2015-08-13 10:25:28,435 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application application_1439169262151_0057 failed 2 times due to AM Container for appattempt_1439169262151_0057_000002 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
    at org.apache.hadoop.util.Shell.run(Shell.java:418)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 1. Failing this attempt. Failing the application.
2015-08-13 10:25:28,435 INFO org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: application_1439169262151_0057 State change from FINAL_SAVING to FAILED
2015-08-13 10:25:28,435 WARN org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=user1 OPERATION=Application Finished - Failed TARGET=RMAppManager RESULT=FAILURE DESCRIPTION=App failed with state: FAILED PERMISSIONS=Application application_1439169262151_0057 failed 2 times due to AM Container for appattempt_1439169262151_0057_000002 exited with exitCode: 1 due to: Exception from container-launch:
org.apache.hadoop.util.Shell$ExitCodeException:
org.apache.hadoop.util.Shell$ExitCodeException:
    at org.apache.hadoop.util.Shell.runCommand(Shell.java:505)
    at org.apache.hadoop.util.Shell.run(Shell.java:418)
    at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:650)
    at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:195)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:283)
    at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:79)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Hi Allen,
I just wanted to make sure you did not have hadoop-yarn version conflicts. I don't think you do, and you packaged the jar correctly.
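If you ever need to pin an exact Hadoop version, Spark's own build combines the hadoop profile with a hadoop.version property; assuming this project follows the same convention (check its pom.xml for the exact property name), the command would be something like:

mvn clean package -Phbase,hadoop-2.3 -Dhadoop.version=2.3.0 -DskipTests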
You do not need to run your Yarn container to create a NewHadoopRDD with Spark-SQL-on-HBase.
Here is a simple example that works in my environment.
1,xiaoming,16,id_1,teacherW
2,xiaoming,16,id_2,teacherW
3,xiaoming,16,id_3,teacherW
4,xiaoming,16,id_4,teacherW
5,xiaoming,16,id_5,teacherW
6,xiaoming,16,id_6,teacherW
7,xiaoming,16,id_7,teacherW
8,xiaoming,16,id_8,teacherW
9,xiaoming,16,id_9,teacherW
10,xiaoming,16,id_10,teacherW
11,xiaoming,16,id_11,teacherW
12,xiaoming,16,id_12,teacherW
13,xiaoming,16,id_13,teacherW
14,xiaoming,16,id_14,teacherW
15,xiaoming,16,id_15,teacherW
16,xiaoming,16,id_16,teacherW
17,xiaoming,16,id_17,teacherW
18,xiaoming,16,id_18,teacherW
19,xiaoming,16,id_19,teacherW
1001,lihua,20,A1000,
1002,lihua,20,A1000,
Schema: column family 'cf'; rowkey: string; columns (datatype): a: string, b: string, c: string, d: string (col 'd' is nullable).
package org.apache.spark.sql.hbase

import org.apache.hadoop.hbase.client.{HTable, Result}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.hadoop.hbase.util.Bytes
import org.apache.hadoop.hbase.{Cell, CellUtil}
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object NewHadoopRDDExample {
  def main(args: Array[String]) {
    println("NewHadoopRDDExample")
    val sparkHome = System.getenv("SPARK_HOME")
    val tableName = "PEOPLE"
    val sparkConf = new SparkConf(true)
      .setMaster("local[2]")
      .setAppName("NewHadoopRDDExample")
      .set("spark.executor.memory", "1g")
    val sc = new SparkContext(sparkConf)
    val hbaseContext = new org.apache.spark.sql.hbase.HBaseSQLContext(sc)
    val hbaseConf = hbaseContext.sparkContext.hadoopConfiguration
    hbaseConf.set("fs.defaultFS", "hdfs://YOUR-NAMENODE:54310")
    hbaseConf.set("hbase.zookeeper.quorum", "YOUR-ZK-CNXN-STRING")
    hbaseConf.set(TableInputFormat.INPUT_TABLE, tableName)
    val table = new HTable(hbaseConf, tableName)
    val hbaseRDD = sc.newAPIHadoopRDD(
      hbaseConf,
      classOf[TableInputFormat],
      classOf[ImmutableBytesWritable],
      classOf[Result])
    println("HBase RDD Count: " + hbaseRDD.count)
    println("\nHBase KeyValues:")
    hbaseRDD.foreach(println)
    // Have to map ImmutableBytesWritables to serializable objects before running rdd.collect
    val cellsRDD: RDD[(String, Array[String])] = hbaseRDD.map(x => x._2).map(result => {
      val rowkey = result.getRow
      val col1: Cell = result.getColumnLatestCell(Bytes.toBytes("cf"), Bytes.toBytes("a"))
      val col2: Cell = result.getColumnLatestCell(Bytes.toBytes("cf"), Bytes.toBytes("b"))
      val col3: Cell = result.getColumnLatestCell(Bytes.toBytes("cf"), Bytes.toBytes("c"))
      val col4: Cell = result.getColumnLatestCell(Bytes.toBytes("cf"), Bytes.toBytes("d"))
      val arr = new Array[String](4)
      arr(0) = Bytes.toStringBinary(CellUtil.cloneValue(col1))
      arr(1) = Bytes.toStringBinary(CellUtil.cloneValue(col2))
      arr(2) = Bytes.toStringBinary(CellUtil.cloneValue(col3))
      // col 'd' is nullable
      arr(3) = if (col4 != null) Bytes.toStringBinary(CellUtil.cloneValue(col4)) else null
      (Bytes.toStringBinary(rowkey), arr)
    })
    println("\nDeserialized Rows:")
    val tuples = cellsRDD.collect
    for (i <- 0 until tuples.length) {
      print("Row: " + tuples(i)._1)
      print(" => " + tuples(i)._2.mkString(" | "))
      println()
    }
    table.close()
  }
}
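As an aside, to populate the 'PEOPLE' table with the sample rows above, a minimal sketch using the plain HBase client API could look like the following (it assumes the table and column family 'cf' already exist, 'YOUR-ZK-CNXN-STRING' is a placeholder, and only two of the sample rows are shown):

import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Put}
import org.apache.hadoop.hbase.util.Bytes

object LoadPeopleSample {
  def main(args: Array[String]) {
    val conf = HBaseConfiguration.create()
    conf.set("hbase.zookeeper.quorum", "YOUR-ZK-CNXN-STRING") // placeholder
    val table = new HTable(conf, "PEOPLE")
    // Each CSV row maps to rowkey, cf:a, cf:b, cf:c, cf:d (col 'd' may be absent).
    val rows = Seq(
      ("1", "xiaoming", "16", "id_1", "teacherW"),
      ("1001", "lihua", "20", "A1000", null))
    for ((rk, a, b, c, d) <- rows) {
      val put = new Put(Bytes.toBytes(rk))
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("a"), Bytes.toBytes(a))
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("b"), Bytes.toBytes(b))
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("c"), Bytes.toBytes(c))
      if (d != null) put.add(Bytes.toBytes("cf"), Bytes.toBytes("d"), Bytes.toBytes(d))
      table.put(put)
    }
    table.close()
  }
}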
Sorry, I don't understand what you mean about running my Yarn container to create a NewHadoopRDD with Spark-SQL-on-HBase.
Anyway, even if I just write a Spark application that reads an HDFS file and counts the rows, the error message is the same.
Your driver's configuration assumes a running Yarn container:

hbaseConf.set("mapreduce.framework.name", "yarn")
hbaseConf.set("yarn.resourcemanager.address", "server-a1:8032")
hbaseConf.set("yarn.resourcemanager.scheduler.address", "server-a1:8030")
hbaseConf.set("yarn.resourcemanager.resource-tracker.address", "server-a1:8031")
hbaseConf.set("yarn.resourcemanager.admin.address", "server-a1:8033")
Can you run the driver example I posted, using Spark-SQL-on-HBase?
You will see no references to Yarn in the working example's configuration, and you should have no Yarn container start-up or connection errors because Spark-SQL-on-HBase will not try to submit a job to Yarn.
Hi @sparksburnitt, yeah, you are right: if I use your sample with Spark-SQL-on-HBase, it works. Thanks a lot. But just like I said before, the error caused by failing to start the application master still exists, even for a very simple Spark application running on yarn. ;(
Hello Allen,
I was able to run your Spark (only) code with spark-sql-on-hbase-1.0.0.jar in the SPARK_CLASSPATH without a problem.
You do not need those yarn property settings in your HBase config object.
Why don't you try it again after replacing the yarn properties below with your 'fs.defaultFS' value?

/*
hbaseConf.set("mapreduce.framework.name", "yarn")
hbaseConf.set("yarn.resourcemanager.address", "server-a1:8032")
hbaseConf.set("yarn.resourcemanager.scheduler.address", "server-a1:8030")
hbaseConf.set("yarn.resourcemanager.resource-tracker.address", "server-a1:8031")
hbaseConf.set("yarn.resourcemanager.admin.address", "server-a1:8033")
*/

// Spark needs to know where your hdfs root is:
hbaseConf.set("fs.defaultFS", "hdfs://YOUR-NAMENODE:PORT")
Let me know if you still see yarn errors.
-Stan
Hi @sparksburnitt, I've already tried that, but the result is the same. I forgot to tell you, very sorry.
Hi Allen,
I just ran the example with your yarn settings (adjusted for my environment) and still could not reproduce the errors.
How are you submitting the job?
(I have been running these examples from a scala object in an IDE.)
Hi Allen,
I ran the scala script below in the spark-shell (v1.4.0), and could not reproduce the errors.
Does it work on your cluster?
...
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Result}
import org.apache.hadoop.hbase.io.ImmutableBytesWritable
import org.apache.hadoop.hbase.mapreduce.TableInputFormat
import org.apache.spark.{SparkConf, SparkContext}

val tableName = "XXXX_TBL"

val hbaseConf = HBaseConfiguration.create
hbaseConf.set("fs.defaultFS", "hdfs://SERVER:54310")
hbaseConf.set("hbase.zookeeper.quorum", "SERVER:2181")

hbaseConf.set("mapreduce.framework.name", "yarn")
hbaseConf.set("yarn.resourcemanager.address", "SERVER:8032")
hbaseConf.set("yarn.resourcemanager.scheduler.address", "SERVER:8030")
hbaseConf.set("yarn.resourcemanager.resource-tracker.address", "SERVER:8025")
hbaseConf.set("yarn.resourcemanager.admin.address", "SERVER:8033")

hbaseConf.set(TableInputFormat.INPUT_TABLE, tableName)

val table = new HTable(hbaseConf, tableName)

val hbaseRDD = sc.newAPIHadoopRDD(
  hbaseConf,
  classOf[TableInputFormat],
  classOf[ImmutableBytesWritable],
  classOf[Result])

println("HBase RDD Count: " + hbaseRDD.count)
table.close
Hi @sparksburnitt, I always run my application via spark-submit with master yarn-client. But inspired by you, I did the following test.

I wrote a very simple Scala program and packaged it as a jar:

...
val input = sc.parallelize(Array(1, 2, 3, 4, 5))
println(input.count())
sc.stop()

Then I ran it with different masters:

yarn-client: the result is failure, and the problem is the same.
yarn-cluster: the result is success.
local: the result is success.

With local, the job does not use yarn at all, so the error about starting the application master never happens. Anyway, I also ran the code that you provided above, and the result is ok!! But I have no idea why running the application only with yarn-client causes the error. For now I don't need to solve this problem, because running on yarn-cluster works, so I can keep working. However, thanks for your help, and very sorry to have taken so much of your time.
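For reference, a minimal self-contained version of that test program could look like this (the object and app names are illustrative; the master is supplied by spark-submit):

import org.apache.spark.{SparkConf, SparkContext}

object SimpleCountTest {
  def main(args: Array[String]) {
    // Master (yarn-client, yarn-cluster, local, ...) is passed via spark-submit --master.
    val conf = new SparkConf().setAppName("SimpleCountTest")
    val sc = new SparkContext(conf)
    val input = sc.parallelize(Array(1, 2, 3, 4, 5))
    println(input.count())
    sc.stop()
  }
}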
Hello Allen,
I was able to run your job in yarn-client mode via spark-submit.
In the driver, I changed the spark-conf's master to 'yarn-client', e.g.
val sparkConf = new SparkConf(true)
  .setMaster("yarn-client")
  .setAppName("SparkOnYarnExample")
  .set("spark.executor.memory", "4g")
and built a 'fang.jar' containing the spark driver.
Here is the spark-submit command:
export SPARK_JAR=YR_PATH/spark-assembly-1.4.0-hadoop2.4.0.jar
export SPARK_SQL_HBASE_JAR=YR_PATH/spark-sql-on-hbase-1.0.0.jar

bin/spark-submit --class org.apache.spark.sql.hbase.SparkOnYarnExample \
  --master yarn-client \
  --jars $SPARK_SQL_HBASE_JAR \
  --num-executors 6 \
  --driver-memory 4g \
  --executor-memory 4g \
  --executor-cores 6 \
  /tmp/fang.jar
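For the yarn-cluster mode that worked in your tests, the submit command should only need the master changed (with setMaster dropped from the driver), e.g.:

bin/spark-submit --class org.apache.spark.sql.hbase.SparkOnYarnExample \
  --master yarn-cluster \
  --jars $SPARK_SQL_HBASE_JAR \
  /tmp/fang.jar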
Hi, this is cool stuff for Spark SQL with HBase; however, I have an issue, as follows:

I've installed your product following the document, and it all works well so far. But I wrote a very simple Spark application that queries an HBase table using newAPIHadoopRDD and got the errors quoted above. If I remove the spark-sql-on-hbase-1.0.0.jar from SPARK_CLASSPATH, the job passes.

My Spark version is 1.4.0 and Hadoop is 2.3.