cjuexuan / mynote


How to get the applicationId for Spark on YARN #35

Open cjuexuan opened 7 years ago

cjuexuan commented 7 years ago

Topic

As the title says: I've been working on a Spark monitoring system lately, and a friend asked me how to get the YARN applicationId. Since the hook has to be installed before Spark starts, reading spark.app.id from the Spark conf definitely won't work at that point. Here is my implementation.

package org.apache.spark.util

import org.apache.spark.deploy.yarn.ApplicationMaster

/**
  * @author cjuexuan at 02/05/2017 18:20.
  *         email : cjuexuan@gmail.com 
  */
object SparkArgsUtils {

  // Parse the JVM launch arguments (e.g. "--executor-id 1 --app-id ...") out of
  // sun.java.command, since SparkConf is not available before Spark starts.
  private val args = System.getProperty("sun.java.command")
    .split(" ").tail.grouped(2)
    .collect { case Array(k, v) => (k, v) }
    .toMap

  // SparkEnv is created by SparkEnv.createDriverEnv on the driver and by
  // SparkEnv.createExecutorEnv on executors. When Spark's metrics system is
  // initialized, the executor id may not be set yet, so read it from the args.
  def getExecutorId: String = {
    args.getOrElse("--executor-id", "driver")
  }

  def getApplicationId: String = {
    args.getOrElse("--app-id", ApplicationMaster.getAttemptId().getApplicationId.toString)
  }

  def getSparkAppDomain(appName: String): String = {
    s"$appName-$getApplicationId"
  }
}

There are really two cases: on the executor side, the only approach I've found so far is parsing the JVM launch arguments; on the driver side, you can go through ApplicationMaster.getAttemptId.
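As a minimal usage sketch (the MonitorHook name and the Graphite-style prefix are purely illustrative, not part of the utility above), this is how the helper could name per-application metrics before a SparkContext exists:

import org.apache.spark.util.SparkArgsUtils

object MonitorHook {

  // e.g. "myJob-application_<id>.driver" or "myJob-application_<id>.1"
  def metricsPrefix(appName: String): String =
    s"${SparkArgsUtils.getSparkAppDomain(appName)}.${SparkArgsUtils.getExecutorId}"
}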

To be continued...

cjuexuan commented 7 years ago

One problem with this approach: in yarn-client mode this logic throws a NullPointerException.
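One possible guard, as a sketch only (the SafeAppId object name and the "unknown" fallback are illustrative), is to wrap the ApplicationMaster call in a Try so a yarn-client driver falls back instead of crashing:

package org.apache.spark.util

import scala.util.Try

import org.apache.spark.deploy.yarn.ApplicationMaster

object SafeAppId {

  // In yarn-client mode the driver has no ApplicationMaster, so the call
  // throws; return a placeholder instead of propagating the exception.
  def getApplicationId: String =
    Try(ApplicationMaster.getAttemptId().getApplicationId.toString)
      .getOrElse("unknown")
}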

xiashuijun commented 7 years ago

YarnClient can help you solve this.
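Presumably this refers to Hadoop's org.apache.hadoop.yarn.client.api.YarnClient. A rough sketch of what that could look like (matching the running application by its name is only my assumption of what was meant, and assumes the YARN configuration is on the classpath):

import scala.collection.JavaConverters._

import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration

object YarnClientLookup {

  // Ask the ResourceManager for application reports and match by name.
  def findApplicationId(appName: String): Option[String] = {
    val client = YarnClient.createYarnClient()
    client.init(new YarnConfiguration())
    client.start()
    try {
      client.getApplications().asScala
        .find(_.getName == appName)
        .map(_.getApplicationId.toString)
    } finally {
      client.stop()
    }
  }
}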

cjuexuan commented 7 years ago

The code that ApplicationMaster ultimately calls is YarnSparkHadoopUtil.get.getContainerId.getApplicationAttemptId()
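As far as I can tell, getContainerId reads the CONTAINER_ID environment variable that YARN sets in containers it launches, which is why a yarn-client driver (whose JVM is not a YARN container) hits a NullPointerException. A rough sketch of reading it directly with a guard (ContainerEnvAppId is an illustrative name, not Spark or Hadoop API):

import org.apache.hadoop.yarn.api.ApplicationConstants
import org.apache.hadoop.yarn.util.ConverterUtils

object ContainerEnvAppId {

  // CONTAINER_ID is set for AM and executor containers; a yarn-client
  // driver gets None instead of an exception.
  def getApplicationId: Option[String] =
    Option(System.getenv(ApplicationConstants.Environment.CONTAINER_ID.name()))
      .map(ConverterUtils.toContainerId)
      .map(_.getApplicationAttemptId.getApplicationId.toString)
}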