apache-spark-on-k8s / spark

Apache Spark enhanced with native Kubernetes scheduler back-end: NOTE this repository is being ARCHIVED as all new development for the kubernetes scheduler back-end is now on https://github.com/apache/spark/
https://spark.apache.org/
Apache License 2.0

heartbeat timeout causing all executors exit #405

Closed duyanghao closed 7 years ago

duyanghao commented 7 years ago

I ran a Spark application with 100 executors (each with 40G and 4 cores). After running for about 4 hours, all executors exited with code 56 and reported the following logs:

...
Exit as unable to send heartbeats to driver more than 60 times
...

Meanwhile the driver hangs for several hours doing nothing.

The relevant driver-related logs are listed below:

2017-07-10 21:11:40 WARN  Executor:87 - Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Error sending message [message = Heartbeat(31,[Lscala.Tuple2;@641834e8,BlockManagerId(31, x.x.x.x, 57787, None))]
2017-07-10 21:11:49 WARN  Executor:87 - Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Error sending message [message = Heartbeat(52,[Lscala.Tuple2;@2de28748,BlockManagerId(52, x.x.x.x, 20423, None))]
2017-07-10 21:11:42 WARN  Executor:87 - Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Error sending message [message = Heartbeat(20,[Lscala.Tuple2;@431947f,BlockManagerId(20, x.x.x.x, 28993, None))]
2017-07-10 21:11:44 WARN  Executor:87 - Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Error sending message [message = Heartbeat(99,[Lscala.Tuple2;@6f8e1be0,BlockManagerId(99, x.x.x.x, 47398, None))]
2017-07-10 21:11:46 WARN  Executor:87 - Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Error sending message [message = Heartbeat(87,[Lscala.Tuple2;@6ff2c572,BlockManagerId(87, x.x.x.x, 47063, None))]

Additional details:

  1. Task resources (1 driver + 100 executors): the driver has 40G + 4 cores; each executor has 40G + 4 cores.

  2. There are no user-specified Spark configurations for this job; everything uses the defaults.

  3. It does not look like a network problem, since other jobs in the same cluster run normally.

erikerlandson commented 7 years ago

@duyanghao Is it a situation where the heartbeats are sent correctly for some period of time and then stop working, or do they never succeed?

duyanghao commented 7 years ago

@erikerlandson All 100 executors send 60*3=180 heartbeats (each with a 10s timeout), but none of them ever succeed.
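
For context, the executor-side mechanism behind the exit code 56 can be sketched roughly as below. This is a simplified standalone illustration of the behaviour described in the logs (a scheduled heartbeat that gives up after too many consecutive failures), not the actual Spark source; the constants mirror the values mentioned in this thread (10s interval, 60 tolerated failures, exit code 56).

import java.util.concurrent.{Executors, TimeUnit}
import scala.util.control.NonFatal

// Simplified sketch: send a heartbeat every 10 seconds and exit with code 56
// once 60 consecutive attempts have failed, matching the observed behaviour.
object HeartbeatLoopSketch {
  private val heartbeatIntervalMs = 10000L  // mirrors spark.executor.heartbeatInterval = 10s
  private val maxFailures = 60              // consecutive failures tolerated before exiting
  private val heartbeatFailureExitCode = 56 // the exit code seen in this issue

  @volatile private var consecutiveFailures = 0

  // Stand-in for the RPC to the driver's HeartbeatReceiver; here it always
  // fails, simulating a driver that never replies within the timeout.
  private def sendHeartbeatToDriver(): Unit =
    throw new RuntimeException("driver did not reply within 10 seconds")

  def main(args: Array[String]): Unit = {
    val heartbeater = Executors.newSingleThreadScheduledExecutor()
    val task = new Runnable {
      override def run(): Unit =
        try {
          sendHeartbeatToDriver()
          consecutiveFailures = 0
        } catch {
          case NonFatal(_) =>
            consecutiveFailures += 1
            if (consecutiveFailures >= maxFailures) {
              System.err.println(
                s"Exit as unable to send heartbeats to driver more than $maxFailures times")
              System.exit(heartbeatFailureExitCode)
            }
        }
    }
    heartbeater.scheduleAtFixedRate(task, 0L, heartbeatIntervalMs, TimeUnit.MILLISECONDS)
  }
}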

duyanghao commented 7 years ago

@erikerlandson I added some debug logging to core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala as below:

diff --git a/core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala b/core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
index 5242ab6..d802dc7 100644
--- a/core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
+++ b/core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala
@@ -121,15 +121,19 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, clock: Clock)

     // Messages received from executors
     case heartbeat @ Heartbeat(executorId, accumUpdates, blockManagerId) =>
+      logInfo(s"executor:$executorId heartbeat $blockManagerId received")
       if (scheduler != null) {
         if (executorLastSeen.contains(executorId)) {
           executorLastSeen(executorId) = clock.getTimeMillis()
           eventLoopThread.submit(new Runnable {
             override def run(): Unit = Utils.tryLogNonFatalError {
+              logInfo(s"executor:$executorId heartbeat $blockManagerId scheduler before")
               val unknownExecutor = !scheduler.executorHeartbeatReceived(
                 executorId, accumUpdates, blockManagerId)
               val response = HeartbeatResponse(reregisterBlockManager = unknownExecutor)
+              logInfo(s"executor:$executorId heartbeat $blockManagerId $response reply before")
               context.reply(response)
+              logInfo(s"executor:$executorId heartbeat $blockManagerId $response reply")
             }
           })
         } else {
@@ -137,7 +141,7 @@ private[spark] class HeartbeatReceiver(sc: SparkContext, clock: Clock)
           // after we just removed it. It's not really an error condition so we should
           // not log warning here. Otherwise there may be a lot of noise especially if
           // we explicitly remove executors (SPARK-4134).
-          logDebug(s"Received heartbeat from unknown executor $executorId")
+          logInfo(s"Received heartbeat from unknown executor:$executorId heartbeat")
           context.reply(HeartbeatResponse(reregisterBlockManager = true))
         }
       } else {

When I ran the job again, the driver produced the following logs (after grepping for "executor:1 heartbeat"):

2017-08-01 03:10:55 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) received
2017-08-01 03:11:08 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) received
2017-08-01 03:11:21 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) received
2017-08-01 03:11:31 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) received
2017-08-01 03:11:44 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) received
2017-08-01 03:11:57 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) received
2017-08-01 03:12:07 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) received
2017-08-01 03:12:20 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) received
2017-08-01 03:12:33 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) received
2017-08-01 03:12:40 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) scheduler before
2017-08-01 03:12:41 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) HeartbeatResponse(false) reply before
2017-08-01 03:12:41 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) HeartbeatResponse(false) reply
2017-08-01 03:12:43 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) received
2017-08-01 03:12:56 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) received
2017-08-01 03:13:09 INFO  HeartbeatReceiver:54 - executor:1 heartbeat BlockManagerId(1, 192.168.5.246, 34451, None) received


At the same time, executor 1 logged the following:

2017-08-01 03:11:05 WARN  NettyRpcEndpointRef:87 - Error sending message [message = Heartbeat(1,[Lscala.Tuple2;@1864841d,BlockManagerId(1, 192.168.5.246, 34451, None))] in 1 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
        at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:538)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:567)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:567)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:567)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1951)
        at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:567)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:190)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81)
        ... 14 more
2017-08-01 03:11:18 WARN  NettyRpcEndpointRef:87 - Error sending message [message = Heartbeat(1,[Lscala.Tuple2;@1864841d,BlockManagerId(1, 192.168.5.246, 34451, None))] in 2 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
        at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:538)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:567)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:567)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:567)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1951)
        at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:567)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:190)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81)
        ... 14 more
2017-08-01 03:11:31 WARN  NettyRpcEndpointRef:87 - Error sending message [message = Heartbeat(1,[Lscala.Tuple2;@1864841d,BlockManagerId(1, 192.168.5.246, 34451, None))] in 3 attempts
org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
        at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:538)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:567)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:567)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:567)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1951)
        at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:567)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:190)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81)
        ... 14 more
2017-08-01 03:11:31 WARN  Executor:87 - Issue communicating with driver in heartbeater
org.apache.spark.SparkException: Error sending message [message = Heartbeat(1,[Lscala.Tuple2;@1864841d,BlockManagerId(1, 192.168.5.246, 34451, None))]
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:119)
        at org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$reportHeartBeat(Executor.scala:538)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply$mcV$sp(Executor.scala:567)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:567)
        at org.apache.spark.executor.Executor$$anon$1$$anonfun$run$1.apply(Executor.scala:567)
        at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1951)
        at org.apache.spark.executor.Executor$$anon$1.run(Executor.scala:567)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.rpc.RpcTimeoutException: Futures timed out after [10 seconds]. This timeout is controlled by spark.executor.heartbeatInterval
        at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
        at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
        at scala.PartialFunction$OrElse.apply(PartialFunction.scala:167)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:83)
        at org.apache.spark.rpc.RpcEndpointRef.askWithRetry(RpcEndpointRef.scala:102)
        ... 13 more
Caused by: java.util.concurrent.TimeoutException: Futures timed out after [10 seconds]
        at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
        at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
        at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
        at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
        at scala.concurrent.Await$.result(package.scala:190)
        at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:81)
        ... 14 more

I suspect the problem lies in the eventLoopThread shown below:

  // "eventLoopThread" is used to run some pretty fast actions. The actions running in it should not
  // block the thread for a long time.
  private val eventLoopThread =
    ThreadUtils.newDaemonSingleThreadScheduledExecutor("heartbeat-receiver-event-loop-thread")
  /**
   * Wrapper over ScheduledThreadPoolExecutor.
   */
  def newDaemonSingleThreadScheduledExecutor(threadName: String): ScheduledExecutorService = {
    val threadFactory = new ThreadFactoryBuilder().setDaemon(true).setNameFormat(threadName).build()
    val executor = new ScheduledThreadPoolExecutor(1, threadFactory)
    // By default, a cancelled task is not automatically removed from the work queue until its delay
    // elapses. We have to enable it manually.
    executor.setRemoveOnCancelPolicy(true)
    executor
  }
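
To make the suspicion concrete, here is a toy sketch (not Spark code) of how a single-threaded scheduled executor behaves: one slow task holds the only worker thread, so every task queued behind it, including heartbeat replies, is delayed past its timeout.

import java.util.concurrent.Executors

// Toy illustration: a single-threaded executor runs submitted tasks strictly
// in order, so one long-running task starves everything queued after it.
object SingleThreadStarvationSketch {
  def main(args: Array[String]): Unit = {
    val eventLoop = Executors.newSingleThreadScheduledExecutor()

    // Stand-in for a handler that blocks (e.g. on a lock or a slow DNS lookup)
    // for 30 seconds while holding the only worker thread.
    eventLoop.submit(new Runnable {
      override def run(): Unit = {
        println("slow task started")
        Thread.sleep(30000)
        println("slow task finished")
      }
    })

    // Ten "heartbeat" tasks submitted right afterwards; none of them can start
    // until the slow task finishes, so each reply would already be late.
    (1 to 10).foreach { i =>
      eventLoop.submit(new Runnable {
        override def run(): Unit =
          println(s"heartbeat $i handled at ${System.currentTimeMillis()} ms")
      })
    }

    eventLoop.shutdown()
  }
}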

JDK version:

java -version
openjdk version "1.8.0_121"
OpenJDK Runtime Environment (IcedTea 3.3.0) (Alpine 8.121.13-r0)
OpenJDK 64-Bit Server VM (build 25.121-b13, mixed mode)

tangzhankun commented 7 years ago

Perhaps we need to find a simple workload to reproduce this. Will it help if we just let the executor sleep there?
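
For example, a minimal sleep-only workload along those lines might look like this (the app name, partition count, and sleep duration are arbitrary assumptions, not taken from this thread):

import org.apache.spark.sql.SparkSession

// Hypothetical reproduction workload: 100 long-sleeping tasks, roughly one per
// executor, intended only to keep executors idle-but-alive for several hours
// so the heartbeat behaviour can be observed.
object SleepWorkload {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("heartbeat-repro-sleep").getOrCreate()
    val sc = spark.sparkContext
    sc.parallelize(1 to 100, 100).foreach { _ =>
      Thread.sleep(4L * 60 * 60 * 1000) // sleep ~4 hours per task
    }
    spark.stop()
  }
}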

hustcat commented 7 years ago

"heartbeat-receiver-event-loop-thread" #41 daemon prio=5 os_prio=0 tid=0x00007f5b17b3a800 nid=0x2d waiting for monitor entry [0x00007f4f9621c000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.spark.scheduler.TaskSchedulerImpl.executorHeartbeatReceived(TaskSchedulerImpl.scala:406)
    - waiting to lock <0x00007f5003f2f1b0> (a org.apache.spark.scheduler.cluster.kubernetes.KubernetesTaskSchedulerImpl)
    at org.apache.spark.HeartbeatReceiver$$anonfun$receiveAndReply$1$$anon$2$$anonfun$run$2.apply$mcV$sp(HeartbeatReceiver.scala:131)
    at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1283)
    at org.apache.spark.HeartbeatReceiver$$anonfun$receiveAndReply$1$$anon$2.run(HeartbeatReceiver.scala:129)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

We found that the driver heartbeat thread is blocked, so it cannot handle heartbeat messages from the executors.

  override def executorHeartbeatReceived(
      execId: String,
      accumUpdates: Array[(Long, Seq[AccumulatorV2[_, _]])],
      blockManagerId: BlockManagerId): Boolean = {
    // (taskId, stageId, stageAttemptId, accumUpdates)
    val accumUpdatesWithTaskIds: Array[(Long, Int, Int, Seq[AccumulableInfo])] = synchronized {
      accumUpdates.flatMap { case (id, updates) =>
        val accInfos = updates.map(acc => acc.toInfo(Some(acc.value), None))
        taskIdToTaskSetManager.get(id).map { taskSetMgr =>
          (id, taskSetMgr.stageId, taskSetMgr.taskSet.stageAttemptId, accInfos)
        }
      }
    }
    dagScheduler.executorHeartbeatReceived(execId, accumUpdatesWithTaskIds, blockManagerId)
  }

However, if only one thread calls this code, why is the thread blocked?
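
For illustration, the kind of contention visible in the jstack output above can be sketched as a toy model (not Spark code): executorHeartbeatReceived synchronizes on the scheduler instance, so the single heartbeat-handler thread blocks whenever some other thread is already holding the scheduler's monitor inside a slow operation.

// Toy model of the monitor contention: the heartbeat handler blocks on the
// scheduler's lock while another thread holds it for a long time.
object SchedulerLockContentionSketch {
  class Scheduler {
    // Stand-in for a scheduling call (e.g. resourceOffers) doing slow work,
    // such as a DNS lookup, while holding the scheduler's monitor.
    def offerResources(): Unit = synchronized {
      Thread.sleep(30000)
    }

    // Stand-in for TaskSchedulerImpl.executorHeartbeatReceived, which also
    // synchronizes on the same scheduler instance.
    def executorHeartbeatReceived(execId: String): Boolean = synchronized {
      println(s"heartbeat from executor $execId processed")
      true
    }
  }

  def main(args: Array[String]): Unit = {
    val scheduler = new Scheduler

    // Thread A grabs the scheduler's monitor and keeps it for 30 seconds.
    new Thread(new Runnable {
      override def run(): Unit = scheduler.offerResources()
    }, "scheduling-thread").start()

    Thread.sleep(1000) // let the scheduling thread acquire the monitor first

    // Thread B, playing the heartbeat event loop, now shows up as BLOCKED,
    // just like "heartbeat-receiver-event-loop-thread" in the jstack above.
    new Thread(new Runnable {
      override def run(): Unit = scheduler.executorHeartbeatReceived("1")
    }, "heartbeat-receiver-event-loop-thread").start()
  }
}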

hustcat commented 7 years ago

@apache-spark-on-k8s/contributors Can anyone offer any suggestions?

tangzhankun commented 7 years ago

@hustcat Could you please share the reproduction steps so I can try it on my cluster?

hustcat commented 7 years ago

We have found the reason, and we will submit a PR to resolve this problem later. :)

duyanghao commented 7 years ago

@kimoonkim The problem is your commit. I dropped the code from your commit, and the job now runs without heartbeat timeouts. However, I can't figure out the purpose of your commit. Could you explain in detail why you added the following code:

...
            val clusterNodeFullName = inetAddressUtil.getFullHostName(clusterNodeIP)
            val pendingTasksClusterNodeFullName = super.getPendingTasksForHost(clusterNodeFullName)
            if (pendingTasksClusterNodeFullName.nonEmpty) {
              logDebug(s"Got preferred task list $pendingTasksClusterNodeFullName " +
                s"for executor host $executorIP using cluster node full name $clusterNodeFullName")
            }
            pendingTasksClusterNodeFullName
...

kimoonkim commented 7 years ago

@duyanghao Thanks for finding this as a potential root cause. We can think about why this commit could cause the issue. Maybe it is making the driver slow to respond to heartbeats and other executor RPCs, thus making timeouts more likely? The InetAddressUtil call involves a DNS lookup, which can be slow if there are a lot of executors. Maybe we should find a way to avoid hitting this code path in the common case. Maybe we can flag-guard it.

This is part of HDFS locality support. The routine getPendingTasksForHost looks up tasks to send to a given executor host. When HDFS is used, we want to send tasks that have their input HDFS data on the executor. To do so, we get the cluster node that the executor is running on and compare the hostname of the node with the datanode hostname.

Normally, this works well without a DNS lookup because both the cluster node name, which we get from the pod data structure, and the datanode name are fully qualified host names. But in some K8s setups, notably GKE, the cluster node name is a short host name. The change specifically addresses that case by getting the full host name from DNS.
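
A possible shape for that flag guard, purely as an illustrative sketch (the class name, the flag, and the caching are assumptions on my part, not the actual change proposed below):

import java.net.InetAddress
import scala.collection.concurrent.TrieMap

// Illustrative sketch only: gate the reverse DNS lookup behind a flag and cache
// the result so each cluster node is resolved at most once per driver.
class HostnameResolver(dnsLookupEnabled: Boolean) {
  private val cache = TrieMap.empty[String, String]

  def fullHostName(clusterNodeIP: String): Option[String] =
    if (!dnsLookupEnabled) {
      None // skip DNS entirely when locality-by-full-hostname is disabled
    } else {
      Some(cache.getOrElseUpdate(
        clusterNodeIP,
        InetAddress.getByName(clusterNodeIP).getCanonicalHostName))
    }
}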

kimoonkim commented 7 years ago

I'll write a PR shortly, flag-guarding the DNS lookup. @hustcat, @duyanghao if you guys can try out the fix, that would be great. Thanks.

kimoonkim commented 7 years ago

PR #412 is posted. @hustcat @duyanghao PTAL.

kimoonkim commented 7 years ago

@hustcat @duyanghao Were you using HDFS as input or output of your Spark jobs? Do you use HDFS in general? If not, I think automatically disabling HDFS locality support, per this thread, is also a possible fix. Please let us know.

hustcat commented 7 years ago

Yeah, it's the DNS lookup that causes the heartbeat timeout. We run HDFS as a separate cluster outside of k8s now, so disabling HDFS locality support is OK for us.

However, I think a DNS lookup is an expensive operation, and it can also fail. I don't know how YARN resolves this. Anyway, we will test your PR. Thanks!

kimoonkim commented 7 years ago

@hustcat Thanks for the answer.

Yes, I agree that the DNS lookup is expensive. The need for it is unique to K8s; YARN does not have this issue. I think I will have to redesign the DNS lookup approach, while disabling this code path for now.