FWIW, running Livy in local[*] or local mode does not help. Maintaining the cluster for now.
Building on this, it would be beneficial to have the ability to stop a Job, in addition to deleting it (which should also stop the Job).
Re: stopping a Spark Job, it might be more effective to stop it in the Spark application as opposed to canceling the statement from Livy.
Using Job.get_spark_jobs, we get a response like this:
[{'duration': 71,
'duration_s': '0:01:11',
'jobGroup': '842',
'jobId': 20,
'name': 'foreachPartition at MongoSpark.scala:117',
'numActiveStages': 1,
'numActiveTasks': 1,
'numCompletedStages': 1,
'numCompletedTasks': 11,
'numFailedStages': 0,
'numFailedTasks': 0,
'numSkippedStages': 0,
'numSkippedTasks': 0,
'numTasks': 412,
'stageIds': [33, 34, 35, 32],
'status': 'RUNNING',
'submissionTime': '2018-10-03T14:16:06.308GMT'}]
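For context, get_spark_jobs presumably pulls this information from the Spark application's monitoring REST API; a minimal sketch of that idea is below (the function name, host, and port are assumptions for illustration, not the project's actual implementation):

```python
import requests

def get_spark_jobs(host="localhost", ui_port=4040):
    """Rough sketch: fetch job-level info from the Spark application's REST API.

    Assumes the driver's web UI is reachable at http://host:4040; the real
    Job.get_spark_jobs implementation may work differently.
    """
    base = "http://%s:%s/api/v1" % (host, ui_port)
    # first application registered with this UI
    app_id = requests.get("%s/applications" % base).json()[0]["id"]
    # per-job metrics, similar in shape to the output shown above
    return requests.get("%s/applications/%s/jobs" % (base, app_id)).json()
```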
Using the jobId attribute from the output above, we can "kill" the Job in the Spark app using the following URL pattern:
http://HOST:4040/jobs/job/kill/?id=22
Done.
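The same kill can be issued programmatically; a minimal sketch with requests (host and port are placeholders for the driver UI, and this relies on spark.ui.killEnabled being true, which is the default):

```python
import requests

def kill_spark_job(job_id, host="HOST", ui_port=4040):
    """Sketch: hit the Spark UI's kill endpoint for a single job.

    Requires spark.ui.killEnabled=true on the application (default).
    """
    url = "http://%s:%s/jobs/job/kill/" % (host, ui_port)
    resp = requests.get(url, params={"id": job_id})
    return resp.status_code
```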
Moving to a Spark cluster, additional calls might be needed to kill a Job.
Until now, this was sufficient to stop a Job:
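(The command itself isn't captured here; presumably it was a statement cancel against Livy's REST API, roughly like the sketch below, with placeholder host and ids:)

```python
import requests

# Assumption: the command was a statement cancel via Livy's REST API.
LIVY_URL = "http://LIVY_HOST:8998"   # placeholder Livy endpoint
session_id, statement_id = 0, 0      # placeholders for the real ids

requests.post("%s/sessions/%s/statements/%s/cancel"
              % (LIVY_URL, session_id, statement_id))
```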
However, that same command now shows the statement in the Livy UI as cancelled, but the job in the Spark Application continues. There is a bit of a disconnect between the Livy statement and the Spark Application Job. However, we know the JobGroup and can kill it from the Spark application with this URL:
This returns:
But even then, the Job continues. The only surefire way is to kill the Livy Session / Spark Application. Is this reason enough to revert back to local[*] mode vs. a standalone cluster?