Closed supernomad closed 1 year ago
Is this added in the Nomad UI, or is it still in development?
+1 to this feature. We really need to shut down hashi-ui and use only the native Nomad UI, but we can't due to the unavailability of rolling restarts.
Yeah @tgross, there are situations where a container depends on a Consul key/value, and after we update the value in Consul, restarting the service is what populates the new value into our container. So we really think this needs to be added to the Nomad UI so we can get rid of hashi-ui; we don't want to maintain two UIs for Nomad.
Are we supposed to assume this is on the roadmap?
+1 to this feature
The way hashi-ui implements this is by injecting a label into the job, which messes with `nomad job plan`, as the same job will result in a change since the local job won't have the injected label.
Yes, they are adding a meta param like a date/time stamp.
A simple CLI subcommand or HTTP API call for this would be very handy.
I ended up getting what I wanted (a rolling restart of an existing application) using the following Python snippet and the Nomad HTTP API:

```python
import time

import requests

NOMAD_URL = 'http://localhost:4646'  # adjust for your cluster
job_id = 'example'                   # the job to restart

# Fetch the current job spec.
get_job_response = requests.get(f'{NOMAD_URL}/v1/job/{job_id}')
job = get_job_response.json()

# Bump a Meta timestamp so the job spec changes and Nomad rolls the update.
if job.get('Meta') is None:
    job['Meta'] = {}
job['Meta']['Restart'] = str(time.time())

payload = {'Job': job, 'PreserveCounts': True}

# Now post it back.
post_job_response = requests.post(f'{NOMAD_URL}/v1/jobs', json=payload)
print('restart job response', post_job_response.json())
```
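The payload construction from the snippet above can be factored into a small pure helper so it's unit-testable without a Nomad server; `with_restart_meta` is a hypothetical name, not part of any Nomad client library:

```python
import time
from typing import Optional


def with_restart_meta(job: dict, ts: Optional[float] = None) -> dict:
    """Build a POST body for Nomad's /v1/jobs endpoint that bumps a
    Meta['Restart'] timestamp, forcing a rolling update of the job."""
    job = dict(job)  # shallow copy so the caller's dict isn't mutated
    meta = dict(job.get('Meta') or {})
    meta['Restart'] = str(ts if ts is not None else time.time())
    job['Meta'] = meta
    return {'Job': job, 'PreserveCounts': True}
```

You would then submit `with_restart_meta(job)` as the JSON body of the POST to `/v1/jobs`.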
Unless I'm overlooking a possible drawback, the command suggested by @mxab looks good to me. You can use either variation of the command and add it to your shell aliases:

```shell
nomad job status <job-name> | awk '{if (/run(.*)running/) {system("nomad alloc restart " $1)}}'
nomad job status <job-name> | awk '/run(.*)running/{print $1}' | xargs -t -n 1 nomad alloc restart
```
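For those who prefer Python over awk, the same filter can be sketched as a small parser over the `nomad job status` output. This is a sketch assuming the default allocation-table layout (ID, Node ID, Task Group, Version, Desired, Status, ...), which may differ between Nomad versions:

```python
import subprocess


def running_alloc_ids(status_output: str) -> list:
    """Extract allocation IDs whose Desired state is 'run' and whose
    client Status is 'running' from `nomad job status` output."""
    ids = []
    for line in status_output.splitlines():
        fields = line.split()
        # Allocation rows look like:
        # ID  Node ID  Task Group  Version  Desired  Status  Created  Modified
        if len(fields) >= 6 and fields[4] == 'run' and fields[5] == 'running':
            ids.append(fields[0])
    return ids


def rolling_restart(job_name: str) -> None:
    """Restart each running allocation of the job, one at a time."""
    out = subprocess.run(
        ['nomad', 'job', 'status', job_name],
        check=True, capture_output=True, text=True,
    ).stdout
    for alloc_id in running_alloc_ids(out):
        subprocess.run(['nomad', 'alloc', 'restart', alloc_id], check=True)
```

Restarting the allocations sequentially gives a crude rolling restart, since each `nomad alloc restart` call blocks until it completes.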
As I understand it, `nomad alloc restart` doesn't re-download artifacts and Docker images? I need to restart a job with an updated Docker image.
Doing some issue cleanup and realizing there's a whole lot of different feature requests being discussed in this issue over the years, many of which landed long ago. I'm going to re-title this issue to narrow the scope to the remaining request.
With the command above I cannot restart a task that failed:

```
$ nomad job restart -task nginx-task portal
==> 2024-02-23T11:13:56-05:00: Restarting 1 allocation
    2024-02-23T11:13:56-05:00: Restarting task "nginx-task" in allocation "27caddf2" for group "services"
==> 2024-02-23T11:13:56-05:00: Job restart finished with errors
1 error occurred while restarting job:
  * Error restarting allocation "27caddf2": Failed to restart task "nginx-task": Unexpected response code: 500 (Task not running)

$ nomad alloc restart 27caddf2
Failed to restart allocation:
Unexpected response code: 500 (restart of an alloc that should not run)
```

It is not clear how to restart a failed task.
Please open a new issue for that. This issue is many years old and closed :)
So I would love the ability to restart tasks, at the very least restart an entire job, but preferably single allocations. This is very useful for when a particular allocation or job happens to get in a bad state.
I am thinking something like `nomad restart <job>` or `nomad alloc-restart <alloc-id>`. One of my specific use cases: I have a cluster of RabbitMQ nodes, and at some point one of the nodes gets partitioned from the rest of the cluster. I would like to restart that specific node (allocation, in Nomad parlance), or be able to perform a rolling restart of the entire cluster (job, in Nomad parlance).
Does this sound useful?