Open felka opened 6 years ago
@felka can you provide the template (even a sanitized version is fine). Nomad uses consul-template which does a blocking query to read the key. I suspect that this is causing the template to never finish rendering. This is expected behavior though given how consul template works.
@preetapan Thanks for the prompt answer! I understand it is expected behavior. Also, we are using consul-template 0.14, which has a non-blocking query in "key". However, I think there should be a config option that lets the job fail instead of staying in a pending/running state and holding an allocation. Blocking an allocation because of a missing key seems too aggressive as a default.
I would definitely like the option for the nomad job to fail if keys are missing (maybe as an extra param to the template stanza).
I had the same issue while evaluating nomad and resorted to doing a pre-flight check on the manifest using this script.
If the output is not empty then you have missing keys and can throw an error before attempting the deploy.
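The script referenced above is not included in this thread. As a rough illustration of the idea only (not the commenter's actual script), a pre-flight check could look like the following sketch. The function names, the regular expressions, and the in-memory set standing in for a Consul KV lookup are all assumptions made for this example; in practice `key_exists` would query Consul.

```python
import re
from typing import Callable, List

# Hypothetical helpers: match template actions such as {{ key "some/path" }}.
ACTION_RE = re.compile(r"\{\{(.*?)\}\}", re.DOTALL)
# Matches the blocking `key` call only; keyOrDefault and keyExists are skipped
# because `key` must be followed by whitespace.
KEY_CALL_RE = re.compile(r'\bkey\s+"([^"]+)"')


def required_keys(template_text: str) -> List[str]:
    """Return the paths read with `key`, in order of first appearance."""
    keys: List[str] = []
    for action in ACTION_RE.findall(template_text):
        for path in KEY_CALL_RE.findall(action):
            if path not in keys:
                keys.append(path)
    return keys


def missing_keys(template_text: str, key_exists: Callable[[str], bool]) -> List[str]:
    """Return the required keys for which key_exists(path) is False."""
    return [k for k in required_keys(template_text) if not key_exists(k)]


# Example template text; the first path is taken from the error shown in this
# issue, the second is made up for illustration.
example_template = """
producer = {{ key "path_to_key/producerConfigs" }}
retries = {{ keyOrDefault "path_to_key/retries" "3" }}
"""

# Assumption: a plain set stands in for the Consul KV store.
present_keys = {"path_to_key/retries"}

for path in missing_keys(example_template, present_keys.__contains__):
    print(path)  # prints: path_to_key/producerConfigs
```

As in the description above, empty output means every required key is present; any printed path is a missing key and a reason to abort before deploying.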
@felka - Do you use keyOrDefault in your template? https://github.com/hashicorp/consul-template#keyordefault - that should let it progress with a default value instead of a blocking query.
As I understand it, the details of your template are opaque to Nomad and it only executes it. Consul template has three options: a blocking query, a keyExists check so that you can do flow control based on whether a key is present, and keyOrDefault, which lets you provide a default value. If the template execution does not terminate, Nomad cannot determine the right behavior. For example, what if you want to wait for something else to populate that key so that the config file for the job can be populated correctly, and the job can then run after that? In that case, it's acceptable to block allocations until it's ready to proceed.
That being said, we will discuss this internally to see what else we can do here.
How about adding an error_on_missing_key option to the template stanza?
It seems this should be supported in consul-template now? Not sure if the issue is here or in https://github.com/hashicorp/terraform-provider-nomad but attempting to use it in template gives:
template -> invalid key: error_on_missing_key
Since Consul Template supports error_on_missing_key, we should be able to support this by adding the key to the Consul Template config struct and threading the value through to Consul Template.
I am going to add a help-wanted and good-first-issue label in case somebody wants to take this!
Nomad version
0.7.0-rc3
Operating system and Environment details
ubuntu 16.04
Issue
We are working on moving our cronjobs to Nomad. This includes scheduling Scala jars using native exec (no containers). The job itself contains a template stanza which renders a job config file with keys from Consul before execution. If a key is missing, the job gets stuck in the pending state and prevents other allocations from running. However, newly added jobs seem to run while older pending jobs don't get allocated. There is also an issue in the UI, which shows the stuck jobs in the "running" state.
Reproduction steps
1. Add 100 new jobs with a missing key, so that there are both running and pending jobs.
2. Wait until some of the failing jobs are allocated.
3. Add new working jobs without a missing Consul key to test allocation.
Error in UI
rv-job pending Missing: kv.block(path_to_key/producerConfigs)
Job file