Closed: EugenKon closed this issue 1 year ago
Hi @EugenKon 👋
This is the expected default behaviour. From the docs:
By default, on successful job submission the run command will enter an interactive monitor and display log information detailing the scheduling decisions, placement information, and deployment status for the provided job if applicable (batch and system jobs don't create deployments). The monitor will exit after scheduling and deployment have finished or failed.
Since the job deployment is being monitored, the command will exit once the deployment fails or succeeds. For example, if capacity becomes available then the deployment will be able to complete.
You can use the -detach flag to skip deployment monitoring.
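For example, something along these lines submits the job and returns right away, printing the evaluation ID instead of entering the interactive monitor (the file name here is just a placeholder):

$ nomad job run -detach nomad-postgres.nomad.hcl

You can then check on the job later with nomad job status or nomad eval status.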
@lgfa29
Probably it would be helpful if, in this case, nomad job run displayed a message like "waiting for nodes with available resources", so that it does not look like it has halted.
Or replace Deployment "fec89547" in progress... with Deployment "fec89547" waiting for available nodes... or Deployment "fec89547" waiting for nodes with the resources requested by the job...
It is confusing to see that the deployment is "in progress" when actually it is not.
This is what the output below is describing:
==> 2023-08-22T10:38:53-04:00: Evaluation "6c9de6e6" finished with status "complete" but failed to place all allocations:
2023-08-22T10:38:53-04:00: Task Group "cache" (failed to place 1 allocation):
* No nodes were eligible for evaluation
2023-08-22T10:38:53-04:00: Evaluation "ae4156ba" waiting for additional capacity to place remainder
The deployment is still in progress, and it will complete once additional capacity is available.
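If you want to inspect that blocked evaluation outside of the monitor, something like this should show its status and the placement failures (using the evaluation ID from the output above):

$ nomad eval status ae4156ba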
OK, to me it is ambiguous. If the log says that the status changes, then this change should be signalled accordingly.

For example, when I am going to work, all of those steps are "going" (similar to Deployment in progress), but my actual status is walking, waiting, sitting, or reaching. It would be more informative if your tool displayed a more exact status instead of the general "in progress", because until the command exits it is "in progress" anyway.
I also noticed that the confusing Deployment "xxx" in progress monitor does not exit even if the config was not parsed by Nomad:
job "nomad-postgres" {
datacenters = ["dc1"]
type = "service"
group "backend" {
count = 1
network {
# mode = "host"
port "db" {
static = 5432
to = 5432
}
}
# https://developer.hashicorp.com/nomad/tutorials/stateful-workloads/stateful-workloads-host-volumes#create-the-job-file
volume "postgres-volume" {
type = "host"
read_only = false
source = "vol-postgres"
}
volume "postgres-backup-volume" {
type = "host"
read_only = false
source = "vol-postgres-backup"
}
task "postgres-task" {
driver = "docker"
# https://developer.hashicorp.com/nomad/docs/drivers/docker
config {
force_pull = false
image = "private/ourpostgres"
# ports = ["db"]
# network_mode = "host"
# https://developer.hashicorp.com/nomad/docs/drivers/docker#authentication
# https://developer.hashicorp.com/nomad/docs/drivers/docker#client-requirements
auth {
username = ""
password = ""
}
command = postgres
args = [
"-d", "5",
"-c", "max_connections=50",
"-c", "shared_buffers=256MB",
"-c", "logging_collector=on",
"-c", "log_destination=stderr",
# -c log_directory=/logs
]
# sysctl = {
# "net.core.somaxconn" = "16384"
# "kernel.shmmax" = 1610612736
# "kernel.shmall" = 393216
# }
# labels {
# foo = "bar"
# zip = "zap"
# }
# mount {}
# devices {}
# cap_add {}
}
volume_mount {
volume = "postgres-volume"
destination = "/dbdata/postgres"
read_only = false
}
volume_mount {
volume = "postgres-backup-volume"
destination = "/backup"
read_only = false
}
env = {
SERVICE_NAME = "postgres-node"
SERVICE_ID = 1
API_DEBUG = true
S3_REGION = us-west-2
CONTINOUS_BACKUP = s3
BACKUP_S3 = true
}
service {
tags = ["postgres"]
name = "postgres-service"
port = "db"
provider = "consul"
}
}
}
}
So Nomad should not try to deploy, because it is not even known what to deploy.
OK, to me it is ambiguous. If the log says that the status changes, then this change should be signalled accordingly.
The status that changed was the evaluation's, which is in the green rectangle. The deployment status did not change and will remain "running" until it either completes or fails.
Check the glossary and the scheduling concepts page if you would like to learn more about them.
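As a rough illustration of the difference, the two objects can be inspected separately using the IDs from earlier in this thread:

# The evaluation finished with status "complete"; the scheduling decision has been made.
$ nomad eval status 6c9de6e6

# The deployment is a separate object and stays "running" until it completes or fails.
$ nomad deployment status fec89547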
So Nomad should not try to deploy, because it is not even known what to deploy.
Nomad is a complex system with many moving parts. At the time the job is submitted, it's not possible to know whether the config block is correct, because that is validated by the task driver, which runs on the clients.
https://github.com/hashicorp/nomad/issues/18271 is something that can help with this.
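As an aside, syntax and job-structure problems can usually be caught locally before submission, for example:

$ nomad job validate nomad-postgres.nomad.hcl
$ nomad job plan nomad-postgres.nomad.hcl

Neither of these can check the Docker config block itself, though, for the reason above: that part is only validated by the driver on the client.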
Nomad version
Output from nomad version
Operating system and Environment details
Issue
When there are no clients, nomad job run never exits (see actual results below).
Reproduction steps
Run only servers. Actually, I configured the Nomad client to mount a host volume into the Docker driver but forgot to create that directory on the host machine, so the Nomad client service was not able to start.
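For example, a quick way to end up with no eligible clients is to start an agent in server-only mode (a rough sketch; server-only.hcl is assumed to set server { enabled = true } and client { enabled = false }):

$ nomad agent -config server-only.hcl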
Expected Result
The command should exit after some timeout or with an error.
Actual Result
UI:
Nomad Client logs (if appropriate)
Actual and expected error message: