Opened by kneufeld (status: Open)
Hi @kneufeld! Is your check_restart.limit set to 1? From the documentation, it looks like we shouldn't be checking at all until the grace period has passed. Can you provide the jobspec?
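For reference, here's a minimal sketch of the stanza in question (field names are per the check_restart documentation; the values are illustrative only):

```hcl
check {
  type     = "tcp"
  interval = "10s"
  timeout  = "2s"

  check_restart {
    # Restart the task after this many consecutive failed health checks.
    limit = 1

    # Wait this long after the task starts (or restarts) before
    # failed checks count toward the restart limit.
    grace = "60s"
  }
}
```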
@tgross As a matter of fact, it is...
I'll email the jobspec.
Just wanted to circle back on this @kneufeld: I've been trying to reproduce the problem by incrementally building up a jobspec with all the features of the one you provided, and I haven't yet hit on the source of the issue. I see you have #9307 open as well, so hopefully folks in that thread will have more data for you.
Thank you for taking a look. This isn't a major issue, just something I felt should be documented.
I'm seeing a similar thing happen with one of my jobs. I have a MySQL container running, and MySQL does a temporary startup/shutdown cycle before starting properly. To cover this, based on the docs, the grace setting in the check_restart stanza should delay acting on failing health checks while the service starts up. But that's not happening: I'm seeing failed health check notices in Consul's log output even though the 60s grace period has not passed.
Nomad version: v1.0.4 (9294f35f9aa8dbb4acb6e85fa88e3e2534a3e41a)
Consul version: v1.9.4 (10bb6cb3b)
Here's a timeline of the data, based on different data sources:
| Date/Time | Nomad | Consul | MySQL |
|---|---|---|---|
| 14:28:14 | Task received by client | | |
| 14:28:14 | Building task directory | | |
| 14:28:14 | Downloading image | | |
| 14:28:54 | Task started by client | | |
| 14:28:54 | | | [temporary] init server in progress |
| 14:29:00 | | Health check failed | |
| 14:29:03 | | | [temporary] ready for connections |
| 14:29:07 | | | [temporary] shutdown started |
| 14:29:09 | | | [temporary] shutdown complete |
| 14:29:10 | | Health check failed | |
| 14:29:10 | | | init server in progress |
| 14:29:11 | | | ready for connections |
job "mysql-server" {
datacenters = ["dc1"]
type = "service"
group "mysql-server" {
count = 1
network {
port "db" {
static = 3306
}
}
task "mysql-server" {
driver = "docker"
config {
image = "mysql:8.0.22"
ports = ["db"]
# Add nice cap to prevent minutely logging
cap_add = ["SYS_NICE"]
}
env {
MYSQL_ROOT_PASSWORD = "password"
}
resources {
cpu = 100
memory = 1000
}
service {
name = "mysql-server"
port = "db"
check {
type = "tcp"
interval = "10s"
timeout = "2s"
check_restart {
grace = "60s"
}
}
}
}
}
}
I can confirm that on Nomad v1.0.6 (592cd4565bf726408a03482da2c9fd8a3a1015cf) updating a task leads to restarts due to slow startup. We also have a check_restart.grace that is longer than the time the application needs to become ready, but we see the same problem and a similar timeline to what @spaulg describes.
We can confirm the same issue on Nomad v1.1.4.
We see the same issue with 1.5.1. The grace period is not honored; the task is marked as unhealthy just a few seconds after Nomad reports "task started".
Anything we can do to help with this?
Job spec:
```json
{
"ID": "application",
"Name": "application",
"Type": "service",
"Datacenters": [
"dev"
],
"Meta": {
"timestamp": "1681980389",
"version": "cb84602"
},
"Constraints": [
{
"LTarget": "${node.class}",
"RTarget": "build",
"Operand": "!="
}
],
"Update": {
"Stagger": 10000000000,
"MaxParallel": 1,
"HealthCheck": "checks",
"MinHealthyTime": 25000000000,
"HealthyDeadline": 100000000000,
"AutoRevert": true,
"AutoPromote": true,
"Canary": 1
},
"TaskGroups": [
{
"Name": "application",
"Count": 1,
"Networks": [
{
"Mode": "bridge",
"DynamicPorts": [
{
"Label": "healthcheck",
"Value": 0,
"To": -1,
"HostNetwork": "default"
},
{
"Label": "envoy-metrics",
"Value": 0,
"To": 9102,
"HostNetwork": "default"
}
]
}
],
"Services": [
{
"Name": "application",
"Port": 80,
"PortLabel": "80",
"AddressMode": "auto",
"Tags": [
"namespace=default",
"service=application",
"awesomejobid=e27853c7-8a3b-4332-9efc-8aa835775c93",
"external",
"mesh",
"prometheus_metrics_path=/metrics",
"prometheus_metrics_address=${attr.unique.network.ip-address}:${NOMAD_PORT_metrics}",
"prometheus_metrics=false"
],
"Checks": [
{
"Type": "http",
"Protocol": "http",
"Port": null,
"Path": "/api/health",
"AddressMode": "",
"Interval": 10000000000,
"Timeout": 2000000000,
"CheckRestart": {
"Limit": 2,
"Grace": 30000000000,
"IgnoreWarnings": true
},
"PortLabel": "healthcheck",
"Expose": true
}
],
"Connect": {
"Native": false,
"SidecarService": {
"Tags": [
"mesh",
"ingress",
"prometheus_metrics_address=${NOMAD_HOST_ADDR_envoy-metrics}",
"prometheus_metrics_path=/metrics"
],
"Proxy": {
"ExposeConfig": {
"Path": [
{
"Path": "/api/health",
"Protocol": "http",
"LocalPathPort": 80,
"ListenerPort": "healthcheck"
}
]
},
"Upstreams": [
{
"DestinationName": "backend-app",
"LocalBindPort": 13370
}
],
"Config": {
"envoy_prometheus_bind_addr": "0.0.0.0:9102",
"local_request_timeout_ms": 0
}
}
},
"SidecarTask": {
"ShutdownDelay": 10000000000,
"KillTimeout": 10000000000,
"Config": {
"labels": [
{
"environment": "dev",
"namespace": "default",
"service": "application-connect-proxy"
}
]
},
"Resources": {
"CPU": 50,
"MemoryMB": 96
}
}
}
}
],
"Tasks": [
{
"Name": "application",
"Vault": null,
"Driver": "docker",
"Config": {
"network_mode": null,
"image": "the-image:123abc",
"volumes": [
"/usr/local/share/ca-certificates/:/certs/"
],
"logging": [],
"labels": []
},
"Services": null,
"Env": {
"NAMESPACE": "default"
},
"Templates": [],
"Resources": {
"CPU": 100,
"MemoryMB": 256
},
"RestartPolicy": {
"Attempts": 10,
"Interval": 1800000000000,
"Delay": 15000000000,
"Mode": "fail"
},
"Artifacts": [],
"KillTimeout": 10000000000,
"ShutdownDelay": 10000000000
}
],
"EphemeralDisk": {
"SizeMB": 300
}
}
]
}
```
Hi, I noticed the same thing. I was expecting Nomad to wait for the grace period before restarting the allocation, but it didn't.
This didn't happen on the first start; it happened after the application had been running for some time and then failed. During the restart, it waited less than grace + (check interval * limit).
Looking at the code and at the logs, I noticed one thing that I don't know is intentional (if so, I didn't see it documented) or a bug: in the case of a restart after the first boot, the grace interval includes the restart delay. So if the delay is 25s and the check_restart grace is 60s, the application gets a useful grace of only 35 seconds, and after that the check_restart limits are applied (check interval * (limit - 1)).
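To make the interaction described above concrete, here is a minimal group-level sketch (illustrative names and values only) of the two stanzas whose timings overlap:

```hcl
group "app" {
  network {
    port "http" {}
  }

  restart {
    # Nomad waits this long before restarting a failed task...
    delay = "25s"
  }

  service {
    name = "app"
    port = "http"

    check {
      type     = "tcp"
      interval = "10s"
      timeout  = "2s"

      check_restart {
        limit = 3
        # ...while the 60s grace apparently starts counting before the
        # 25s restart delay has elapsed, leaving an effective grace of
        # roughly 60s - 25s = 35s after the task actually starts.
        grace = "60s"
      }
    }
  }

  task "app" {
    driver = "docker"

    config {
      # hypothetical image, for illustration only
      image = "example/app:latest"
      ports = ["http"]
    }
  }
}
```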
Hey @jorgemarey, thanks for the sleuthing regarding the restart delay and grace. I think we'll probably want to change the behavior, or at least document it clearly if not. Added to our todo list, but not in the immediate todo column.
It will realistically be at least a month before we could take this up, so if anybody wants to take a crack at making sure the grace period accounts for the delay, let us know, we'd accept a PR!
Hi @mikenomitch, I'll try to make a PR for this if I find the time.
Hi @mikenomitch, I was looking into this again. I saw that this only happens when the check is defined at the group level (if it's defined at the task level, the delay is applied correctly). I think the problem is that the group service hook is executed in the prerun stage, so the checkRestarter is registered before the restart is completed.
Could the group_service_hook's RegisterWorkload be moved to the poststart phase, as it is in the service_hook? I don't know if this conflicts with other things, but it should solve this issue, at least the part I'm referring to: that check_restart.grace includes the restart.delay.
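If that is indeed the cause, a possible interim workaround is to define the service (and its check_restart) at the task level, where the previous comment reports the delay is applied correctly. A minimal sketch with illustrative names and values:

```hcl
task "app" {
  driver = "docker"

  config {
    # hypothetical image, for illustration only
    image = "example/app:latest"
    ports = ["http"]
  }

  # Registering the service here (task level) instead of at the group
  # level reportedly makes the restart delay count as expected.
  service {
    name = "app"
    port = "http"

    check {
      type     = "tcp"
      interval = "10s"
      timeout  = "2s"

      check_restart {
        limit = 3
        grace = "60s"
      }
    }
  }
}
```

Note that this doesn't help jobs that need group-level services, for example anything using Consul Connect sidecars.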
Nomad version
Nomad v1.0.0-beta3 (fcb32ef7ba037b2b83e7a0fd1c53c7410a2990db)
Operating system and Environment details
Ubuntu 18.04.5 LTS
Issue
During startup, the task is occasionally killed for being unhealthy despite the following settings (see the sketch after this list for where they sit in the jobspec):
- check_restart { grace = "30s" }
- migrate { healthy_deadline = "10m" }
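Here is that sketch (field names are per the Nomad docs; everything else is illustrative):

```hcl
group "app" {
  network {
    port "http" {}
  }

  # Deadline for allocations to become healthy when migrating
  # off a draining node.
  migrate {
    healthy_deadline = "10m"
  }

  task "app" {
    driver = "docker"

    config {
      # hypothetical image, for illustration only
      image = "example/app:latest"
      ports = ["http"]
    }

    service {
      name = "app"
      port = "http"

      check {
        type     = "tcp"
        interval = "10s"
        timeout  = "2s"

        check_restart {
          # Failed checks during this window after the task starts
          # should not count toward a restart.
          grace = "30s"
        }
      }
    }
  }
}
```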
Reproduction steps
Only happens occasionally
Job file (if appropriate)
Complicated job with 5 tasks, 1 job per task, connect proxies between them. Can email if desired.
Nomad Client logs (if appropriate)
From hashi-ui:
Nomad Server logs (if appropriate)
Can't find anything useful. Hashi-ui's relative timestamps vs journald's absolute timestamps make correlating them annoying. This is actually from Friday (3 days ago).