Open hbceylan opened 6 years ago
Hi! In order to get community help with this would you mind posting on either the users mailing list users@dcos.io or Slack at chat.dcos.io? I don't know too much about Traefik but you might find someone there who does 🙂
@deric ^
@hbceylan Which Traefik package version do you use?
In the latest version there's a healthcheck configured on $PORT0:
"healthChecks": [
  {
    "gracePeriodSeconds": 20,
    "intervalSeconds": 5,
    "maxConsecutiveFailures": 2,
    "portIndex": 0,
    "timeoutSeconds": 2,
    "delaySeconds": 15,
    "protocol": "MESOS_HTTP",
    "path": "/ping"
  }
],
In your case it appears that port 80 ("portIndex": 0) is used for public connections and does not respond to /ping (the healthcheck request). Port 8080 is probably the "admin" interface entrypoint, which is configured to respond to healthchecks. Judging from the screenshot you should probably use:
"portIndex": 1,
or reorder ports, so that healthchecks will pass (check error log). Also when you use:
"upgradeStrategy": {
  "minimumHealthCapacity": 0.5
},
it means that you'll need at least 2 public nodes, because you're allocating the fixed ports 80, 443 and 8080, which can't be allocated to multiple instances on the same node at the same time. When restarting the task, Marathon will kill one instance, stage the new task and wait until its healthcheck passes, then restart the remaining instance(s).
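Putting the two suggestions together, the relevant part of the app definition might look roughly like the sketch below. It assumes that port 8080 (the entrypoint answering /ping) really is the second port in your port list, which is only a guess based on the screenshot; the timing values are simply copied from the package defaults quoted above and are not tuned for your cluster.

```json
{
  "healthChecks": [
    {
      "gracePeriodSeconds": 20,
      "intervalSeconds": 5,
      "maxConsecutiveFailures": 2,
      "portIndex": 1,
      "timeoutSeconds": 2,
      "delaySeconds": 15,
      "protocol": "MESOS_HTTP",
      "path": "/ping"
    }
  ],
  "upgradeStrategy": {
    "minimumHealthCapacity": 0.5
  }
}
```

If the port order in your definition is different, adjust "portIndex" so that it points at whichever port serves /ping. The rest of the app definition (ports, instances, container settings) is left out here and stays as you already have it.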
What happens when Traefik instances are restarted on DC/OS? Do our microservices become unreachable? Yes! How can I handle this?