Closed drscre closed 7 years ago
Hey, I am going to close this since a lot of the issues around Docker images were fixed in 0.5.2+. As for why it gets restarted: the server detects that the client is gone and tries to place the task group on a new machine. Your cluster just happens to be small enough that it was placed on the same node again.
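To make the explanation above concrete, here is a minimal toy model of the behaviour described: allocations on an unreachable node are marked lost, and a replacement is placed on whichever node is reachable. This is only a sketch and not Nomad's scheduler code; `Node`, `Allocation`, `mark_lost` and `reschedule` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    reachable: bool = True  # hypothetical flag: can the server reach this client?


@dataclass
class Allocation:
    task_group: str
    node: str
    status: str = "running"


def mark_lost(allocs, nodes):
    """Mark running allocations on unreachable nodes as lost."""
    down = {n.name for n in nodes if not n.reachable}
    for alloc in allocs:
        if alloc.node in down and alloc.status == "running":
            alloc.status = "lost"


def reschedule(allocs, nodes):
    """Return a replacement for each lost allocation, if any node is reachable."""
    ready = [n.name for n in nodes if n.reachable]
    if not ready:
        return []  # nowhere to place anything yet
    return [Allocation(a.task_group, ready[0]) for a in allocs if a.status == "lost"]


# A cluster with a single client, as in this issue.
nodes = [Node("client-1")]
allocs = [Allocation("config-service", "client-1")]

nodes[0].reachable = False               # connection to the client fails
mark_lost(allocs, nodes)
pending = reschedule(allocs, nodes)      # no reachable node, so nothing is placed

nodes[0].reachable = True                # the client reconnects
replacements = reschedule(allocs, nodes)

print(allocs[0].status, [r.node for r in replacements])
```

With only one client, the only candidate for the replacement is the node that originally ran the task group, which is why the restart shows up on the same machine.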
Nomad v0.5.1
I have two machines: a Nomad server (bootstrap = 1) and a Nomad client which runs Docker tasks.
I was investigating why my tasks ended up dead after running for a while.
It boiled down to a connectivity issue. When the network connection between the client and the server fails, the task group enters the "lost" state. When the client machine later rediscovers the Nomad server, Nomad presumably tries to restart the task group and fails with "Failed to create container: no such image".
Docker log (registry.lingualeo-funk.com/config-service:dev-301 is the image used by the task):
Nomad status during connection failure:
Nomad status after connection is back again:
PS. I don't quite get why the Nomad client tries to restart the task. Why not just leave it running?