quintilesims / layer0

Build, Manage, and Deploy Your Applications
Apache License 2.0

Container exit codes should be non-0 when hitting "CannotPullContainerError" #389

Closed zpatrick closed 6 years ago

zpatrick commented 7 years ago

Expected behavior

A task's exit code should be non-zero when a CannotPullContainerError occurs.

Actual behavior

When tasks get a CannotPullContainerError, the associated exit code is 0 (since the task never started).

{
    "container_name": "foo",
    "exit_code": 0,
    "last_status": "STOPPED",
    "reason": "CannotPullContainerError: Error: image foo/foo not found"
}

diemonster commented 6 years ago

In testing, I noticed some additional behavior we should consider fixing.

For one, CannotPullContainerError doesn't actually return an exit code, which is why we return 0. That's easy enough to fix; however, we're handling other non-zero exit codes for containers oddly. For example, I created a task definition with a container COMMAND that would intentionally return a 1 exit code (e.g. cat file). It showed up as exit code 1 in the AWS ECS Console, but we still return "COMPLETED" via Layer0.
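
A minimal sketch of the first fix in Go, assuming the missing exit code can be modeled as a nil pointer. The Container struct, the resolveExitCode helper, and the placeholder value 1 are hypothetical names and choices made for illustration; they are not Layer0's actual model or code.

```go
package main

import (
	"fmt"
	"strings"
)

// Container is a hypothetical, simplified stand-in for the container record
// shown in this issue; it is not Layer0's actual model.
type Container struct {
	Name       string
	ExitCode   *int // nil when no exit code was reported (the container never started)
	LastStatus string
	Reason     string
}

// pullFailureExitCode is an arbitrary non-zero placeholder chosen for this sketch.
const pullFailureExitCode = 1

// resolveExitCode returns the reported exit code when one exists. When none
// was reported and the reason names CannotPullContainerError, it returns a
// non-zero placeholder instead of defaulting to 0.
func resolveExitCode(c Container) int {
	if c.ExitCode != nil {
		return *c.ExitCode
	}
	if strings.HasPrefix(c.Reason, "CannotPullContainerError") {
		return pullFailureExitCode
	}
	return 0
}

func main() {
	c := Container{
		Name:       "foo",
		LastStatus: "STOPPED",
		Reason:     "CannotPullContainerError: Error: image foo/foo not found",
	}
	fmt.Println(resolveExitCode(c)) // prints 1
}
```

A reported exit code always wins here, so a container that really did exit with 0 is never overridden by the placeholder.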

It seems we need another column that gives users the container's exit code (or lack thereof). Otherwise, I'm not clear on what we're really telling users via the "STATUS" column other than that the task was processed by the ECS API.
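
As a rough illustration of such a column, here is a short Go sketch. The exitCodeColumn helper, the "EXIT CODE" header, and the "-" marker for a missing code are assumptions made for this example, not existing Layer0 output.

```go
package main

import (
	"fmt"
	"strconv"
)

// exitCodeColumn renders the value for a hypothetical EXIT CODE column.
// A nil pointer means no exit code was reported, so a "-" marker is shown
// rather than a misleading 0.
func exitCodeColumn(exitCode *int) string {
	if exitCode == nil {
		return "-"
	}
	return strconv.Itoa(*exitCode)
}

func main() {
	failed := 1
	// Print a small table: one container with no exit code, one that exited with 1.
	fmt.Printf("%-10s %-10s %s\n", "CONTAINER", "STATUS", "EXIT CODE")
	fmt.Printf("%-10s %-10s %s\n", "foo", "STOPPED", exitCodeColumn(nil))
	fmt.Printf("%-10s %-10s %s\n", "bar", "STOPPED", exitCodeColumn(&failed))
}
```

Keeping "no exit code" visually distinct from 0 lets users tell a container that never started apart from one that ran and succeeded.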

zpatrick commented 6 years ago

I'm not seeing any unintended behavior in the first paragraph: the task finished running, it exited with code 1, and we reported that information back to the user. There is already an ExitCode field in the Container model in the api-refactor branch.

diemonster commented 6 years ago

Will address the UX issue here: https://github.com/quintilesims/layer0/issues/508

tribaljack commented 6 years ago

fixed, closing