Every now and then a build fails because of a (transient) problem retrieving an image.
It would be nice to retry the image download, e.g., make three attempts (perhaps with back-offs of 5 s and 30 s respectively) and fail only if the image is still unavailable after that. I'm not sure what Docker reports, but we see 5xx errors when an image cannot be fetched. If those can be distinguished from 4xx errors, the 4xx cases could fail immediately: there is no point in retrying once the other side has given a definite answer.
This Fawkes build #73 shows the problem. Simply retrying the step a little later "fixed" it. Although this happened with 1.4.0, the current code is unchanged in this respect and still tries only once: https://github.com/buildkite-plugins/docker-buildkite-plugin/blob/a9ca37e8dab9fb248ec2f2e8cae27787d7b25856/hooks/command#L180-L183
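
To make the request concrete, here is a rough sketch of what I have in mind. It is not taken from the plugin: `retry_with_backoff` and `RETRY_DELAYS` are names I made up for illustration. It also retries on any non-zero exit status, because I don't know whether the Docker CLI exposes the HTTP status in a way that would let 4xx and 5xx be told apart reliably.

```bash
#!/usr/bin/env bash
# Sketch only: retry a command with a fixed back-off schedule.
# retry_with_backoff and RETRY_DELAYS are hypothetical names, not plugin API.

retry_with_backoff() {
  # Delays (in seconds) between attempts; two delays means three tries in total.
  local -a delays
  read -r -a delays <<< "${RETRY_DELAYS:-5 30}"

  local max_attempts=$(( ${#delays[@]} + 1 ))
  local attempt status

  for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
    status=0
    "$@" || status=$?          # run the command, remember its exit status

    if (( status == 0 )); then
      return 0                 # success, no further attempts needed
    fi

    if (( attempt < max_attempts )); then
      echo "Attempt ${attempt}/${max_attempts} failed (exit ${status}); retrying in ${delays[attempt-1]}s" >&2
      sleep "${delays[attempt-1]}"
    fi
  done

  # All attempts failed: pass on the exit status of the last one.
  return "$status"
}
```

The pull in the hook could then be wrapped as `retry_with_backoff docker pull "${image}"`, where `image` stands in for whatever variable the hook actually uses. Keeping the schedule in one variable would make it easy to expose as a plugin option later.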