screwdriver-cd / screwdriver

An open source build platform designed for continuous delivery.
http://screwdriver.cd

[Bug] (HTTP code 404) no such container - No such image: screwdrivercd/launcher #1513

Open bmauldin opened 5 years ago

bmauldin commented 5 years ago

Environment:

Screwdriver Service Versions:

STORE_VERSION=v3.3.16 UI_VERSION=v1.0.366 API_VERSION=v0.5.555 LAUNCH_VERSION=v5.0.56

What happened:

When the Screwdriver stack is first deployed to Docker Swarm (the Swarm version does not matter) and the very first pipeline is run, the build hangs at sd-setup-init and the UI shows the following error:

(HTTP code 404) no such container - No such image: screwdrivercd/launcher

When I run 'docker container list -a', I do not see the Screwdriver launcher container or the job's container (i.e. node:9 in my case). After waiting a little while, I run 'docker image list' and see both the launcher image and the node image in my local image store, indicating that they were pulled successfully.

If I run the job a second time, it runs through successfully. I've looked through the launcher code a bit (no previous JavaScript experience), and from what I can tell the problem has to do with the API timing out before the images are fully pulled from Docker Hub. Something tries to create the container before the launcher and job images have finished downloading, and then crashes out.

What you expected to happen:

The job to start and run.

How to reproduce it:

  1. Clear out all Docker containers and images from all Swarm nodes:

    docker container rm $(docker container list -a -q)
    docker image rm $(docker image list -q)

  2. Deploy the Screwdriver stack as defined in your given docker compose file:

    docker stack deploy -c <compose file path> screwdriver

  3. Add a new pipeline for any repo with a screwdriver file.

  4. Start a new build. The job will hang and the UI will throw the error described above.

jithine commented 5 years ago

I believe the launcher image needs to be pre-pulled into the cluster when running under Docker. Does this happen only on the first run? Are you facing the issue on subsequent runs?

bmauldin commented 5 years ago

When the launcher image is pre-pulled, this issue does not show up. However, a similar issue arises for the job images (i.e. node:9 in this case). The job still hangs because the container is not started on the first run. For reference, with the launcher pre-pulled and on the very first run in a clean environment, I get the following error:

(HTTP code 404) no such container - No such image: node:9

If I now run it again, the job runs through with success. So even if the launcher is already there, on any new pipeline that uses an image that is not in the local repository, this issue will always show up.
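
As a stopgap for the behavior described above, the images can be pulled on a node before the first build. This is a minimal sketch, assuming it is run directly on each Swarm node; the helper name `ensure_images` is hypothetical and not part of Screwdriver or Docker. The untagged launcher name is copied from the error message, so add whatever tag your deployment actually uses.

```shell
#!/bin/sh
# Hypothetical helper: pull each given image only if it is missing locally.
# `docker image inspect` exits non-zero when the image is not in the local store.
ensure_images() {
    for image in "$@"; do
        if docker image inspect "$image" >/dev/null 2>&1; then
            echo "present: $image"
        else
            echo "pulling: $image"
            docker pull "$image"
        fi
    done
}

# Example call, using the images named in this thread:
# ensure_images screwdrivercd/launcher node:9
```

This only works around the problem on the node where it is run; it does not change how the executor reacts to a missing image.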

bmauldin commented 5 years ago

To explain why this is a pretty big deal for me: I am running Screwdriver in production at my company, in an HA setup. If I have x nodes in the Swarm, then every single node needs both the launcher image and every job image pre-pulled for a job to run through successfully. If I have y pipelines and z jobs (which in our case can be hundreds), then every time someone adds a pipeline with images that my x nodes have not pre-pulled, the build can fail up to x times before it succeeds. Not a massive deal, but definitely not fun to deal with and explain to end users of the system :/.
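
To illustrate the per-node cost described here, one way to pre-pull an image on every node is to loop over the node list from a manager. This is a rough sketch under several assumptions: it runs on a Swarm manager, each node's hostname is reachable over SSH, and the SSH user may run Docker on the remote side. The function name `prepull_on_all_nodes` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: pull one image on every node in the Swarm.
# Assumes SSH access to each hostname reported by `docker node ls`.
prepull_on_all_nodes() {
    image="$1"
    docker node ls --format '{{.Hostname}}' | while read -r host; do
        echo "pre-pulling $image on $host"
        # -n keeps ssh from consuming the node list on stdin.
        ssh -n "$host" docker pull "$image" || echo "pull failed on $host" >&2
    done
}

# Example call:
# prepull_on_all_nodes node:9
```

With hundreds of job images this has to be repeated for each image, which is the operational burden the comment describes.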

jithine commented 5 years ago

Docker Swarm is not something we run in our build clusters, so we would have to evaluate what is going on. We use Kubernetes as our build executor.

There is also a Nomad executor, https://github.com/lgfausak/executor-nomad, which can interface with Docker Swarm as an executor.

AkMo3 commented 3 years ago

@jithine I recently encountered this same issue. It can be solved by pulling the screwdrivercd/launcher image manually. The application shows this error message because it cannot find the image.

So, I believe you can close this issue.

jithine commented 3 years ago

@AkMo3 Thank you for checking this out. The issue is that the executor should be able to handle this without users having to pull the image manually. This would be a bug in https://github.com/screwdriver-cd/executor-docker/: it does not properly handle the absence of the launcher image.