Closed Maxim-Filimonov closed 9 years ago
Ah, that's been fixed in the newest agent.
I've just triggered a build on docker hub: https://registry.hub.docker.com/builds/43047/buildkite/docker-buildkite-agent/builds/bbyvpy8o2zezfmwaod3cfkn/
Let me know if that build fixes it for you. It should install the latest from apt.
$ docker pull buildkite/agent
Pulling repository buildkite/agent
9a56c031529b: Download complete
511136ea3c5a: Download complete
fa4fd76b09ce: Download complete
1c8294cc5160: Download complete
117ee323aaa9: Download complete
2d24f826cb16: Download complete
a5d75d881ba0: Download complete
65a4c2d3ed7c: Download complete
4203cb4966e1: Download complete
d82bdf6a5667: Download complete
547487222ec8: Download complete
cd65d1584e2b: Download complete
74c35b3f8c9f: Download complete
f850860f833c: Download complete
37943d54e6c2: Download complete
Status: Downloaded newer image for buildkite/agent:latest
$ docker run -it --rm buildkite/agent buildkite-agent --version
buildkite-agent version 1.0-beta.16.356
Looking good! I've also made it so that new agent releases from the Buildkite pipeline are published to Docker automatically, in https://github.com/buildkite/agent/commit/8b6fb8edca1753bbbabe30f3b715895891d83255
Hm now it's just stuck :(
Pulling image mongo:2.6.4...
5a7d9470be44: Extracting [==========================================> ] 76.32 MB/90.29 MB 0
feb755848a9a: Download complete
1627d948a24f: Download complete
fa822d72bf37: Download complete
49b4737cc97a: Download complete
eb6f031b83d2: Download complete
f1adb48b243c: Download complete
f50aefc52a66: Downloading [> ] 1.584 MB/289.4 MB 12m8s
25602f75729f: Download complete
3c3845bec813: Download complete
004493629d34: Download complete
c6653268ebba: Download complete
3fd4d8f5b76b: Download complete
3fd4d8f5b76b: Pulling fs layer
511136ea3c5a: Already exists
It has been like that for the last 42 min. The Mongo image is only 300 MB; I doubt it takes that long to download. I think the agent is stuck again :(
Started a new agent and it worked. Looks like when the image is too big the agent dies somewhere in the middle of the process :\
Hmm, how'd you start the agent? I saw that when I started it with run -it, but without that it worked okay…
@dekz have you seen agents going awol whilst downloading large amounts of fs layers?
Well I've certainly seen docker take its time to download a layer, whether that is an issue of the docker daemon or of the registry itself, I am not sure. It certainly isn't a frequent occurrence that I have noticed.
That being said, we do run our own docker registry, @Maxim-Filimonov is that just pulling from the Official Docker Registry?
@dekz yep and if i pull the image outside of buildkite-agent it works ;\
That's quite strange. I have had issues before where the client initiated the Docker Pull and then bailed out, I reinitiated the Pull to which the Docker Daemon told me it was already pulling, then it hung.
We are using fig though, but this seems like it would be more of a Daemon issue to me. Might be worth just making sure that it's not waiting on free space to write the FS layers to disk.
@dekz So what I did was shut down the buildkite agent, which should not affect docker pull because the downloading is done by the docker service running on the host machine. Then I tried
docker pull mongo
and it pulled the mongo image with all layers being cached, meaning it was pulled successfully, but for some reason either docker-compose or buildkite did not wait to get the response back. I'll try to investigate further.
Ok, after further investigation I think that it's definitely buildkite-agent doing something weird. I did this to test it on the buildkite-agent container:
docker rmi $(docker images | grep mongo | awk '{print $1}')
docker exec -it <BUILD_KITE_AGENT_CONTAINER_ID> /bin/bash
cd <PROJECT_DIR>
See logs for details. I removed the mongo image again and left it to buildkite-agent this time, and it got stuck on pulling the image again. Replicated 3 times in a row already. The buildkite-agent process in the container gets completely stuck. I can't even kill it with -9; I need to kill the whole container.
Could you try running the agent with --no-pty and see if the agent gets stuck again?
@keithpitt That works ! :+1:
Okay… we need to add this to the docker image!
@toolmantim I wouldn't. This is what our test logs look like now:
docker-compose -p buildkitec5ac39ac09a74fbfab8243c5e830de7b run app ./run_tests.sh
Creating buildkitec5ac39ac09a74fbfab8243c5e830de7b_redis_1...
Pulling image redis:2.8.17...
redis:2.8.17: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
Status: Downloaded newer image for redis:2.8.17
Creating buildkitec5ac39ac09a74fbfab8243c5e830de7b_mongo_1...
Pulling image mongo:2.6.4...
mongo:2.6.4: The image you are pulling has been verified. Important: image verification is a tech preview feature and should not be relied on to provide security.
Status: Downloaded newer image for mongo:2.6.4
So it works but the logs are useless now :\
So it sounds like there's something weird going on with the PTY. The PTY is what does all the progress bars, and I think Docker only outputs them if it's on.
Have you tried running the agent with --debug (with PTY on) to see if there's anything in the logs when it dies? Maybe there's an error that'll help us figure out the problem. Sorry about this problem, I'm keen to get it fixed for you ASAP!
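As a rough illustration of why the progress bars disappear with the PTY off: many command-line tools check whether their output is attached to a terminal and only redraw progress in place when it is. The Go sketch below shows one common heuristic for that check. It is not Docker's or the agent's actual code; `looksLikeTerminal` and `progressLine` are hypothetical names, and the output format is made up for the example.

```go
package main

import (
	"fmt"
	"os"
)

// looksLikeTerminal reports whether f is a character device. This is only a
// heuristic: /dev/null is also a character device, so real tools usually
// use a stricter, platform-specific check.
func looksLikeTerminal(f *os.File) bool {
	fi, err := f.Stat()
	if err != nil {
		return false
	}
	return fi.Mode()&os.ModeCharDevice != 0
}

// progressLine returns what a (hypothetical) downloader would print for one
// progress update.
func progressLine(done, total int, tty bool) string {
	if tty {
		// "\r" moves the cursor back to column 0 so the next update
		// overwrites this one instead of starting a new line.
		return fmt.Sprintf("\rDownloading %d/%d MB", done, total)
	}
	if done == total {
		// Without a terminal, print a single plain line at the end.
		return fmt.Sprintf("Download complete (%d MB)\n", total)
	}
	return "" // stay quiet until the download finishes
}

func main() {
	tty := looksLikeTerminal(os.Stdout)
	for done := 0; done <= 3; done++ {
		fmt.Print(progressLine(done, 3, tty))
	}
	if tty {
		fmt.Println() // finish the in-place progress line
	}
}
```

Running this directly in a terminal and then again piped through `cat` should show the two output styles side by side.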
Just tried with the latest version; this is still happening :\
I believe this should be good with the latest dockers!
I've noticed that our agent dies when trying to run docker-compose run on docker.
Host system: CoreOS beta (607.0.0)
Docker: 1.5
Buildkite agent: buildkite-agent v1.0-beta.13.328
errors.errorString: bufio.Scanner: token too long
I've tried to rerun the build but it seems like this busts agent log output to buildkite completely :(
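For context on the error text above: `bufio.Scanner: token too long` is the message of Go's `bufio.ErrTooLong`, which a `bufio.Scanner` returns when a single token (by default, a line) exceeds its buffer limit of 64 KiB. One plausible reading, not confirmed anywhere in this thread, is that progress-bar output redrawn with carriage returns arrives as one very long "line" with no newline. The sketch below reproduces the error under that assumption and shows that raising the limit with `Scanner.Buffer` (Go 1.6+) avoids it. `countLines` and `progressOutput` are hypothetical helpers, not the agent's code.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// countLines scans input line by line and returns how many lines were read.
// maxToken is the largest line the scanner will accept; 0 keeps the default
// limit (bufio.MaxScanTokenSize, 64 KiB).
func countLines(input string, maxToken int) (int, error) {
	sc := bufio.NewScanner(strings.NewReader(input))
	if maxToken > 0 {
		sc.Buffer(make([]byte, 0, 64*1024), maxToken)
	}
	n := 0
	for sc.Scan() {
		n++
	}
	return n, sc.Err()
}

// progressOutput imitates (as an assumption) progress-bar output: many
// updates separated by "\r", with a single "\n" only at the very end.
func progressOutput(updates int) string {
	update := "Downloading [=====>      ] 1.584 MB/289.4 MB\r"
	return strings.Repeat(update, updates) + "\n"
}

func main() {
	out := progressOutput(5000) // well over 64 KiB before the first newline

	// The default scanner gives up once the line outgrows its buffer.
	_, err := countLines(out, 0)
	fmt.Println("default limit hit ErrTooLong:", errors.Is(err, bufio.ErrTooLong))

	// With a 1 MiB limit the same input is read as one (long) line.
	n, err := countLines(out, 1024*1024)
	fmt.Println("1 MiB limit:", n, "line(s), err =", err)
}
```

A larger buffer only moves the limit; splitting on both `\r` and `\n` with a custom `bufio.SplitFunc` would be another option for unbounded progress output.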