d2iq-archive / marathon

Deploy and manage containers (including Docker) on top of Apache Mesos at scale.
https://mesosphere.github.io/marathon/
Apache License 2.0

marathon cannot pull docker image from dockerhub #3869

Closed anhcuong closed 8 years ago

anhcuong commented 8 years ago

Hi guys,

I am trying to run Chronos using the Chronos Docker image from Docker Hub. The Marathon configuration is simple, and I don't think it is the problem.

However, I noticed that the Docker image cannot be pulled when I run the application from Marathon, even though a manual docker pull works fine.

Below is the docker log I receive when running from marathon:

time="2016-05-06T09:54:29.507847570Z" level=error msg="Handler for GET /v1.23/containers/mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404/json returned error: No such container: mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404"
time="2016-05-06T09:54:29.641935249Z" level=error msg="Handler for GET /v1.23/images/mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404/json returned error: No such image: mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404"
time="2016-05-06T09:54:29.865007222Z" level=info msg="Layer sha256:fd0e26195ab2d3543ed33098e068c617dbe49985108fb4fbb77032abecb59755 cleaned up"
time="2016-05-06T09:54:34.475431152Z" level=error msg="Handler for GET /v1.23/containers/mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404/json returned error: No such container: mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404"
time="2016-05-06T09:54:34.476089005Z" level=error msg="Handler for GET /v1.23/images/mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404/json returned error: No such image: mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404"
time="2016-05-06T09:55:30.260899011Z" level=info msg="Pull session cancelled"
time="2016-05-06T09:55:30.261186881Z" level=error msg="Error trying v2 registry: context canceled"
time="2016-05-06T09:55:30.261285274Z" level=error msg="Not continuing with pull after error: context canceled"
time="2016-05-06T09:55:34.553478416Z" level=info msg="Pull session cancelled"
time="2016-05-06T09:55:34.553706265Z" level=error msg="Error trying v2 registry: context canceled"
time="2016-05-06T09:55:34.553792796Z" level=error msg="Not continuing with pull after error: context canceled"
time="2016-05-06T09:55:35.930738808Z" level=error msg="Handler for GET /v1.23/containers/mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404/json returned error: No such container: mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404"
time="2016-05-06T09:55:35.931347065Z" level=error msg="Handler for GET /v1.23/images/mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404/json returned error: No such image: mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404"
time="2016-05-06T09:56:35.933766258Z" level=info msg="Pull session cancelled"
time="2016-05-06T09:56:35.934050301Z" level=error msg="Error trying v2 registry: context canceled"
time="2016-05-06T09:56:35.934120482Z" level=error msg="Not continuing with pull after error: context canceled"
time="2016-05-06T10:06:26.298894456Z" level=error msg="Handler for GET /v1.23/containers/mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404/json returned error: No such container: mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404"
time="2016-05-06T10:06:26.299536046Z" level=error msg="Handler for GET /v1.23/images/mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404/json returned error: No such image: mesosphere/chronos:chronos-2.5.0-0.1.20160223054243.ubuntu1404-mesos-0.27.1-2.0.226.ubuntu1404"
time="2016-05-06T10:07:26.422146852Z" level=info msg="Pull session cancelled"
time="2016-05-06T10:07:26.422397896Z" level=error msg="Error trying v2 registry: context canceled"
time="2016-05-06T10:07:26.422466019Z" level=error msg="Not continuing with pull after error: context canceled"
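
The timing in the log excerpt above can be checked with a short script. This is only a sketch: the log lines are abbreviated copies (the long image tag is replaced by `<tag>`), and it assumes that each "No such image" line marks the start of a pull and that cancellations arrive in the same order as the pulls started. Under those assumptions, every pull is cancelled about 60 seconds after it begins.

```python
import re
from collections import deque
from datetime import datetime

# Abbreviated copies of lines from the log above; "<tag>" stands in for the full image tag.
LOG = '''\
time="2016-05-06T09:54:29.641935249Z" level=error msg="No such image: mesosphere/chronos:<tag>"
time="2016-05-06T09:54:34.476089005Z" level=error msg="No such image: mesosphere/chronos:<tag>"
time="2016-05-06T09:55:30.260899011Z" level=info msg="Pull session cancelled"
time="2016-05-06T09:55:34.553478416Z" level=info msg="Pull session cancelled"
time="2016-05-06T09:55:35.931347065Z" level=error msg="No such image: mesosphere/chronos:<tag>"
time="2016-05-06T09:56:35.933766258Z" level=info msg="Pull session cancelled"
time="2016-05-06T10:06:26.299536046Z" level=error msg="No such image: mesosphere/chronos:<tag>"
time="2016-05-06T10:07:26.422146852Z" level=info msg="Pull session cancelled"
'''

# Keep whole seconds only; strptime's %f cannot parse nanosecond precision.
TIMESTAMP = re.compile(r'time="(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})')


def pull_gaps(log):
    """Return seconds between each assumed pull start and its cancellation."""
    pending = deque()  # start times of pulls not yet cancelled (oldest first)
    gaps = []
    for line in log.splitlines():
        match = TIMESTAMP.match(line)
        if not match:
            continue
        ts = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
        if "No such image" in line:
            pending.append(ts)
        elif "Pull session cancelled" in line and pending:
            gaps.append(int((ts - pending.popleft()).total_seconds()))
    return gaps


print(pull_gaps(LOG))  # prints [61, 60, 60, 60]
```

A steady gap of about one minute suggests a timeout rather than a registry failure. One candidate, not confirmed anywhere in this thread, is the Mesos agent's `--executor_registration_timeout`, which defaults to one minute and which the Marathon Docker documentation recommends raising so that large images have time to download.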

The funny thing is that Marathon does eventually pull the image after a while (sometimes 30 minutes, sometimes an hour; I am not sure exactly how long).

This has happened a few times when I tried to pull a new Docker image from my private Docker registry and from AWS ECR, so I blamed the registries. But today I tried Docker Hub and it happened there too.

PS: I am running Mesos 0.28.1-2.0.20.ubuntu1404 and Marathon v1.1.1, both from images on Docker Hub.

gkleiman commented 8 years ago

The container image is fetched by the Mesos Agent, so please report this on the Mesos JIRA.

pvcon13 commented 7 years ago

Where can I get help with this issue, please?

Summary: Marathon is asked to launch a Docker image. Credentials are properly supplied (the uris field targets an external API that supplies the credentials and/or looks up a local file). The Mesos Agent downloads the credentials and stores them in the sandbox.

If the Docker Hub image is private and new to Marathon, the launch errors out. The same happens if the image was previously cached and forceload=true is set on the Marathon API call. The error is that the Mesos Agent cannot find the Docker image when it is private. The credentials are what docker login creates in HOME/.docker after SSHing to the Marathon host.
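
For reference, the kind of app definition described above can be sketched as follows. This is an illustration under assumptions, not the poster's actual configuration: the app id, image name, and credentials URI are hypothetical, and "forceload" is assumed to mean Marathon's `forcePullImage` field. The `uris` entry follows the pattern in Marathon's private registry documentation, where a tarball containing `.docker/config.json` is fetched into the task sandbox.

```python
import json


def private_image_app(app_id, image, creds_uri, force_pull=True):
    """Build a Marathon app definition that pulls a private Docker image."""
    return {
        "id": app_id,
        "cpus": 0.5,
        "mem": 512,
        "instances": 1,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                # Assumed to be what the post calls "forceload".
                "forcePullImage": force_pull,
            },
        },
        # The Mesos fetcher downloads this archive into the sandbox before launch.
        "uris": [creds_uri],
    }


# Hypothetical values for illustration only.
app = private_image_app(
    "/private-app",
    "example-org/private-image:1.0",
    "file:///etc/docker.tar.gz",
)
print(json.dumps(app, indent=2))
```

The resulting JSON would be sent to Marathon's `/v2/apps` endpoint. Comparing it with the definition actually submitted is a quick way to rule out a malformed `uris` or `container` section before looking at the agent.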

Marathon Version 1.1.1 (as deployed by Rancher catalog, Rancher v 1.41)
Docker Version 1.12.6
Mesos Version 0.28.1 (as deployed by Rancher catalog, Rancher v 1.41)
Ubuntu Version 16.04.01 LTS

Where can I get help for this? It's an issue between Docker Hub and the Mesos Agent. I believe I am handling the credentials in Marathon properly, and I need help troubleshooting. The detailed thread is at this link:

https://groups.google.com/forum/#!topic/marathon-framework/pKSPbEQmuyo

Thank you :)