deis / controller

Deis Workflow Controller (API)
https://deis.com
MIT License

Bad parameters and missing X-Registry-Auth: EOF #643

Closed: bacongobbler closed this issue 8 years ago

bacongobbler commented 8 years ago

This error only seems to occur with a Kubernetes Vagrant cluster, when deploying a Dockerfile app like example-dockerfile-http or using deis pull deis/example-go:

Step 8 : EXPOSE 80
 ---> Running in 16730b499180
 ---> f9e998101f51
Removing intermediate container 16730b499180
Successfully built f9e998101f51
pushing to registry
Traceback (most recent call last):
  File "/deploy.py", line 104, in <module>
    stream = client.push(registry+'/'+imageName, tag=imageTag, stream=True, insecure_registry=True)
  File "/usr/lib/python2.7/site-packages/docker/api/image.py", line 241, in push
    self._raise_for_status(response)
  File "/usr/lib/python2.7/site-packages/docker/client.py", line 146, in _raise_for_status
    raise errors.APIError(e, response, explanation=explanation)
docker.errors.APIError: 400 Client Error: Bad Request ("Bad parameters and missing X-Registry-Auth: EOF")
size of streamed logs 2287
Waiting for the deis/dockerbuild-www-8ce32ac6-158426f2 pod to end. Checking every 100ms for 15m0s
Done
remote: 2016/04/18 20:49:28 Error running git receive hook [Build pod exited with code 1, stopping build.]
Checking for builder pod exit code
To ssh://git@deis.10.245.1.3.xip.io:2222/www.git
 * [new branch]      master -> master
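The failure surfaces through docker-py's streaming push API. As a rough sketch (the helper name is hypothetical; the real logic lives in deis/dockerbuilder's deploy.py), this is how such a script can surface registry errors from the stream, including the 400 above:

```python
import json


def check_push_stream(stream):
    """Scan the line stream returned by client.push(..., stream=True).

    Each line is a JSON document; the daemon reports failures (such as
    the 400 "Bad parameters and missing X-Registry-Auth" above) under an
    "error" key rather than via HTTP status alone, so raise on it
    explicitly. Returns the list of status messages on success.
    """
    statuses = []
    for line in stream:
        chunk = json.loads(line)
        if "error" in chunk:
            raise RuntimeError(chunk["error"])
        statuses.append(chunk.get("status"))
    return statuses
```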
helgi commented 8 years ago

This belongs in https://github.com/deis/dockerbuilder, which depends on docker-py 1.7.2, though I don't see a particular fix in 1.8.0 for your issue.

bacongobbler commented 8 years ago

Why does this belong in dockerbuilder? The issue stems from deis pull as well. The error is coming from the controller.

bacongobbler commented 8 years ago

Ah, I see. This specific issue is coming from dockerbuilder's deploy.py, but I can reproduce it as well with deis pull deis/example-go, so it's two separate components hitting the same issue when pushing to the registry.

helgi commented 8 years ago

Do those containers have a .dockercfg or .docker/config.json? https://github.com/docker/docker-py/blob/81edb398ebf7ce5c7ef14aa0739de5329589aabe/docker/api/image.py#L168-L193 is the only place that handles this, and there are cases where it is fine not to have an auth config.

It still makes me think the registry got set up in auth mode, given the error comes via _raise_for_status: a direct API server error.
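For context on that header: the Docker Remote API expects X-Registry-Auth to carry a base64-encoded JSON auth config, and even an empty config still encodes to a valid header value, which is different from omitting the header entirely. A minimal sketch of that encoding (not docker-py's actual code):

```python
import base64
import json


def registry_auth_header(auth_config=None):
    """Encode an auth config the way the Docker Remote API expects:
    base64(JSON). An empty config still yields a valid header value,
    which is not the same as sending no X-Registry-Auth header at all.
    """
    payload = json.dumps(auth_config or {}).encode("utf-8")
    return base64.urlsafe_b64encode(payload).decode("ascii")
```

An empty config encodes to `e30=` (the base64 of `{}`).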

bacongobbler commented 8 years ago

No, these containers are the same deis/example-go and deis/example-dockerfile-http examples they've always been. I have a feeling it's a Kubernetes v1.2 change, as this never happened before I upgraded. Good thought, though.

helgi commented 8 years ago

Ah, I meant the dockerbuilder and controller containers, as they are the ones talking to the registry.

aabed commented 8 years ago

I have the same problem, any luck with solving it?

bacongobbler commented 8 years ago

Unfortunately no. I'll likely be looking into it as we test the beta3 release.

aabed commented 8 years ago

So for now Workflow doesn't work on Kube using Dockerfiles, right?

bacongobbler commented 8 years ago

For kube-vagrant, yes. It works on other deployment configurations like kube-aws and GKE, however. @helgi seems to think it's because Docker is set up in auth mode on vagrant, with which I tend to agree.

aabed commented 8 years ago

I have my kube cluster set up using Ansible. How can I change it to install Docker without auth mode?

bacongobbler commented 8 years ago

I am not sure if that is the issue, nor do I know how to set that up. Based on our guesses, that would be the first place to investigate.

aabed commented 8 years ago

Then what is the auth mode, exactly?

aabed commented 8 years ago

version: 0.1
log:
  fields:
    service: registry
storage:
    cache:
        blobdescriptor: inmemory
    filesystem:
        rootdirectory: /var/lib/registry
http:
    addr: :5000
    headers:
        X-Content-Type-Options: [nosniff]
health:
  storagedriver:
    enabled: true
    interval: 10s
    threshold: 3

No auth is there
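
For comparison, a registry that actually ran in auth mode would carry an auth section in this same config; a hypothetical htpasswd-based example (the realm and path below are illustrative, not from this cluster) would look like:

```yaml
auth:
  htpasswd:
    realm: basic-realm
    path: /etc/docker/registry/htpasswd
```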

bacongobbler commented 8 years ago

I'm going to test https://github.com/docker/docker-py/pull/1038 and see if this fixes the issue.

aabed commented 8 years ago

@bacongobbler Nice, would you please update the issue with your test results?

bacongobbler commented 8 years ago

Patching the controller with docker/docker-py#1038 worked for deis pull, so it's likely that docker introduced a regression for pushes where the daemon now expects an X-Registry-Auth header even when it's empty, while docker-py just doesn't send it at all. I'll have to do more digging, but that patch worked for me. We'll have to patch both deis/dockerbuilder and deis/controller with this fix if it is indeed the root issue.

aabed commented 8 years ago

Actually, I checked an open issue on docker yesterday https://github.com/docker/docker/issues/10983#issuecomment-85892396 and was going to patch docker-py, but the fix is already there. So should we wait until they merge it?

bacongobbler commented 8 years ago

I'd wait. My debugging still doesn't explain why it only fails on vagrant. On GKE and AWS it's fine without the patch. There's something else at play here...

aabed commented 8 years ago

Are we sure it pushes to deis registry, not AWS or GCE?

bacongobbler commented 8 years ago

> Are we sure it pushes to deis registry, not AWS or GCE?

Yes. I'm absolutely sure about that. We always push to our internal registry which is backed by GCE or S3. We never push to the storage backend directly.

The issue here is that Kubernetes ships its Vagrant provider on Fedora 23 with Docker 1.9.1. Fedora uses a fork of Docker which contains this bug, and it has been fixed in their fork... in 1.10.3. Essentially this will continue to be an issue until Kubernetes bumps up to 1.10.3 or 1.11, as per https://github.com/kubernetes/kubernetes/issues/19720 and https://github.com/kubernetes/kubernetes/issues/23397.
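To make the affected range concrete, the claim above reduces to a simple comparison against 1.10.3 (a sketch; it assumes plain dotted version strings like those reported by `docker version`, without Fedora release suffixes):

```python
def _as_tuple(version):
    """Split a dotted version string into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def fedora_fork_affected(server_version):
    """True if this Docker server version predates 1.10.3, where
    Fedora's fork fixed the X-Registry-Auth push bug."""
    return _as_tuple(server_version) < _as_tuple("1.10.3")
```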

bacongobbler commented 8 years ago

Making the following change on kubernetes 1.2.4 should fix this issue (EDIT: it does not work, because Fedora does not define package versions):

><> git diff cluster/saltbase/
diff --git a/cluster/saltbase/salt/docker/init.sls b/cluster/saltbase/salt/docker/init.sls
index d7291e3..112f318 100644
--- a/cluster/saltbase/salt/docker/init.sls
+++ b/cluster/saltbase/salt/docker/init.sls
@@ -23,6 +23,7 @@ bridge-utils:
 docker:
   pkg:
     - installed
+    - version: 1.9.*
   service.running:
     - enable: True
     - require:

Make sure to run make clean && make quick-release before re-provisioning your vagrant cluster.

I've also got a branch that almost provisions a debian cluster. I've just been tinkering with it in my spare time: https://github.com/kubernetes/kubernetes/compare/v1.2.4...bacongobbler:vagrant-debian

Usage is make quick-release && KUBERNETES_OS=debian KUBERNETES_PROVIDER=vagrant ./cluster/kube-up.sh

helgi commented 8 years ago

Looks like they fixed the package upstream and closed the docker-py ticket, so I'm going to close this.

Reopen if this is still a problem.