docker / cli

The Docker CLI
Apache License 2.0

Slow docker commands with: credsStore and credHelpers #1591

Open pecigonzalo opened 5 years ago

pecigonzalo commented 5 years ago

Description

Using `credHelpers` together with `credsStore` (for example, with the `ecr-login` helper from AWS) causes a big slowdown in `docker build` and other commands, because the CLI tries to log in to the registry every time, even when credentials for that helper are not available.

Steps to reproduce the issue:

  1. Configure `~/.docker/config.json`, e.g.:

         {
             "credsStore": "secretservice",
             "credHelpers": {
                 "123456.dkr.ecr.REGION.amazonaws.com": "ecr-login"
             }
         }

  2. Ensure you have no AWS credentials (we use aws-vault to load creds only when needed).
  3. Set up a `Dockerfile` that DOES NOT use the registry, e.g.:

         FROM openjdk:8-jre-slim
         ENV JAVA_OPTS -XX:+CMSClassUnloadingEnabled -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XshowSettings:vm
         WORKDIR /app

  4. Run a `docker build` against said Dockerfile.
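To compare configurations repeatably, the timing in step 4 can be wrapped in a small script. This is only a sketch: `time_command` is a hypothetical helper name, and the demo line times the Python interpreter itself so that it runs without Docker installed.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (exit_code, elapsed_seconds) measured on a wall clock."""
    start = time.perf_counter()
    completed = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return completed.returncode, time.perf_counter() - start


# Demo with a trivial command; swap in the build command to time a real run.
code, elapsed = time_command([sys.executable, "-c", "pass"])
print(f"exit code {code}, {elapsed:.3f}s")
```

To time the build from the steps above, call `time_command(["docker", "build", "-t", "test", "."])` once per `config.json` variant and compare the elapsed values.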

**Describe the results you received:**

The build runs but only after `ecr-login` times out because there are no creds for it. I can confirm this is the case from the `ecr-login` helper logs:

    time="2018-12-28T17:11:39+01:00" level=debug msg="Could not fetch credentials for cache prefix, disabling cache" error="NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"
    time="2018-12-28T17:11:39+01:00" level=debug msg="Retrieving credentials" region=eu-central-1 registry=123123 serverURL=123123.dkr.ecr.eu-central-1.amazonaws.com
    time="2018-12-28T17:11:39+01:00" level=debug msg="Calling ECR.GetAuthorizationToken" registry=123123
    time="2018-12-28T17:12:00+01:00" level=error msg="Error retrieving credentials" error="ecr: Failed to get authorization token: NoCredentialProviders: no valid providers in chain. Deprecated.\n\tFor verbose messaging see aws.Config.CredentialsChainVerboseErrors"
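The stall can be read straight off the timestamps in the helper log. The sketch below parses the `time=` fields of two lines copied from that log (their trailing fields are trimmed for brevity) and computes the gap between the `GetAuthorizationToken` call and the error. The function names are illustrative only.

```python
import re
from datetime import datetime

# Two lines taken from the helper log above, with trailing fields trimmed.
LOG = '''\
time="2018-12-28T17:11:39+01:00" level=debug msg="Calling ECR.GetAuthorizationToken" registry=123123
time="2018-12-28T17:12:00+01:00" level=error msg="Error retrieving credentials"
'''

TIME_FIELD = re.compile(r'time="([^"]+)"')


def timestamps(log_text):
    """Return every time= field in the text as a timezone-aware datetime."""
    return [datetime.fromisoformat(m.group(1)) for m in TIME_FIELD.finditer(log_text)]


def stall_seconds(log_text):
    """Seconds elapsed between the first and the last timestamp in the excerpt."""
    stamps = timestamps(log_text)
    return (stamps[-1] - stamps[0]).total_seconds()


print(stall_seconds(LOG))  # 21.0
```

So a single failed helper invocation accounts for about 21 seconds of waiting in this log.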


Example run **with** the `secretservice` configuration above (or with the `credsStore` line removed):

    ❯ time docker -D build -t test .
    Sending build context to Docker daemon 2.048kB
    Step 1/4 : FROM openjdk:8-jre-slim
     ---> 30bf70ee624c
    Step 2/4 : ENV DEBIAN_FRONTEND noninteractiveback
     ---> Using cache
     ---> cc55395868a9
    Step 3/4 : ENV JAVA_OPTS -XX:+CMSClassUnloadingEnabled -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XshowSettings:vm
     ---> Using cache
     ---> a9c8de304b1f
    Step 4/4 : WORKDIR /app
     ---> Using cache
     ---> a45bdd081e42
    Successfully built a45bdd081e42
    Successfully tagged test:latest
    docker -D build -t test . 0.05s user 0.04s system 0% cpu 40.856 total

Example run **with** `"credsStore": "pass"` (or with the `credHelpers` section removed):

    ❯ time docker -D build -t test .
    Sending build context to Docker daemon 2.048kB
    Step 1/4 : FROM openjdk:8-jre-slim
     ---> 30bf70ee624c
    Step 2/4 : ENV DEBIAN_FRONTEND noninteractiveback
     ---> Using cache
     ---> cc55395868a9
    Step 3/4 : ENV JAVA_OPTS -XX:+CMSClassUnloadingEnabled -XX:+UnlockExperimentalVMOptions -XX:+UseCGroupMemoryLimitForHeap -XshowSettings:vm
     ---> Using cache
     ---> a9c8de304b1f
    Step 4/4 : WORKDIR /app
     ---> Using cache
     ---> a45bdd081e42
    Successfully built a45bdd081e42
    Successfully tagged test:latest
    docker -D build -t test . 0.01s user 0.05s system 37% cpu 0.167 total


As you can see, both builds ran entirely from cache; the wait time is spent contacting the registries.

PS: I tried using `-D` and `-l "debug"` to get more verbose output, but I can't find any additional logs.

**Describe the results you expected:**

Nothing in the build uses the registry specified under `credHelpers`, so according to this code: https://github.com/docker/cli/blob/00e684311892ad63a8367066c6df4a5aa642o6507/cli/config/configfile/file.go#L289
it should not even try to execute the `ecr-login` helper. It should only use the default store or the one specified in `credsStore`; at least, that is my understanding from the docs linked below.
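The expected lookup can be sketched as follows. This is an illustration of the behaviour described above, not the CLI's actual Go code: `helper_for` is a hypothetical name, and the precedence (a `credHelpers` entry for the registry first, otherwise `credsStore`) follows the reporter's reading of the docs.

```python
import json

# The config from the reproduction steps.
CONFIG = json.loads("""
{
  "credsStore": "secretservice",
  "credHelpers": {
    "123456.dkr.ecr.REGION.amazonaws.com": "ecr-login"
  }
}
""")


def helper_for(config, registry):
    """Pick the helper for one registry: credHelpers entry first, else credsStore.

    Returns None when neither is configured.
    """
    per_registry = config.get("credHelpers", {})
    if registry in per_registry:
        return per_registry[registry]
    return config.get("credsStore") or None


print(helper_for(CONFIG, "123456.dkr.ecr.REGION.amazonaws.com"))  # ecr-login
print(helper_for(CONFIG, "https://index.docker.io/v1/"))          # secretservice
```

Under this reading, a build that only pulls from Docker Hub would never need to run `ecr-login`.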

Additionally, if I set `"credsStore": "pass",` it works as expected, which is rather odd, as `pass` is not even installed on my computer (neither the helper nor the CLI) and, as expected, `docker login` fails. Although this might be a different bug.

I believe the culprit is https://github.com/docker/cli/blob/ea836abed5ba9c62c3d4444ea2a6bbf9b486ef1a/cli/command/image/build.go#L386, which calls `GetAllCredentials` instead of fetching only the credentials the build requires.
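The difference between "all credentials" and "only what the build needs" can be illustrated with a deliberately simplified sketch. Everything here is hypothetical (`registry_of`, `registries_needed`, `helpers_to_call`) and is not how the CLI works today; it ignores `ARG` substitution in `FROM` lines and other Dockerfile features, and it assumes an image with no registry host comes from `docker.io`.

```python
def registry_of(image_ref):
    """Return the registry host of an image reference, or "docker.io" if none is given."""
    first, slash, _ = image_ref.partition("/")
    # A leading component counts as a host only if it looks like one.
    if slash and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def registries_needed(dockerfile_text):
    """Collect registries named by FROM lines, skipping scratch and earlier build stages."""
    stages, needed = set(), set()
    for line in dockerfile_text.splitlines():
        words = line.split()
        if not words or words[0].upper() != "FROM":
            continue
        args = [w for w in words[1:] if not w.startswith("--")]  # drop flags such as --platform
        if not args:
            continue
        image = args[0]
        if image != "scratch" and image not in stages:
            needed.add(registry_of(image))
        if len(args) >= 3 and args[1].upper() == "AS":
            stages.add(args[2])
    return needed


def helpers_to_call(cred_helpers, needed):
    """Keep only the per-registry helpers whose registry the build references."""
    return {reg: helper for reg, helper in cred_helpers.items() if reg in needed}


DOCKERFILE = "FROM openjdk:8-jre-slim\nWORKDIR /app\n"
CRED_HELPERS = {"123456.dkr.ecr.REGION.amazonaws.com": "ecr-login"}

needed = registries_needed(DOCKERFILE)
print(needed)                                 # {'docker.io'}
print(helpers_to_call(CRED_HELPERS, needed))  # {}
```

For the Dockerfile in this report, the filtered set is empty, so `ecr-login` would never be invoked and the timeout would not occur.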

**Additional information you deem important (e.g. issue happens only occasionally):**

https://github.com/docker/cli/blob/254bcd27661e67707ac61f9532950e8ddec618a8/docs/reference/commandline/login.md

**Output of `docker version`:**

    ❯ docker version
    Client:
     Version:           18.09.0-ce
     API version:       1.39
     Go version:        go1.11.2
     Git commit:        4d60db472b
     Built:             Fri Nov 9 00:05:34 2018
     OS/Arch:           linux/amd64
     Experimental:      false

    Server:
     Engine:
      Version:          18.09.0-ce
      API version:      1.39 (minimum version 1.12)
      Go version:       go1.11.2
      Git commit:       4d60db472b
      Built:            Fri Nov 9 00:05:11 2018
      OS/Arch:          linux/amd64
      Experimental:     false


**Output of `docker info`:**

    ❯ docker info
    Containers: 0
     Running: 0
     Paused: 0
     Stopped: 0
    Images: 17
    Server Version: 18.09.0-ce
    Storage Driver: overlay2
     Backing Filesystem: extfs
     Supports d_type: true
     Native Overlay Diff: false
    Logging Driver: json-file
    Cgroup Driver: cgroupfs
    Plugins:
     Volume: local
     Network: bridge host macvlan null overlay
     Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
    Swarm: inactive
    Runtimes: runc
    Default Runtime: runc
    Init Binary: docker-init
    containerd version: 9f2e07b1fc1342d1c48fe4d7bbb94cb6d1bf278b.m
    runc version: 079817cc26ec5292ac375bb9f47f373d33574949
    init version: fec3683
    Security Options:
     seccomp
      Profile: default
     userns
    Kernel Version: 4.19.12-zen1-1-zen
    Operating System: Arch Linux
    OSType: linux
    Architecture: x86_64
    CPUs: 8
    Total Memory: 15.39GiB
    Name: smallfish
    ID: 6ISZ:WJFT:HQSZ:FXNB:EONI:RBIR:GKS4:5GWK:W4RY:WUJW:CCIT:5NPH
    Docker Root Dir: /var/lib/docker/1000.1000
    Debug Mode (client): false
    Debug Mode (server): false
    Registry: https://index.docker.io/v1/
    Labels:
    Experimental: false
    Insecure Registries:
     127.0.0.0/8
    Live Restore Enabled: false



**Additional environment details (AWS, VirtualBox, physical, etc.):**
- Physical
- ECR Registry

malachantrio commented 4 years ago

One thing to mention that I've noticed....

We also use aws-vault to switch between accounts (our config logs us in to one account and then switches role into test/prod). This means we get a new AWS_ACCESS_KEY_ID each time we run aws-vault, so every time it logs in to ECR, the cache key that is generated will never be used again. Looking at the code, the cache key seems to be

"--"

So unless your credentials are always available and you aren't assuming a role, you won't ever hit the cache. I think that, as is, the credential helper wasn't really designed with anything other than static credentials in mind. I believe there are other issues raised regarding MFA that probably fall under a similar area, where certain authentication methods for AWS aren't 100% catered for.
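A toy model makes the miss pattern concrete. The key format below is an assumption for illustration only (the helper's real format is not reproduced here); the only property that matters is that the key changes whenever the access key ID changes. `cache_key`, `TokenCache`, and the access key IDs are all made up, and token expiry is ignored.

```python
import hashlib


def cache_key(region, registry_id, access_key_id):
    """Hypothetical cache key that depends on the access key ID."""
    digest = hashlib.sha256(access_key_id.encode()).hexdigest()[:12]
    return f"{region}-{registry_id}-{digest}"


class TokenCache:
    """In-memory token cache that counts hits and misses."""

    def __init__(self):
        self._tokens = {}
        self.hits = 0
        self.misses = 0

    def get_token(self, region, registry_id, access_key_id, fetch):
        key = cache_key(region, registry_id, access_key_id)
        if key in self._tokens:
            self.hits += 1
        else:
            self.misses += 1
            self._tokens[key] = fetch()
        return self._tokens[key]


# Each aws-vault session is modelled as a fresh, made-up access key ID.
rotating = TokenCache()
for session_key in ["ASIAEXAMPLE0001", "ASIAEXAMPLE0002", "ASIAEXAMPLE0003"]:
    rotating.get_token("eu-central-1", "123123", session_key, fetch=lambda: "token")
print(rotating.hits, rotating.misses)  # 0 3

# Static credentials produce the same key every time, so later calls hit the cache.
static = TokenCache()
for _ in range(3):
    static.get_token("eu-central-1", "123123", "AKIAEXAMPLESTATIC", fetch=lambda: "token")
print(static.hits, static.misses)  # 2 1
```

With rotating session credentials every lookup is a miss, which matches the behaviour described above.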

malachantrio commented 4 years ago

I guess it's a quirk of how ECR and AWS work. The cache needs to be keyed on your credentials, not on you per se. For example, if the cache were keyed only on region and registry ID, you could initially log in with read credentials and have that token cached. If you then wanted to switch to higher-privileged write credentials, you would be locked out from switching until the cache entry expired (obviously, deleting the entries from the cache would work).
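The lock-out scenario can be shown with a toy cache keyed only on region and registry ID. All names here are hypothetical, and the "token" is just a string that records which credentials fetched it.

```python
def key_without_credentials(region, registry_id):
    """Hypothetical cache key that ignores which credentials were used."""
    return f"{region}-{registry_id}"


cache = {}


def get_token(region, registry_id, credentials, fetch):
    """Return the cached token for the registry, fetching only on a miss."""
    key = key_without_credentials(region, registry_id)
    if key not in cache:
        cache[key] = fetch(credentials)
    return cache[key]


def fake_fetch(credentials):
    """Stand-in for the ECR call; the token records who requested it."""
    return f"token-for-{credentials}"


first = get_token("eu-central-1", "123123", "read-only", fake_fetch)
second = get_token("eu-central-1", "123123", "read-write", fake_fetch)
print(first)   # token-for-read-only
print(second)  # token-for-read-only
```

The second call asks with write credentials but still receives the read-only token, which is why the key has to include the credentials.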

In summary, getting an auth token from ECR is not enough. You need an auth token and the same AWS credentials the token was generated with to be granted access via the cache. I imagine it would need a fairly major rethink to require the credentials only when the token is generated.