docker / cli

The Docker CLI
Apache License 2.0

docker image push retry mechanism #5275

Open madhu2852 opened 1 month ago

madhu2852 commented 1 month ago

Description

A couple of months ago, I observed that Docker had a built-in retry mechanism for docker push operations, which was beneficial in handling intermittent network issues or transient errors during image pushes. However, recently I noticed that this functionality seems to be either altered or unavailable.

Could you please provide clarification on the following points:

- Has there been any recent update or change to the retry mechanism in Docker image pushing operations?
- If the retry mechanism has been modified or removed, what is the reason behind this change?
- Are there any alternative recommendations or best practices for managing retry behavior during Docker image pushes?

I appreciate any insights or information you can provide on this matter. If possible, I would also like to understand if there are plans to reintroduce or enhance retry capabilities in future Docker releases.

Reproduce

docker push : a

Expected behavior

docker should retry 5 times before exiting with an error.

docker version

docker version
Client:
 Version:           27.0.3
 API version:       1.46
 Go version:        go1.21.11
 Git commit:        7d4bcd8
 Built:             Fri Jun 28 23:59:41 2024
 OS/Arch:           darwin/amd64
 Context:           desktop-linux

Server: Docker Desktop 4.32.0 (157355)
 Engine:
  Version:          27.0.3
  API version:      1.46 (minimum version 1.24)
  Go version:       go1.21.11
  Git commit:       662f78c
  Built:            Sat Jun 29 00:02:50 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.18
  GitCommit:        ae71819c4f5e67bb4d5ae76a6b735f29cc25774e
 runc:
  Version:          1.1.13
  GitCommit:        v1.1.13-0-g58aa920
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

docker info

N/A

Additional Info

thaJeztah commented 1 month ago

Can you post the output of docker info as well? Do you have the containerd image store enabled?

(Also note that the retry happens on the daemon side, so probably better to report in https://github.com/moby/moby/issues)

That said, I'd have to check on the retry mechanism; I am aware of a retry for pulling, but I'm not sure about pushing (it's possible that it's not configurable). From the dockerd daemon flags:

      --max-concurrent-downloads int            Set the max concurrent downloads (default 3)
      --max-concurrent-uploads int              Set the max concurrent uploads (default 5)
      --max-download-attempts int               Set the max download attempts for each pull (default 5)
madhu2852 commented 1 month ago

@thaJeztah No, the retry mechanism was available a few months back. Below is the example.

> docker push 123456789.dkr.ecr.us-west-2.amazonaws.com/myorg/myapp:latest
The push refers to repository [123456789.dkr.ecr.us-west-2.amazonaws.com/myorg/myapp]

a53c8ed5f326: Retrying in 1 second 
78e16537476e: Retrying in 1 second 
b7e38d172e62: Retrying in 1 second 
f1ff72b2b1ca: Retrying in 1 second 
33b67aceeff0: Retrying in 1 second 
c3a550784113: Waiting 
83fc4b4db427: Waiting 
e8ade0d39f19: Waiting 
487d5f9ec63f: Waiting 
b24e42eb9639: Waiting 
9262398ff7bf: Waiting 
804aae047b71: Waiting 
5d33f5d87bf5: Waiting 
4e38024e7e09: Waiting
EOF
docker info       
Client: Docker Engine - Community
 Version:    27.0.3
 Context:    desktop-linux
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.15.1-desktop.1
    Path:     /Users/xxx/.docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.28.1-desktop.1
    Path:     /Users/xxx/.docker/cli-plugins/docker-compose
  debug: Get a shell into any image or container (Docker Inc.)
    Version:  0.0.32
    Path:     /Users/xxx/.docker/cli-plugins/docker-debug
  desktop: Docker Desktop commands (Alpha) (Docker Inc.)
    Version:  v0.0.14
    Path:     /Users/xxx/.docker/cli-plugins/docker-desktop
  dev: Docker Dev Environments (Docker Inc.)
    Version:  v0.1.2
    Path:     /Users/xxx/.docker/cli-plugins/docker-dev
  extension: Manages Docker extensions (Docker Inc.)
    Version:  v0.2.25
    Path:     /Users/xxx/.docker/cli-plugins/docker-extension
  feedback: Provide feedback, right in your terminal! (Docker Inc.)
    Version:  v1.0.5
    Path:     /Users/xxx/.docker/cli-plugins/docker-feedback
  init: Creates Docker-related starter files for your project (Docker Inc.)
    Version:  v1.3.0
    Path:     /Users/xxx/.docker/cli-plugins/docker-init
  sbom: View the packaged-based Software Bill Of Materials (SBOM) for an image (Anchore Inc.)
    Version:  0.6.0
    Path:     /Users/xxx/.docker/cli-plugins/docker-sbom
  scout: Docker Scout (Docker Inc.)
    Version:  v1.10.0
    Path:     /Users/xxx/.docker/cli-plugins/docker-scout

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 1
 Server Version: 27.0.3
 Storage Driver: overlay2
  Backing Filesystem: extfs
  Supports d_type: true
  Using metacopy: false
  Native Overlay Diff: true
  userxattr: false
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: runc io.containerd.runc.v2
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: ae71819c4f5e67bb4d5ae76a6b735f29cc25774e
 runc version: v1.1.13-0-g58aa920
 init version: de40ad0
 Security Options:
  seccomp
   Profile: unconfined
  cgroupns
 Kernel Version: 6.6.32-linuxkit
 Operating System: Docker Desktop
 OSType: linux
 Architecture: x86_64
 CPUs: 12
 Total Memory: 7.656GiB
 Name: docker-desktop
 ID: xxx
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 HTTP Proxy: http.docker.internal:3128
 HTTPS Proxy: http.docker.internal:3128
 No Proxy: hubproxy.docker.internal
 Labels:
  com.docker.desktop.address=unix:///Users/xxx/Library/Containers/com.docker.docker/Data/docker-cli.sock
 Experimental: false
 Insecure Registries:
  hubproxy.docker.internal:5555
  127.0.0.0/8
 Live Restore Enabled: false
vvoland commented 1 month ago

There were no changes to the retry mechanism. However, a retry is only performed when it makes sense. For example, push/pull won't retry after a failure with a 403 Forbidden HTTP status code, since retrying will not help when the user is not authorized.
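To illustrate the idea of retrying only when it can help, here is a minimal shell sketch. The helper name `is_retriable_status` and the exact set of status codes are assumptions made for illustration; this is not the daemon's actual classification logic, which lives in the Go code of moby/moby.

```shell
#!/bin/sh
# Hypothetical helper: decide whether an HTTP status returned by a registry
# is worth retrying. The status lists are an illustrative assumption.
is_retriable_status() {
    case "$1" in
        401|403|404) return 1 ;;  # auth, permission, not found: a retry will not change the outcome
        408|5??)     return 0 ;;  # request timeout or server-side error: may be transient
        *)           return 1 ;;  # anything else: fail fast
    esac
}

if is_retriable_status 403; then
    echo "would retry"
else
    echo "would fail fast"
fi
```

Under this sketch, a 403 takes the fail-fast branch while a 503 would be retried.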

Can you provide the error reported after a push that wasn't retried?

madhu2852 commented 1 month ago

@vvoland The docker image push command would retry 5 times before erroring out, saying the repository doesn't exist. Now, it simply errors out without retrying. We have custom integration in place in ECR registry to create the repository if it doesn't exist. In the past, it performed retries before exiting with any type of error; now it exits immediately.

Current behavior:

docker push xxx.dkr.ecr.us-east-1.amazonaws.com/nginx-new:test
The push refers to repository [xxx.dkr.ecr.us-east-1.amazonaws.com/nginx-new]
56b6d3be75f9: Preparing 
0c6c257920c8: Preparing 
92d0d4e97019: Preparing 
7190c87a0e8a: Preparing 
933a3ce2c78a: Preparing 
32cfaf91376f: Waiting 
32148f9f6c5a: Waiting 
name unknown: The repository with name 'nginx-new' does not exist in the registry with id 'xxx'

Past behavior:

docker push <your_account_id>.dkr.ecr.<region>.amazonaws.com/nginx

The push refers to repository [xxx.dkr.ecr.us-east-1.amazonaws.com/nginx]
563c64030925: Retrying in 2 seconds 
6fb960878295: Retrying in 2 seconds 
563c64030925: Pushing [==================================================>]  7.168kB
6fb960878295: Pushing [==================================================>]   5.12kB
563c64030925: Pushed 
6fb960878295: Pushed

In both of the above scenarios, the repository did not exist in ECR when the push started. In the past, the custom integration created the repository in the meantime, so by the time Docker retried for the second or third time the repository already existed and the push succeeded without any issues.
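For anyone relying on the daemon's retries to cover a delay like this, one possible client-side workaround is to wrap the push in a retry loop. The sketch below is an assumption about how that could look, not a Docker feature: the function name `retry`, the attempt count, and the doubling delay are all hypothetical choices. It also retries on every failure, including ones a retry cannot fix.

```shell
#!/bin/sh
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical wrapper: runs COMMAND until it succeeds or MAX_ATTEMPTS is
# reached, doubling the delay after each failed attempt.
# Note: POSIX sh has no local variables, so these names are global.
retry() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        status=0
        "$@" || status=$?
        if [ "$status" -eq 0 ]; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts (exit status $status)" >&2
            return "$status"
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
        delay=$((delay * 2))
    done
}

# Example (not executed here):
#   retry 5 1 docker push xxx.dkr.ecr.us-east-1.amazonaws.com/nginx-new:test
```

With the example invocation, the push would be attempted up to five times with waits of 1, 2, 4, and 8 seconds in between, which gives an integration that creates the repository on demand some time to finish.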

emrebdr commented 1 month ago

I checked the code, and there were no changes to the retry mechanism. As @vvoland mentioned, the retry is skipped when the user is not authorized.

Could you please specify which version behaved this way? Also, please make sure the user is successfully authorized. 🙏

madhu2852 commented 1 month ago

Yes, I'm authenticated to the ECR registry before pushing the images.

madhu2852 commented 3 weeks ago

Do you think the upstream ECR API might have an effect on the retry mechanism, by any chance?

thaJeztah commented 3 weeks ago

@vvoland could this be related to the bugfix in the registry client?

That one was related to https://github.com/moby/moby/pull/45415, which revealed that a fatal (non-retriable) error fell through.

thaJeztah commented 3 weeks ago

Hmmm... probably not; that patch already went into registry v2.8.2 (https://github.com/distribution/distribution/blob/v2.8.2/registry/client/errors.go), which was already included in Docker 20.10.26, so that's unlikely: https://github.com/moby/moby/pull/45980

vvoland commented 3 weeks ago

Hmmm, could be. @madhu2852 which Docker version were you using when it worked for you?

madhu2852 commented 3 weeks ago

@vvoland I think it was 24.x, but the strange thing is that even if I download old versions dating back to 2022/2023, it's not retrying anymore. Could it be that the responses from the upstream ECR API used to trigger the retries, and that something has since changed so the client no longer gets the same response and therefore doesn't retry?

thaJeztah commented 3 weeks ago

It's definitely possible that something changed in their registry; perhaps it didn't return correct errors before.