docker / cli

The Docker CLI
Apache License 2.0

Docker Push re-uploads existing layers #1490

Open hassanselim0 opened 5 years ago

hassanselim0 commented 5 years ago

Description

When docker push checks whether each layer already exists, if that request times out due to connection issues, it assumes the layer needs to be uploaded and proceeds to do so. This wastes a lot of time and, in some cases, money (on limited/metered data connections).

Steps to reproduce the issue:

  1. Build a docker image that has a lot of layers (in my case: 18)
  2. Push the image to a docker registry
  3. Change a few files and rebuild (with caching) such that only the last few layers have been modified
  4. Connect to an unstable internet connection that is prone to packet loss and timeouts (e.g. a weak 3G data connection)
  5. Push the image again to the same registry

You can even skip step 3. The image can be completely unchanged and the issue would still happen.

Describe the results you received: In a lot of cases, some of the layers would be re-uploaded even though they already exist on the registry.

Describe the results you expected: If docker push fails to determine whether a layer already exists, it should retry that check instead of assuming the layer is missing and re-uploading it.

Additional information you deem important (e.g. issue happens only occasionally):

Output of docker version:

Client:
 Version:           18.06.1-ce
 API version:       1.38
 Go version:        go1.10.3
 Git commit:        e68fc7a
 Built:             Tue Aug 21 17:21:34 2018
 OS/Arch:           windows/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.06.1-ce
  API version:      1.38 (minimum version 1.12)
  Go version:       go1.10.3
  Git commit:       e68fc7a
  Built:            Tue Aug 21 17:29:02 2018
  OS/Arch:          linux/amd64
  Experimental:     true
 Kubernetes:
  Version:          v1.10.5
  StackAPI:         Unknown

Output of docker info:

Containers: 43
 Running: 37
 Paused: 0
 Stopped: 6
Images: 103
Server Version: 18.06.1-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: bridge host ipvlan macvlan null overlay
 Log: awslogs fluentd gcplogs gelf journald json-file logentries splunk syslog
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 468a545b9edcd5932818eb9de8e72413e616e86e
runc version: 69663f0bd4b60df09991c08812a60108003fa340
init version: fec3683
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.93-linuxkit-aufs
Operating System: Docker for Windows
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 1.934GiB
Name: linuxkit-00155d4078dd
ID: RZPU:K4Q5:W7JN:4Q5T:G2KN:L4HA:FI4G:65LJ:WEBH:JHBQ:BHEJ:WNOV
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): true
 File Descriptors: 128
 Goroutines: 133
 System Time: 2018-10-29T22:15:49.0770882Z
 EventsListeners: 1
Registry: https://index.docker.io/v1/
Labels:
Experimental: true
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):

thaJeztah commented 5 years ago

/cc @dmcgowan @stevvooe

hassanselim0 commented 5 years ago

Any plans for this? @thaJeztah @dmcgowan @stevvooe

thaJeztah commented 5 years ago

If I'm not mistaken, this is part of the distribution specification/protocol. In order to push an image, the daemon/client that's pushing the image must have proof that it has access to the layers that are referenced in the manifest.

If this check were not done, someone could gain access to someone else's layers just by crafting a manifest that lists them (e.g., someone posts the manifest of a private image; I copy that manifest and push it to Docker Hub as my own; now I have access to the layers).

hassanselim0 commented 5 years ago

While I'm not fully aware of what goes on behind the scenes when pushing an image, what I notice is that some large layers (around 200 MB) usually just report <layer_id>: Layer already exists very quickly (much faster than it would take to upload 200 MB on my connection).

My issue is that when my connection isn't perfect (e.g. some packet loss), some layers that I know exist get re-uploaded instead of being skipped. What I'm suggesting is to retry the initial layer-existence check in case of timeouts (and possibly other unexpected errors). I know these layers exist because they have the same ID, and cancelling the upload midway and rerunning the docker push command manually sometimes leads to the desired result (Layer already exists).

kdubuc commented 5 years ago

I'm experiencing this right now. When I ran "docker push REGISTRY:5000/IMAGE" the first time, I got some "Layer already exists" messages and a lot of "Retrying in X seconds". My bandwidth is low, so that can explain this. But the second time I ran the command, all layers were pushed in a second. Strange behavior.