Open hassanselim0 opened 5 years ago
/cc @dmcgowan @stevvooe
Any plans for this? @thaJeztah @dmcgowan @stevvooe
If I'm not mistaken, this is part of the distribution specification/protocol. In order to push an image, the daemon/client that's pushing the image must have proof that it has access to the layers that are referenced in the manifest, so what happens is;
If this check were not done, someone could get access to someone else's layers just by crafting a manifest that lists them (e.g., someone posts the manifest of a private image; I copy that manifest and push it to Docker Hub as my own; now I have access to the layers).
While I'm not fully aware of what goes on behind the scenes when pushing an image, what I notice is that some large layers (around 200 MB) would usually just say <layer_id>: Layer already exists very quickly (much faster than the time needed to upload 200 MB on my connection).
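For context, the Docker Registry HTTP API V2 lets a client ask whether a blob is present with a HEAD request to /v2/<name>/blobs/<digest>, where 200 means the blob exists and 404 means the repository does not know it. The fast "Layer already exists" answer is presumably the result of a check like this, though that is an assumption about what docker push does internally. The sketch below only builds the URL and interprets the status code; the names BlobState, blob_check_url and classify_status are made up for illustration, and treating every other status as "unknown" is this sketch's choice, not documented client behavior.

```python
from enum import Enum


class BlobState(Enum):
    """Possible outcomes of a blob existence check (hypothetical names)."""
    EXISTS = "exists"
    MISSING = "missing"
    UNKNOWN = "unknown"


def blob_check_url(registry: str, repository: str, digest: str) -> str:
    """Build the URL that a HEAD request would target to check for a blob."""
    return f"https://{registry}/v2/{repository}/blobs/{digest}"


def classify_status(status: int) -> BlobState:
    """Map an HTTP status code from the HEAD request to a check outcome."""
    if status == 200:
        return BlobState.EXISTS
    if status == 404:
        return BlobState.MISSING
    # Assumption of this sketch: anything else (5xx, auth errors, ...) tells
    # us nothing about whether the layer exists.
    return BlobState.UNKNOWN
```

The point of the third state is that "the registry said no" and "the registry did not answer" are different outcomes, and only the first one justifies an upload.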
My issue is that when my connection isn't perfect (e.g., some packet loss), some layers that I know exist get re-uploaded instead of being skipped. What I'm suggesting is to retry the initial check for layer existence in case of timeouts (and possibly other unexpected errors).
I know that these layers exist because they have the same ID, and cancelling the upload midway and rerunning the docker push command manually would sometimes lead to the desired result (Layer already exists).
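To make the suggestion concrete, here is a minimal sketch of the retry behavior I have in mind. It is not how docker push is implemented; layer_exists_with_retry, its parameters and the check callable are all hypothetical. The check is assumed to return True or False when the registry gives a definite answer and to raise TimeoutError when the request times out.

```python
import time
from typing import Callable


def layer_exists_with_retry(
    check: Callable[[], bool],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run an existence check, retrying with exponential backoff on timeouts.

    `check` is a hypothetical callable: it returns True/False for a definite
    answer and raises TimeoutError when the request was inconclusive.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            # A definite answer (exists or not) ends the loop immediately.
            return check()
        except TimeoutError:
            if attempt == attempts - 1:
                # Out of retries: surface the error instead of silently
                # treating the layer as missing.
                raise
            # Wait 1x, 2x, 4x, ... the base delay before checking again.
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")
```

With the defaults, a check that times out twice and then succeeds waits 1 second and then 2 seconds before returning the real answer. What to do once the retries are exhausted (upload anyway, or abort the push) is left to the caller.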
I'm just experimenting with this. The first time I ran "docker push REGISTRY:5000/IMAGE", I got some "Layer already exists" messages and a lot of "Retrying in X seconds". My bandwidth is low, so I can explain this. But the second time I ran the command, all layers were pushed in a second. Strange behavior.
Description
When docker push checks whether each layer already exists, and that request times out due to connection issues, it assumes the layer needs to be uploaded and proceeds to do so. This wastes a lot of time and, in some cases, money (on limited/metered data connections).
Steps to reproduce the issue:
You can even skip step 3. The image can be completely unchanged and the issue would still happen.
Describe the results you received: In a lot of cases, some of the layers would be re-uploaded even though they already exist on the registry.
Describe the results you expected: If docker push fails to determine whether a layer already exists, it should reattempt that check instead of assuming that it doesn't exist and re-uploading.
Additional information you deem important (e.g. issue happens only occasionally):
Output of docker version:
Output of docker info:
Additional environment details (AWS, VirtualBox, physical, etc.):