xelalexv / dregsy

Keep container registries in sync
Apache License 2.0

Arch Issue with Image syncing #117

Closed prasoon-pxc closed 5 months ago

prasoon-pxc commented 5 months ago

======Scenario===========

Questions

xelalexv commented 5 months ago

If you want to sync all architectures contained in a multi-platform image, use platform: all in your mapping. More details are documented here.

prasoon-pxc commented 5 months ago

Thanks @xelalexv. I had tried this option but got an error, though I think it was more of a rate-limit problem. Today I enabled trace logging and got some insight into it. I will try adding this option again and let you know how it goes.

rate-limit-error

time="2024-04-05T06:58:53Z" level=debug msg="time=\"2024-04-05T06:58:53Z\" level=fatal msg=\"initializing source docker://cturra/ntp:latest: reading manifest latest in docker.io/cturra/ntp: toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit\""
time="2024-04-05T06:58:53Z" level=error msg="exit status 1"

prasoon-pxc commented 5 months ago

@xelalexv I was able to sync all architectures of the image using platform: all, but in ECR each image digest shows up as a separate image, and some of them have a size of 0.

Moreover, every time I run dregsy, it pulls all images first and only then checks whether the image tag already exists. I think this will also affect rate limiting.

xelalexv commented 5 months ago

> @xelalexv I was able to sync all architectures of the image using platform: all, but in ECR each image digest shows up as a separate image, and some of them have a size of 0.

I think this is an issue with the AWS ECR UI, and your multi-arch image is actually synced correctly.

> Moreover, every time I run dregsy, it pulls all images first and only then checks whether the image tag already exists. I think this will also affect rate limiting.

Could you provide a log at debug level for this?

prasoon-pxc commented 5 months ago

@xelalexv In the example below:

time="2024-04-09T05:22:38Z" level=debug msg="Getting image source signatures"
time="2024-04-09T05:22:38Z" level=debug msg="Copying blob sha256:9864598njksdfjkherfhuyiertihjfdiuhgtirt"
time="2024-04-09T05:22:38Z" level=debug msg="Copying blob sha256:78453784hjdfjhbjfbh784bjhfdsuy347834jnsdc"
time="2024-04-09T05:22:38Z" level=debug msg="Copying blob sha256:fjghbvruitdeh85437792456yhci132xc8"
time="2024-04-09T05:22:38Z" level=debug msg="Copying config sha256:25647899359ujvnkcn5iow498352nmcf4iwojre6"
time="2024-04-09T05:22:38Z" level=debug msg="Writing manifest to image destination"

level=debug msg="ensuring target exists" ref=5847934509546.dkr.ecr.eu-central-1.amazonaws.com/myapp/thirdparty/grafana/loki-canary
time="2024-04-09T05:22:37Z" level=info msg="ECR target already exists

I think it first downloads the image from the source and only then checks whether the image tag is already present in the destination. This triggers an image pull from DockerHub every time.

xelalexv commented 5 months ago

Could you point me to the source image?

prasoon-pxc commented 5 months ago

I have a lot of images, but for example: grafana-loki-canary

xelalexv commented 5 months ago

The Skopeo copy command, which is leveraged for the actual image copying when using the Skopeo relay, downloads the image layers only on first sync, and not again on subsequent syncs. The output may be a bit misleading there. What still gets downloaded each time are the manifests for each included platform. I checked the download traffic and the number of HTTPS GET requests for the first and second sync of this example image:

run   GET manifest   GET blob   download
1     4              12         34677 KB
2     4              3          87 KB

So, nothing to worry about.

BTW, it is not necessary to pull an image in order to determine available tags. The tags list can be directly retrieved from the registry API.
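As a rough illustration of that point, the sketch below builds the tags-list URL defined by the registry HTTP API v2 (`/v2/<name>/tags/list`) and parses a response body. It is not how dregsy does it internally. The function names are made up for this example, the response is a canned string rather than a live request, and a real call against DockerHub would additionally need a bearer token.

```python
import json
from urllib.parse import quote


def tags_list_url(registry: str, repository: str) -> str:
    """Build the registry API v2 tags-list URL for a repository."""
    # quote() leaves "/" untouched, so namespaced repositories stay intact.
    return f"https://{registry}/v2/{quote(repository)}/tags/list"


def parse_tags(body: str) -> list[str]:
    """Extract tag names from a tags-list JSON response body."""
    payload = json.loads(body)
    # Some registries return null for an empty repository.
    return payload.get("tags") or []


# Canned response body standing in for a live request (hypothetical tags).
sample = '{"name": "grafana/loki-canary", "tags": ["2.9.0", "latest"]}'

print(tags_list_url("registry-1.docker.io", "grafana/loki-canary"))
print(parse_tags(sample))
```

No image layers or manifests are fetched here; only the tag names are listed.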

prasoon-pxc commented 5 months ago

Okay @xelalexv, thanks for your prompt reply. In the past, when I ran this dregsy job multiple times, I got the rate-limit error from DockerHub, but we can close this issue for now and I will keep an eye on it.

xelalexv commented 5 months ago

Note that DockerHub rate limits count the number of pulls, based on manifest GET requests, rather than downloaded bytes:

  • A pull request is defined as up to two GET requests on registry manifest URLs (/v2/*/manifests/*).
  • A normal image pull makes a single manifest request.
  • A pull request for a multi-arch image makes two manifest requests.

So for our example above, first and subsequent sync runs do not differ in terms of above rate limits, i.e. both contribute equally to the usage. I recommend using a DockerHub account for syncing. The free tier already doubles the limit compared to anonymous access.
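To make the comparison concrete, here is a small sketch using the numbers from the table above. It assumes only the quoted definition, that one pull consists of one or two manifest GET requests, and so derives bounds rather than an exact count. How DockerHub actually groups requests into pulls is not stated in this thread, and `pull_bounds` is a name invented for this example.

```python
import math


def pull_bounds(manifest_gets: int) -> tuple[int, int]:
    """Lower and upper bound on counted pulls if one pull is 1 or 2 manifest GETs."""
    return math.ceil(manifest_gets / 2), manifest_gets


# Numbers from the table above: (manifest GETs, blob GETs, downloaded KB)
runs = {1: (4, 12, 34677), 2: (4, 3, 87)}

for run, (manifests, blobs, kb) in runs.items():
    low, high = pull_bounds(manifests)
    print(f"run {run}: {low}-{high} pulls, {kb} KB downloaded")
```

Both runs land on the same bounds because the manifest GET count is identical, even though the second run downloads only a small fraction of the bytes.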