nlewo / nix2container

An archive-less dockerTools.buildImage implementation
Apache License 2.0

copyToRegistry not documented? #128

Open the-sun-will-rise-tomorrow opened 5 months ago

the-sun-will-rise-tomorrow commented 5 months ago

Hi! I see copyToRegistry mentioned in the benchmark section of the README, but this (seemingly very useful!) function doesn't seem to be documented.

nlewo commented 4 months ago

No, it is not documented and, tbh, I didn't expect to see these functions used as they are!

A documentation PR would be welcomed!

the-sun-will-rise-tomorrow commented 4 months ago

> and tbh, i didn't expect to see these functions used as they are!

Maybe I misunderstood something then - I thought one of the strengths of this package (and that function) is being able to build a container image bypassing the local podman store and sending it to a remote registry. Did I miss a more proper way to do so?

nlewo commented 4 months ago

> Maybe I misunderstood something then - I thought one of the strengths of this package (and that function) is to be able to build a container image bypassing the local podman store and send that to a remote registry. Did I miss a more proper way to do so?

No, it is currently the proper way to do it.
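For reference, a minimal sketch of how this can look (the flake layout, registry host, and entrypoint below are placeholders, not taken from this thread):

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
  inputs.nix2container.url = "github:nlewo/nix2container";

  outputs = { self, nixpkgs, nix2container }:
    let
      pkgs = nixpkgs.legacyPackages.x86_64-linux;
      n2c = nix2container.packages.x86_64-linux.nix2container;
    in {
      # Hypothetical image; name and entrypoint are placeholders.
      packages.x86_64-linux.image = n2c.buildImage {
        name = "registry.example.com/hello";
        config.entrypoint = [ "${pkgs.hello}/bin/hello" ];
      };
    };
}
```

Running something like `nix run .#image.copyToRegistry` then pushes the image straight to the registry via skopeo, without ever materializing it in the local podman/docker store.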

the-sun-will-rise-tomorrow commented 1 month ago

Thank you. I finally got to playing with this for a bit. I got it to work, but unfortunately it doesn't seem to be working as well as I thought it would.

First of all, even for small changes that should change only the last layer, I'm seeing a lot of lines such as `INFO[0012] Adding 1 paths to layer (size:123456789 digest:sha256:abcd.........)`. Granted, this work seems to be completed quickly, but I don't understand why it has to be re-done every time.

But more importantly, the `skopeo copy` command still seems to upload all layers every time. I'm seeing the same `Copying blob sha256:abcd.....` lines on every run, and this step takes several minutes to complete. Even when running `copyToRegistry` again on the same expression a few minutes later, it still uploads all layers.

I don't fully understand what is happening, but I found this:

https://github.com/containers/image/pull/536

It looks like, in order for Skopeo to be able to reliably skip layers that are already present on the target registry, it has to maintain a persistent local cache of what layers it has already uploaded?

This doesn't work for me, because I'm running builds in a distributed CI environment.
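One possible workaround, sketched below, is to persist that cache between CI runs. This is an assumption on my part: the path is the containers/image per-user default (`blob-info-cache-v1.boltdb` under the containers cache dir), and `ci-restore-cache`/`ci-save-cache` stand in for whatever cache primitives the CI system provides.

```shell
# Sketch: persist skopeo's blob-info cache across distributed CI runs.
# The path is an assumed containers/image default; verify it for your setup.
CACHE_DIR="$HOME/.local/share/containers/cache"

# 1. Restore the cache saved by previous runs (ci-restore-cache is hypothetical).
mkdir -p "$CACHE_DIR"
ci-restore-cache skopeo-blob-info "$CACHE_DIR" || true

# 2. Push the image; skopeo can now skip blobs it remembers uploading.
nix run .#image.copyToRegistry

# 3. Save the cache for later runs (ci-save-cache is hypothetical).
ci-save-cache skopeo-blob-info "$CACHE_DIR"
```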

I tried running with `--debug`, and I see a mix of `time="..." level=debug msg="... already exists"` and `time="..." level=debug msg="... not present"`. I also tried `--dest-precompute-digests`, which didn't seem to change much in terms of speed, but I do see some `time="..." level=debug msg="Compressing blob on the fly"`, so it does seem to do something.

I am wildly guessing here, but could it be that, in order to copy the image to a registry, layers must be compressed, while nix2container calculates the digests of the uncompressed layers, resulting in this mismatch and inefficiency? Would it make sense for nix2container to optionally pre-compress the layers and use the digests of the compressed layers?
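To illustrate the guess above (this is just a sketch of the general digest mechanics, not of nix2container's internals): a registry manifest references each layer blob by the digest of the bytes actually uploaded, so if the blob is gzip-compressed on the fly, its digest cannot match a digest computed over the uncompressed tarball.

```python
import gzip
import hashlib

# Stand-in bytes for an uncompressed layer tarball.
layer = b"example layer tar bytes"

# Digest of the uncompressed layer (what a tool hashing raw tarballs would know).
uncompressed_digest = hashlib.sha256(layer).hexdigest()

# Digest of the gzip-compressed blob (what ends up referenced by the registry
# when the copier compresses on upload). mtime=0 makes the output reproducible.
compressed = gzip.compress(layer, mtime=0)
compressed_digest = hashlib.sha256(compressed).hexdigest()

# The two digests necessarily differ, so a presence check keyed on one
# cannot find a blob stored under the other.
assert uncompressed_digest != compressed_digest
```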