Open jonjohnsonjr opened 3 years ago
Dear @jonjohnsonjr - thx for reaching out (+ sry for my late response)
1. It certainly please you to learn that I (albeit for different reasons) already changed to "two-step" monolithic uploads [0]. You are correct - since the spec allowed both forms, I reckoned it would be an optimisation to omit the initial request.
2. Already done, too :-)
3. Valid point (I so far was too lazy, though..) - still an open TODO
4. Fun-fact: I found out (the hard way) that different OCI registry implementations behave differently if the manifest's mimetype is not properly set (e.g. in the
accept
-header). Also, different registry-implementations (I am talking about "docker-hub", GCR, AliCloud, Nexus, Artifactory..) behave rather differently here. By now, I think I should be rather close to an implementation that will work for all of them.5. will do :-)
As for containerregistry-lib: I spent several hours of debugging / monkey-patching (and in fact also created a fork). At some point (unfortunately, rather late..) I decided to implement a custom oci-client. Streaming support is not my only concern, though.
We plan to use OCI-Registries not only for storing "docker-compatible" virtualisation container images, but also as a general-purpose-blob-store. That being said, we obviously benefit from a low-level API-Client and want to explicitly have access to the individual blobs and manifests.
One additional use-case I am not certain could be easily achieved using the containerregistry-lib I used before would be identical oci-artifact-replication (docker pull + push will for example always modify the cfg-blob).
Streaming support is also an issue (+ I really do not like the approach of using named temporary files). In addition, there are some other use-cases, for example filtering out some files from layer-blobs, which may be a strange use-case as first glance, which I could implement on top of my custom client :-)
For the CICD environment I have developed this client for, having a python-implementation is very beneficial. I think a colleague of mine does use some golang-lib, though (/cc @schrodit )
[0] https://github.com/gardener/cc-utils/commit/fcc77aacf79f77ad3b5beccbbc1478c6dff6302c
It certainly please you to learn that I (albeit for different reasons) already changed to "two-step" monolithic uploads
Already done, too :-)
will do :-)
🎉 🎉 🎉
Thanks!
Fun-fact: I found out (the hard way)
Yeah, this is really frustrating. It is really hard to write a client that works against even just the major registries.
One additional use-case I am not certain could be easily achieved using the containerregistry-lib I used before would be identical oci-artifact-replication (docker pull + push will for example always modify the cfg-blob).
This was actually one of the first use cases for the containerregistry library, so it should already work. We do something similar in go-containerregistry (exposed in crane as crane cp
).
For the CICD environment I have developed this client for, having a python-implementation is very beneficial.
Yep, this makes sense. There was a similar constraint that led to containerregistry being written in python originally, so I understand the appeal.
Thanks again for fixing my main concerns!
This is kind of a meta-issue about several registry-related concerns. I can file separate issues if that's helpful.
I stumbled across this repo while doing some investigation around deprecating part of the distribution spec, which is primarily why I'm filing this issue, but I figured I'd point out some other things I noticed.
I see that you're using github.com/google/containerregistry for some things and have a custom client for other things. I'm guessing this is because containerregistry fails for large blobs when it buffers everything into memory? Are there any other reasons? If you are intent on continuing to use python, I think what you're doing is reasonable, but if the language is negotiable, there is a maintained successor to containerregistry at github.com/google/go-containerregistry, which would save you some trouble.
Some implementation details that could be improved:
HEAD /v2/.../blobs/<digest>
) before uploading blobs. Doing that should save a lot of work, most of the time.Manifest
dropdown), it seems like the media types are getting mangled. This isn't a huge problem, but I can see it causing issues for lots of clients.Again, I mostly care about that first point, the rest are mostly just me trying to be helpful :slightly_smiling_face: I can send a PR if you'd like but it might be easier for you to do it yourself? Also, have you looked into how hard it would be to patch containerregistry to support streaming blobs? That might be easier, but probably not.