imjasonh commented 5 years ago

This method employs a somewhat silly hack to determine reliably whether a keychain can be used to authorize pushes to a repo. This can be useful if you want to check whether you can push an image before you spend lots of effort constructing the layers for that image.

@jonjohnsonjr has some ideas about how we can make this unnecessary by allowing users to call remote.Write before all the layer contents are available and mutate.Appended in, by passing stream.Layers to it that are populated in goroutines and closed when they're done. This relies on a change to how remote.Write, mutate.Append and stream.Layer fit together.

I'll let him elaborate.

jonjohnsonjr commented 5 years ago

I had two ideas for this. One is simple, the other requires a few fixes.

Simple idea

We create a registry client package that maps directly to the distribution spec. These would be lower-level operations, not like our higher level remote package. The remote package would use this registry client instead of making http requests directly.

People who need the functionality that remote.CheckPushPermission provides could just call InitiateUpload directly.

Complex idea

TL;DR: we need to do everything lazily instead of eagerly. stream.Layer is a start down this path.

There are a lot of things we need to fix, that could be broken out into separate issues, but lumping them all in here will make it easier to understand the problem end-to-end, I think.

What needs fixing, in order:

stream.Layer's Size, Digest, and DiffID methods should block until the stream has been consumed instead of returning stream.ErrNotComputed. stream.NewLayer should take the lock and compressedReader.Close should release it. This means that you'll basically always need to use a goroutine to use stream.Layer effectively (kind of like io.Pipe), which we should document (or at least provide examples).
tarball.LayerFromOpener should use stream.Layer as most of its implementation. Right now, we compute the diffid and digest as soon as we read it, which we should be doing lazily. (For already-compressed layers, we will have to do something else -- computing the DiffID is a pain for those.)
mutate.image.Layers should first look through its Addendums to see if there are any streaming layers instead of relying on compute() to return stream.ErrNotComputed.
remote.Write should always upload the config file in a separate goroutine concurrently with the layers. Because we made stream.Layer's Size, Digest, and DiffID block instead of returning an error, the config upload will sit patiently in a goroutine waiting on any stream.Layers to complete their uploads before continuing. In the case where the image doesn't have any stream.Layers, it will just upload immediately. This simplifies the remote.Write code a lot.

example

Let's look at a code snippet from kaniko to see what's wrong and how this would be better: https://github.com/GoogleContainerTools/kaniko/blob/246cc92a33c29d76b811f7da605375d8bc4c2198/pkg/executor/push.go#L110-L144

today

We open up a layer using tarball.LayerFromFile. Currently, this computes the Digest, Size, and DiffID immediately.

After that, we initate a push, which can fail if we don't have valid credentials.

tomorrow

When we open up the layer using tarball.LayerFromFile, it will use a stream.Layer instead, so the Digest/Size/DiffID computations will be deferred until the layer is consumed.

We'll immediately start the image upload, which will fail when we try to initiate a blob upload, which happens before we even try to open the layer's io.ReadCloser.

Thus, we never even try to open the file.

followup

Most of the mutate operations are unfortunately synchronous. E.g. they update the image immediately and don't wait lazily for it to be consumed. It's possible that we could introduce a conceptual "thunk" in here such that all these operations are deferred until the corresponding method is actually called again, i.e. we need to make all of those work with stream.Layer, though it may be impossible.

I'm not sure if we can do this without breaking clients, but I think it's worth doing even if it's hard.

github-actions[bot] commented 4 years ago

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

google / go-containerregistry

Remove the need for remote.CheckPushPermission #412