opencontainers / distribution-spec

OCI Distribution Specification
https://opencontainers.org
Apache License 2.0
781 stars 201 forks source link

Add OCI-Referrers header #463

Open sudo-bmitch opened 10 months ago

sudo-bmitch commented 10 months ago

This adds an optional header that servers can use to inform clients about the lack of a referrers listing for a given manifest, enabling clients to skip some requests.

Related to #454.

sudo-bmitch commented 9 months ago

If we know they don't have reference we also know whether they do.

@mikebrow Since this is a "MAY", the absence of the header needs to be treated as if referrers could exist. For a client that is pulling referrers along with the rest of the image, the workflow is the same if there was an OCI-Referrers: present header as if it didn't exist. Are there use cases where adding that header would change the client behavior? If we can't find use cases today but discover them later, we could also add this to the spec later.

From the previous discussions and linked issue, the workflow we were considering for referrers that exists is to add a Link header giving clients a direct URL to pull the referrers from a CDN. The Link header would avoid querying the referrers API to get the 307 (redirect) to the CDN location. I'm avoiding closing the linked #454 with this PR so the Link header can be added in a separate PR.

For the "point in time", that applies to every referrers API request. That would make more sense in documenting the referrers API itself, and not the manifest GET response.

mikebrow commented 9 months ago

If we know they don't have reference we also know whether they do.

@mikebrow Since this is a "MAY", the absence of the header needs to be treated as if referrers could exist. For a client that is pulling referrers along with the rest of the image, the workflow is the same if there was an OCI-Referrers: present header as if it didn't exist. Are there use cases where adding that header would change the client behavior? If we can't find use cases today but discover them later, we could also add this to the spec later.

isn't it more expensive to force all clients to assume existence every time make the extra referrers requests?

From the previous discussions and linked issue, the workflow we were considering for referrers that exists is to add a Link header giving clients a direct URL to pull the referrers from a CDN. The Link header would avoid querying the referrers API to get the 307 (redirect) to the CDN location. I'm avoiding closing the linked #454 with this PR so the Link header can be added in a separate PR.

fair to say a link header would be one way/mode to provide a positive indicator.. still not sure why avoiding link or positive notice for local is encouraging clients to not call referrers, when no negative response header is provided.. though I do expect clients to configure yes/no look for artifacts even if no response header is provided indicating artifact existence..

For the "point in time", that applies to every referrers API request. That would make more sense in documenting the referrers API itself, and not the manifest GET response.

the lack of subject refers to the manifest would be a point in time notice.. not following why making that point here and in the referrers api will not make sense to the reader...

mikebrow commented 9 months ago

client1.. pull image with unpack... trace log includes that artifacts have been reported for the following:

client1.. reports upstream the notified existence of artifacts for...

upstream client.. pulls said artifacts for reported.. applying upstream filters etc..

upstream client requests client1 run the container

sudo-bmitch commented 9 months ago

isn't it more expensive to force all clients to assume existence every time make the extra referrers requests?

It's very expensive and will be at least half of the referrers requests for clients that implement recursive queries, which is why this PR is here. But making the opposite assumption, that a lack of the header means there are no referrers, cannot be safely done.

We have the following decision logic:

The client behavior between the 4 and 5 is the same. And a client must run the referrers query if there is no header to handle scenarios 2 and 3. Effectively when the client doesn't get an absent header, it may be 4b, 3, or 2, and the client doesn't know until it runs the referrers query.

Is there a scenario I'm missing where adding the present header would help clients?

michaelb990 commented 7 months ago

I would prefer not encouraging registries to do more (potentially unnecessary) work on any of the APIs in the "pull" workflow. Some clients will need this information, but (IMO) most won't, so it seems like our goal should be to keep the manifest GET API should be as simple as possible.

sudo-bmitch commented 7 months ago

I would prefer not encouraging registries to do more (potentially unnecessary) work on any of the APIs in the "pull" workflow. Some clients will need this information, but (IMO) most won't, so it seems like our goal should be to keep the manifest GET API should be as simple as possible.

@michaelb990 agreed, that's why this was only a MAY. If/when registries think that the added work will reduce their overhead, they can add it. And until then, there won't be any header so clients that need to pull referrers will always query for them.

jonjohnsonjr commented 6 months ago

I would prefer not encouraging registries to do more (potentially unnecessary) work on any of the APIs in the "pull" workflow. Some clients will need this information, but (IMO) most won't, so it seems like our goal should be to keep the manifest GET API should be as simple as possible.

Agree with this.

How do we feel about a client header that positively indicates that the client wants referrers information? Then the registry doesn't have to guess about whether it's "worth" calculating this header for the client.

For clients that don't care about referrers, the registry doesn't have to do any additional work to query referrers.

sudo-bmitch commented 6 months ago

How do we feel about a client header that positively indicates that the client wants referrers information? Then the registry doesn't have to guess about whether it's "worth" calculating this header for the client.

For clients that don't care about referrers, the registry doesn't have to do any additional work to query referrers.

@jonjohnsonjr @michaelb990 I've updated with the client header. Let me know if that covers your concerns.

@mikebrow have your concerns above been addressed?

avtakkar commented 6 months ago

We are leaning towards not implementing this header in Azure Container Registry due to the added processing cost of consistently calculating this value.

sudo-bmitch commented 6 months ago

Based on the Jan 11th call, we're postponing any decision to 1.2

mikebrow commented 6 months ago

if not included in OCI 1.1 clients and the registries can proof of concept the header optimization ...