cue-labs / oci

Go modules related to OCI (Open Container Initiative) registries
Apache License 2.0
22 stars 3 forks source link

ociclient.makeError obscures StatusCode in a basic stringified error unless registry response is specifically-crafted JSON #26

Closed tianon closed 3 months ago

tianon commented 4 months ago

I don't know if it's intentional (it seems like it's not, given how the function is structured), but ociclient.makeError does a lot of return fmt.Errorf(...) for many cases that seem like they should be returning a wrapped wireError, wireErrors, or ociregistry.Error instead so consumers can still access the original status code.

https://github.com/cue-labs/oci/blob/1f693b7466afbd2989a1bdbc88b4c33e5c550472/ociregistry/ociclient/error.go#L127-L172

I don't think there's anything in the distribution spec that mandates the structure of non-200 status codes from a registry, but I might be very wrong there! (from what I can see, this matches the implementation over in the server half of this library, but I'm using ociclient against non-cue-labs registries :innocent:)

The specific case I'm running into is a request from a registry that's returning a 404, but it becomes a basic fmt.Errorf("404 Not Found: ...") (with the added bug that I think my registry must be returning a "JSON-y" Content-Type: value because the ociclient.makeError code is trying to Unmarshal it :sweat_smile:), so I can't identify/adjust my client behavior based on the 404 without doing a string match, which isn't ideal (and seems counter to the design of ociregistry.ErrNameUnknown and friends, and even the wireError/wireErrors wrappers that allow direct access to the HTTP status code).

At first, I thought the Errors slice on wrapErrors was generic enough that I would've proposed moving up the var errs wireErrors (and population of httpStatusCode on it) to the top of the function, and then every return fmt.Errorf(...) instead be something like errs.Errors = append(errs.Errors, fmt.Errorf(...)); return &errs or something (perhaps even the HEAD error translation code for the !isJSONMediaType(resp.Header.Get("Content-Type")) case, maybe with a second error added to the Errors slice if you wanted to still include the errant body), but I see it's not a generic errors wrapper (which is fair, since it then wouldn't Unmarshal correctly), so I'm not sure what to propose here except that it'd be really nice if there were some way to access StatusCode on errors from the registry in ociclient.

rogpeppe commented 4 months ago

Yup. I've definitely been thinking that the errors support needs to be improved. Thanks for creating the issue!

Some random thoughts:

tianon commented 4 months ago

I don't think there's anything in the distribution spec that mandates the structure of non-200 status codes from a registry, but I might be very wrong there!

Just to correct myself here -- this is related to https://github.com/opencontainers/distribution-spec/pull/178 + https://github.com/opencontainers/distribution-spec/pull/280 (ala https://github.com/opencontainers/distribution-spec/issues/443).

If we go back in time a bit to before the spec got "cleaned up", we get:

https://github.com/opencontainers/distribution-spec/blob/a6e5b091b1468662730ab1e5be55c61838643ab4/spec.md#on-failure-bad-request

(Which clearly describes the structured errors that ociclient is reading/parsing :smile: :heart:)

All that got removed before 1.0 of the distribution-spec, for reasons that apparently weren't made clear in writing anywhere (as seen in issue 443 there).

mvdan commented 4 months ago

Should the spec add those structured errors back, or should we remove support for them if the spec dropped them on purpose?

tianon commented 4 months ago

IMO the spec should add them back, even if only as informational/SHOULD/MAY since they are in common use and documenting common use for interoperability is the literal point of the spec existing in the first place, but I'm not a maintainer of the distribution spec (only image and runtime). :see_no_evil:

However, @jonjohnsonjr there is a maintainer of the distribution spec, and I don't think I'm putting words in his mouth if I say he does think the spec would have value in having things like those structured errors back (https://github.com/opencontainers/distribution-spec/pull/496 being some pretty compelling evidence towards that -- presumably closed because he was just restoring previously in-the-spec content and didn't think it should have too many cycles spent nit-picking, but that's definitely me projecting / trying to read between the lines). :heart: