distribution / distribution

The toolkit to pack, ship, store, and deliver container content
https://distribution.github.io/distribution
Apache License 2.0

Pass thru cache of Dockerhub public images - docker login works, but docker pull gives 401 response #4501

Open zfLQ2qx2 opened 2 weeks ago

zfLQ2qx2 commented 2 weeks ago

Description

This has me completely stumped - I've had a working pass-thru cache to Dockerhub for several years, then it stopped working.

My symptoms are that docker login is successful, and I can use the token from the auth server to access the catalog, but if I try to pull anything not cached then I get an unauthorized response. I've verified that my Dockerhub username and PAT are correct with the https://hub.docker.com/v2/users/ URL, and I also tried generating a new PAT. The images I'm trying to pull are common public things like nginx and alpine linux, and I can pull them directly with no issue. I've verified the proxy credentials are set in the environment, and I also tried hard coding them. The logs confirm that I am configured as a cache and show the 401 response, but they don't hint at the cause. I've tried two versions of the registry, from 2022 and Oct 2023. I'm working on an auth server that can give RFC 7638 style tokens, but until that's done I need a docker registry that supports the libtrust style kid fields in the JWT.

I'm not sure what else I can do to diagnose the underlying issue.
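
One low-effort way to get more information out of a 401 is to read the WWW-Authenticate header on the response (for example with curl -i). The registry token flow answers with a Bearer challenge carrying realm, service and scope parameters, and RFC 6750 allows an optional error parameter. The Python below is a minimal sketch for splitting such a header into its parts. The function name parse_bearer_challenge and the example.com values are placeholders for illustration, not anything taken from this report.

```python
import re


def parse_bearer_challenge(header: str) -> dict:
    """Split a 'Bearer k="v",k2="v2"' challenge into a dict of parameters."""
    scheme, _, params = header.strip().partition(" ")
    if scheme.lower() != "bearer":
        raise ValueError(f"not a Bearer challenge: {scheme!r}")
    # Every value is a quoted string, so commas inside a value
    # (e.g. scope="repository:foo:pull,push") are preserved.
    return dict(re.findall(r'(\w+)="([^"]*)"', params))


if __name__ == "__main__":
    # Hypothetical header value, shaped like the one the token flow describes.
    sample = (
        'Bearer realm="https://auth.example.com/token",'
        'service="registry.example.com",'
        'scope="repository:library/alpine:pull"'
    )
    for key, value in parse_bearer_challenge(sample).items():
        print(f"{key}: {value}")
```

Comparing the service and scope values from the challenge with what the auth server puts into the token may help narrow down where the two sides disagree.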

Reproduce

I'm hoping for some method of getting more information from the server.

Expected behavior

No response

registry version

2.8.x from 10/23, custom build

Additional Info

No response

zfLQ2qx2 commented 1 week ago

Here is a thought: I'm betting Dockerhub is using the v3 docker-registry. Any chance they can't authenticate to each other?

milosgajdos commented 1 week ago

Without logs, specific versions, etc., we can't even follow up on this issue, I'm afraid.

zfLQ2qx2 commented 3 days ago

@milosgajdos OK, let me see if I can create a test case for you. I need to strip out all of the references to my employer before I send anything.

zfLQ2qx2 commented 3 days ago

@milosgajdos Ok, one of my workarounds to try and get past this issue is to move to docker-registry v3. In v3 of docker-registry the JWT implementation was switched from libtrust to go-jose. The method of generating the kid field of JWT tokens switched from the libtrust method to the RFC 7638 method.
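
To make the difference between the two kid styles concrete, here is a rough Python sketch of both derivations. The RFC 7638 part follows the RFC: SHA-256 over the canonical JSON of the required RSA members (e, kty, n), base64url-encoded without padding. The libtrust part is an assumption based on my reading of libtrust's key ID format (SHA-256 of the DER-encoded public key, truncated to 30 bytes, base32, grouped in fours with colons) and should be checked against the libtrust source before relying on it. All function names are made up for this example, and the numbers in the demo are toy values, not a real key.

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, as used throughout JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def int_to_bytes(value: int) -> bytes:
    """Minimal big-endian encoding of a positive integer."""
    return value.to_bytes((value.bit_length() + 7) // 8, "big")


def rfc7638_kid(n: int, e: int) -> str:
    """RFC 7638 SHA-256 thumbprint of an RSA public key (modulus n, exponent e)."""
    jwk = {"e": b64url(int_to_bytes(e)), "kty": "RSA", "n": b64url(int_to_bytes(n))}
    # Required members only, sorted by name, no whitespace.
    canonical = json.dumps(jwk, sort_keys=True, separators=(",", ":"))
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def libtrust_kid(spki_der: bytes) -> str:
    """libtrust-style key ID (assumed scheme, see the note above).

    spki_der is the DER-encoded SubjectPublicKeyInfo of the public key.
    """
    digest = hashlib.sha256(spki_der).digest()[:30]  # 240 bits
    encoded = base64.b32encode(digest).decode("ascii").rstrip("=")
    return ":".join(encoded[i:i + 4] for i in range(0, len(encoded), 4))


if __name__ == "__main__":
    # Toy inputs only: a tiny modulus and placeholder bytes instead of real DER.
    print("RFC 7638 style:", rfc7638_kid(n=3233, e=65537))
    print("libtrust style:", libtrust_kid(b"placeholder, not real DER"))
```

The two outputs look nothing alike (43 base64url characters versus twelve colon-separated groups), which is consistent with a registry built for one style not recognizing a kid in the other.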

My resulting token looks like this (when decoded):

{
  "typ": "JWT",
  "alg": "RS256",
  "kid": "u7fKjfiz4XydeA-Ad3tSIuOw3roJUb48QndibYEi0sE"
}
{
  "iss": "ACME auth server",
  "sub": "zfLQ2qx2",
  "aud": "Docker registry",
  "exp": 1731015527,
  "nbf": 1731014617,
  "iat": 1731014627,
  "jti": "1142751695239731298"
}

Here is a link to a golang playground snippet that shows go-jose can parse the token just fine:

https://go.dev/play/p/k5zCX2wjJQt

However, the docker-registry doesn't like it. If I try a simple test case like:

curl -H "Authorization: Bearer ${TOKEN}" https://<server>/v2/_catalog

I get a 401 response and in the docker-registry logs I see:

time="2024-11-07T21:09:55.404329555Z" level=warning msg="error authorizing context: malformed token" go.version=go1.23.2 http.request.host=<server> http.request.id=4f0321c4-261b-4bd3-883c-4339f5e22b99 http.request.method=GET http.request.remoteaddr="<removed>" http.request.uri=/v2/_catalog http.request.useragent=curl/8.6.0 instance.id=3978d39b-f1d9-4b25-83a8-cfabb7910c33 version=v3.0.0-beta.1.m+unknown

Good news is that it's no longer complaining about the key id; bad news is that there is no indication of what is considered malformed. I suspect there is a claim missing, but I have no idea which one. The documentation doesn't reflect any of the changes between v2 and v3, so I can only speculate. I have no examples of what a valid v3 token looks like, and I could not find any examples of anyone giving an auth response without a libtrust style kid field.
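
As a starting point for the "which claim is missing" question, the sketch below decodes a token without verifying its signature and lists which of the claims named in the registry token authentication specification (iss, sub, aud, exp, nbf, iat, jti, and the private access claim) are absent. This only reports differences from that list. Whether any such difference is what the registry calls "malformed" is not established here. The helper names are invented for this example, and the demo token is rebuilt from the decoded header and claims shown above with a placeholder signature.

```python
import base64
import json

# Claim names listed in the registry token authentication specification.
SPEC_CLAIMS = ("iss", "sub", "aud", "exp", "nbf", "iat", "jti", "access")


def encode_segment(obj: dict) -> str:
    """Base64url-encode a JSON object without padding (used to build the demo token)."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def decode_segment(segment: str) -> dict:
    """Decode one base64url JWT segment, restoring the stripped padding."""
    padded = segment + "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def inspect_jwt(token: str):
    """Return (header, claims) from a compact JWT. The signature is NOT verified."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected 3 dot-separated segments, got {len(parts)}")
    return decode_segment(parts[0]), decode_segment(parts[1])


def missing_claims(claims: dict) -> list:
    """Names from SPEC_CLAIMS that do not appear in the claim set."""
    return [name for name in SPEC_CLAIMS if name not in claims]


if __name__ == "__main__":
    header = {"typ": "JWT", "alg": "RS256",
              "kid": "u7fKjfiz4XydeA-Ad3tSIuOw3roJUb48QndibYEi0sE"}
    claims = {"iss": "ACME auth server", "sub": "zfLQ2qx2", "aud": "Docker registry",
              "exp": 1731015527, "nbf": 1731014617, "iat": 1731014627,
              "jti": "1142751695239731298"}
    # "c2ln" is a placeholder third segment, not a real signature.
    token = ".".join([encode_segment(header), encode_segment(claims), "c2ln"])
    decoded_header, decoded_claims = inspect_jwt(token)
    print("header:", decoded_header)
    print("missing:", missing_claims(decoded_claims))
```

Running the same check against the real token string, rather than the reconstructed one, would also catch structural problems such as an unexpected number of segments.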