containers / podman

Podman: A tool for managing OCI containers and pods.
https://podman.io
Apache License 2.0

Podman behind forward proxy problem. Any requirements like which http headers must not be altered/removed? #11993

Closed borazem closed 2 years ago

borazem commented 3 years ago

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description

When I try to pull images from quay.io or docker.io with podman behind the proxy, I get errors, and I suspect that the forward proxy may be changing HTTP headers.

Is there any list of requirements stating which header fields must not be altered for podman to work?

Podman can log in successfully to docker.io and quay.io.

Steps to reproduce the issue:

The steps cannot be reproduced exactly, as the proxy vendor and settings are unknown. However, the steps taken were:

  1. Log in to quay.io or docker.io.

  2. Run one of the commands:

     # podman pull quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895
     # podman pull docker.io/borazem/boa-nodejsexpress

Describe the results you received:

Example 1 (quay.io):

# podman pull quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895

Trying to pull quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895...
 Manifest does not match provided manifest digest sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895
Error: error pulling image "quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895": unable to pull quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895: unable to pull image: Error determining manifest MIME type for docker://quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895: Manifest does not match provided manifest digest sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895

Example 2 (docker.io):

# podman pull docker.io/borazem/boa-nodejsexpress

Trying to pull docker.io/borazem/boa-nodejsexpress...
 invalid character '<' looking for beginning of value
Error: error pulling image "docker.io/borazem/boa-nodejsexpress": unable to pull docker.io/borazem/boa-nodejsexpress: unable to pull image: Error initializing image from source docker://borazem/boa-nodejsexpress:latest: invalid character '<' looking for beginning of value

Describe the results you expected:

I would expect the images to download.

Additional information you deem important (e.g. issue happens only occasionally):


Output of podman version:

[core@bmceocpb0 ~]$ podman version
Version:      3.0.2-dev
API Version:  3.0.0
Go Version:   go1.15.13
Built:        Tue Jun  8 07:52:06 2021
OS/Arch:      linux/amd64

Output of podman info --debug:

(paste your output here)

Package info (e.g. output of rpm -q podman or apt list podman):

podman-3.0.1-7.module+el8.4.0+11311+9da8acfb.x86_64

Have you tested with the latest version of Podman and have you checked the Podman Troubleshooting Guide? (https://github.com/containers/podman/blob/master/troubleshooting.md)

Yes, I checked the Podman Troubleshooting Guide.
No, I am not sure whether the podman in use is the latest version.

Additional environment details (AWS, VirtualBox, physical, etc.):

Environment: VMware vSphere.

mheon commented 2 years ago

Manifest does not match provided manifest digest sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895

Interesting.

@mtrmac @vrothberg PTAL

mtrmac commented 2 years ago

Is there any list of requirements stating which header fields must not be altered for podman to work?

The protocol is documented at https://github.com/distribution/distribution/blob/main/docs/spec/api.md .

Describe the results you received:

Example 1 (quay.io):

# podman pull quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895

Trying to pull quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895...
 Manifest does not match provided manifest digest sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895
Error: error pulling image "quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895": unable to pull quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895: unable to pull image: Error determining manifest MIME type for docker://quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895: Manifest does not match provided manifest digest sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895

To directly go after the error message, it would be necessary to capture the HTTP response (including the body), maybe using Wireshark, and then try re-computing the digest (or just read the response if it is clearly not an image manifest).
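Once the response body has been captured to a file, re-computing the digest could look roughly like the sketch below. It assumes the raw, unmodified body bytes were saved (for example exported from the capture); manifest_digest and digest_matches are hypothetical helper names, not part of podman.

```shell
# Hypothetical helpers for checking a captured manifest body against a digest.

# manifest_digest FILE -> prints "sha256:<hex>" for the bytes in FILE
manifest_digest() {
    # sha256sum prints "<hex>  -" when reading stdin; keep only the hex part.
    printf 'sha256:%s\n' "$(sha256sum < "$1" | cut -d' ' -f1)"
}

# digest_matches FILE EXPECTED -> succeeds only if FILE hashes to EXPECTED
digest_matches() {
    [ "$(manifest_digest "$1")" = "$2" ]
}
```

If digest_matches fails for the digest used in the pull, the body that arrived is not the manifest the registry published, and opening the file should show what was delivered instead.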

(I can, at least, confirm that pulling that image directly from Quay does work.)

This could happen for quite a few reasons: the proxy modifying the contents for whatever reason, corrupt storage on a transparent cache, or something like that.

Example 2 (docker.io):

# podman pull docker.io/borazem/boa-nodejsexpress

Trying to pull docker.io/borazem/boa-nodejsexpress...
 invalid character '<' looking for beginning of value
Error: error pulling image "docker.io/borazem/boa-nodejsexpress": unable to pull docker.io/borazem/boa-nodejsexpress: unable to pull image: Error initializing image from source docker://borazem/boa-nodejsexpress:latest: invalid character '<' looking for beginning of value

This is more suggestive: I'd pretty much bet on that proxy injecting some HTML (either an error page or a login form) instead of just forwarding the request/response as we'd assume.
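A quick way to test that guess on a saved response body is to look at its first non-whitespace character, since the "invalid character '<'" message is what a JSON parser reports when it is handed markup. A minimal sketch, where body_kind is a hypothetical helper name:

```shell
# body_kind FILE -> prints "html", "json" or "unknown" based on the first
# non-whitespace character of FILE. A rough heuristic, not a parser.
body_kind() {
    first=$(tr -d ' \t\r\n' < "$1" | head -c 1)
    case "$first" in
        '<') echo html ;;
        '{'|'[') echo json ;;
        *) echo unknown ;;
    esac
}
```

A result of html for a URL that should return a manifest would point at the proxy (or something else in the path) answering in place of the registry.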

The debug log lists the HTTP verbs and URLs involved; for starters, try accessing the last one (the one that ends with …/manifests/latest) with something like curl and see what the response says.

That is not certain to work (notably, a plain curl request does not carry the right authentication headers), so it might be necessary to fall back to the Wireshark approach, but as a cheap first triage step it might well reveal something useful.

borazem commented 2 years ago

Thank you very much, @mtrmac. I will try to get some further information on that.

borazem commented 2 years ago

@mtrmac, I reviewed the suggested protocol and experimented with Postman and curl. With the snippet below I can successfully get the tag list for https://hub.docker.com/v2/repositories/bodemo/boa-nodejsexpress/tags (the encoded authorization string is fake):

curl --location --request GET 'https://hub.docker.com/v2/repositories/bodemo/boa-nodejsexpress/tags' \
--header 'Host: hub.docker.com' \
--header 'Authorization: Basic Ym9kZW1vDjhqVXYuKipPUB5CR1IzYw=='
{"count":1,"next":null,"previous":null,"results":[{"creator":14134331,"id":173548815,"image_id":null,"images":[{"architecture":"amd64","features":"","variant":null,"digest":"sha256:d6d003abdc49882a9fdd34a5850b60d68bebbea6cce7df9a72010d257cd6a4d5","os":"linux","os_features":"","os_version":null,"size":76764279,"status":"active","last_pulled":"2021-10-22T13:51:31.535928Z","last_pushed":"2021-10-22T13:51:31.116002Z"}],"last_updated":"2021-10-22T13:51:31.116002Z","last_updater":14134331,"last_updater_username":"bodemo","name":"latest","repository":15660498,"full_size":76764279,"v2":true,"tag_status":"active","tag_last_pulled":"2021-10-22T13:51:31.535928Z","tag_last_pushed":"2021-10-22T13:51:31.116002Z"}]}

However, I cannot get any information about the manifests or blobs, for which the request should look like:

GET /v2/<name>/manifests/<reference>
Host: <registry host>
Authorization: <scheme> <token>

so I tried

curl --location --request GET 'https://hub.docker.com/v2/repositories/bodemo/boa-nodejsexpress/manifests' \
--header 'Host: hub.docker.com' \
--header 'Authorization: Basic Ym9kZW1vDjhqVXYuKipPUB5CR1IzYw=='

but only get <h1>Not Found</h1><p>The requested URL was not found on this server.</p>

Any hints? Would the curl command with the -vvv switch help identify altered or removed HTTP headers? I guess not without other working commands, such as getting manifests and blobs, since those return additional HTTP headers like the media type, which seems to be the current issue here:

Trying to pull quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895...
  Manifest does not match provided manifest digest sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895
Error: error pulling image "quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895": unable to pull quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895: unable to pull image: Error determining manifest MIME type for docker://quay.io/openshift-release-dev/ocp-release@sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895: Manifest does not match provided manifest digest sha256:5f57680f3fc9632b0e6db4a9a8804d347108e96c38bc769b03e4158576035895

curl command with -vvv

root@cdautil:~ $ curl --location --request GET 'https://hub.docker.com/v2/repositories/bodemo/boa-nodejsexpress/tags' \
> --header 'Host: hub.docker.com' \
> --header 'Authorization: Basic Ym9kZW1vDjhqVXYuKipPUB5CR1IzYw==' -vvv
Note: Unnecessary use of -X or --request, GET is already inferred.
* Uses proxy env variable no_proxy == '127.0.0.1,localhost,cda.internal,.cda.internal,apps.cda.internal,.apps.cda.internal,.benoibm.com,benoibm.com,.bmocp.benoibm.com,bmocp.benoibm.com,.apps.bmocp.benoibm.com,apps.bmocp.benoibm.com'
*   Trying 3.217.79.149...
* TCP_NODELAY set
* Connected to hub.docker.com (3.217.79.149) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
*   CAfile: /etc/pki/tls/certs/ca-bundle.crt
  CApath: none
* TLSv1.3 (OUT), TLS handshake, Client hello (1):
* TLSv1.3 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server did not agree to a protocol
* Server certificate:
*  subject: CN=*.docker.com
*  start date: Jun 28 00:00:00 2021 GMT
*  expire date: Jul 27 23:59:59 2022 GMT
*  subjectAltName: host "hub.docker.com" matched cert's "*.docker.com"
*  issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
*  SSL certificate verify ok.
> GET /v2/repositories/bodemo/boa-nodejsexpress/tags HTTP/1.1
> Host: hub.docker.com
> User-Agent: curl/7.61.1
> Accept: */*
> Authorization: Basic Ym9kZW1vDjhqVXYuKipPUB5CR1IzYw==
> 
< HTTP/1.1 200 OK
< date: Fri, 22 Oct 2021 16:33:50 GMT
< content-type: application/json
< content-length: 711
< x-ratelimit-limit: 180
< x-ratelimit-reset: 1634920490
< x-ratelimit-remaining: 180
< server: nginx
< x-frame-options: deny
< x-content-type-options: nosniff
< x-xss-protection: 1; mode=block
< strict-transport-security: max-age=31536000
< 
{"count":1,"next":null,"previous":null,"results":[{"creator":14134331,"id":173548815,"image_id":null,"images":[{"architecture":"amd64","features":"","variant":null,"digest":"sha256:d6d003abdc49882a9fdd34a5850b60d68bebbea6cce7df9a72010d257cd6a4d5","os":"linux","os_features":"","os_version":null,"size":76764279,"status":"active","last_pulled":"2021-10-22T13:51:31.535928Z","last_pushed":"2021-10-22T13:51:31.116002Z"}],"last_updated":"2021-10-22T13:51:31.116002Z","last_updater":14134331,"last_updater_username":"bodemo","name":"latest","repository":15660498,"full_size":76764279,"v2":true,"tag_status":"active","tag_last_pulled":"2021-10-22T13:51:31.535928Z","tag_last_pushed":"2021-10-22T13:51:31.116002Z"}]}
* Connection #0 to host hub.docker.com left intact

mtrmac commented 2 years ago

curl --location --request GET 'https://hub.docker.com/v2/repositories/bodemo/boa-nodejsexpress/manifests' \

That’s not the right path, per the protocol; it’s not even the right hostname.

The --debug log quoted in the original report shows the relevant URL.


Would the curl command with the -vvv switch help identify altered or removed HTTP headers?

curl to a proxy has, in principle, no way to tell what the proxy does with the received requests (i.e. whether it sends them anywhere at all, and if so, how they are modified).

borazem commented 2 years ago

Thank you @mtrmac,

In regards to

Would the curl command with the -vvv switch help identify altered or removed HTTP headers?

I expressed myself incorrectly. I meant: if I run that curl command without the proxy and then with the proxy and compare the results, will it be clear what has been changed by the proxy?

In regards to the wrong path: I think I have it now, with some hints from https://gist.github.com/alexanderilyin/8cf68f85b922a7f1757ae3a74640d48a. If I run podman pull docker.io/bodemo/boa-nodejsexpress --log-level debug, I get the right hostnames. I also see that basic authentication is enough for querying tags but not for getting manifests and blobs, since that content requires authorization (a bearer token) in addition to authentication.
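A minimal sketch of that two-step flow (request a token, then present it to the registry), assuming the auth.docker.io and registry-1.docker.io hosts that are exported further down in this comment. The function names token_url, manifest_url and fetch_manifest are hypothetical helpers, and the anonymous token request shown here is assumed to be enough only for public repositories; a private one would need credentials passed with curl -u.

```shell
# Hypothetical helpers; the host names are the ones used later in this comment.
AUTH_DOMAIN="auth.docker.io"
AUTH_SERVICE="registry.docker.io"
API_DOMAIN="registry-1.docker.io"

# token_url REPO -> URL that issues a pull-scoped bearer token for REPO
token_url() {
    printf 'https://%s/token?service=%s&scope=repository:%s:pull\n' \
        "$AUTH_DOMAIN" "$AUTH_SERVICE" "$1"
}

# manifest_url REPO REFERENCE -> registry URL of the manifest (tag or digest)
manifest_url() {
    printf 'https://%s/v2/%s/manifests/%s\n' "$API_DOMAIN" "$1" "$2"
}

# fetch_manifest REPO REFERENCE -> prints the manifest body; needs curl and jq.
fetch_manifest() {
    token=$(curl -fsS "$(token_url "$1")" | jq -r '.token') || return 1
    curl -fsS \
        -H "Authorization: Bearer ${token}" \
        -H "Accept: application/vnd.docker.distribution.manifest.v2+json" \
        "$(manifest_url "$1" "$2")"
}
```

For example, fetch_manifest bodemo/boa-nodejsexpress latest | jq would fetch the token once and reuse it for the manifest request.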

so, for querying the manifests I would need to:

export DOCKER_HUB_ORG=bodemo
export DOCKER_HUB_REPO=boa-nodejsexpress
export DOCKER_HUB_USER=bodemo
export DOCKER_HUB_PASSWORD='sfasdfasf'
export DOCKER_HUB_HOST="hub.docker.com"

and run the command

curl --location --request GET "https://${DOCKER_HUB_HOST}/v2/repositories/${DOCKER_HUB_ORG}/${DOCKER_HUB_REPO}/tags" \
--header "Host: ${DOCKER_HUB_HOST}" \
--header "Authorization: Basic $(printf '%s' "${DOCKER_HUB_USER}:${DOCKER_HUB_PASSWORD}" | base64 -w0)" -vvv | jq

With the beautified result I can find the image digest:

"images": [
{
    "architecture": "amd64",
    "features": "",
    "variant": null,
    "digest": "sha256:d6d003abdc49882a9fdd34a5850b60d68bebbea6cce7df9a72010d257cd6a4d5",

With the image digest in hand, I export the following additional variables; the last one is the digest obtained in the previous step.

export AUTH_DOMAIN="auth.docker.io"
export AUTH_SERVICE="registry.docker.io"
export AUTH_SCOPE="repository:${DOCKER_HUB_ORG}/${DOCKER_HUB_REPO}:pull"
export AUTH_OFFLINE_TOKEN="1"
export AUTH_CLIENT_ID="shell"
export API_DOMAIN="registry-1.docker.io"
export IMAGE_DIGEST="sha256:d6d003abdc49882a9fdd34a5850b60d68bebbea6cce7df9a72010d257cd6a4d5"

I can now run the following command:

curl --location --request GET "https://${API_DOMAIN}/v2/${DOCKER_HUB_ORG}/${DOCKER_HUB_REPO}/manifests/${IMAGE_DIGEST}/" \
--header "Authorization: Bearer $(curl -u ${DOCKER_HUB_USER}:${DOCKER_HUB_PASSWORD} "https://${AUTH_DOMAIN}/token?service=${AUTH_SERVICE}&scope=${AUTH_SCOPE}"|jq -r '.token')" -vvv |jq

In the beautified JSON part of the response I can see the config digest as well as the layer digests:

{
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "config": {
        "mediaType": "application/vnd.docker.container.image.v1+json",
        "size": 11082,
        "digest": "sha256:73303f37c24204ba82edd00b9d46f6603d3cc83ca03849d752fffe0c55aebf2e"
    },
    "layers": [
        {
            "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
            "size": 22524572,
            "digest": "sha256:d599a449871ee73b960e80d176b989365dfecb4c8f337bf21e8853862403ee9b"
        },
        {
            "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
            "size": 4148,
            "digest": "sha256:44394e7946d13db960c03fa2fd77aff18aab149c2175a019f4224438506ad266"
        },

If I then export the config digest

export CONFIG_DIGEST="sha256:73303f37c24204ba82edd00b9d46f6603d3cc83ca03849d752fffe0c55aebf2e"

I can get the image config information from the blob store with the following command:

curl --location --request GET "https://${API_DOMAIN}/v2/${DOCKER_HUB_ORG}/${DOCKER_HUB_REPO}/blobs/${CONFIG_DIGEST}/" \
--header "Authorization: Bearer $(curl -u ${DOCKER_HUB_USER}:${DOCKER_HUB_PASSWORD} "https://${AUTH_DOMAIN}/token?service=${AUTH_SERVICE}&scope=${AUTH_SCOPE}"|jq -r '.token')" -vvv

Similarly, if I export variables with the layer digest and a file name, I can save the layer into a local file with the commands below:

export LAYER_DIGEST="sha256:d599a449871ee73b960e80d176b989365dfecb4c8f337bf21e8853862403ee9b"
export LAYER_FILE="test.layer"
curl --location --request GET "https://${API_DOMAIN}/v2/${DOCKER_HUB_ORG}/${DOCKER_HUB_REPO}/blobs/${LAYER_DIGEST}/" \
--header "Authorization: Bearer $(curl -u ${DOCKER_HUB_USER}:${DOCKER_HUB_PASSWORD} "https://${AUTH_DOMAIN}/token?service=${AUTH_SERVICE}&scope=${AUTH_SCOPE}"|jq -r '.token')" -vvv -o ${LAYER_FILE}

Comparing the size of the downloaded file with the size listed for that layer in the manifest shows that they are identical:

-rw-r--r--.  1 root root 22524572 Oct 25 19:26 test.layer

"layers": [
    {
        "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
        "size": 22524572,
        "digest": "sha256:d599a449871ee73b960e80d176b989365dfecb4c8f337bf21e8853862403ee9b"
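A matching size alone would not catch content that was altered without changing its length, so checking the digest as well is probably the stronger test. A minimal sketch, where verify_blob is a hypothetical helper name:

```shell
# verify_blob FILE SIZE DIGEST -> succeeds only if FILE has exactly SIZE bytes
# and its SHA-256 equals DIGEST (given in the "sha256:<hex>" form).
verify_blob() {
    actual_size=$(wc -c < "$1" | tr -d ' ')
    actual_digest="sha256:$(sha256sum < "$1" | cut -d' ' -f1)"
    [ "$actual_size" = "$2" ] && [ "$actual_digest" = "$3" ]
}
```

For the layer above this would be called as verify_blob "${LAYER_FILE}" 22524572 "${LAYER_DIGEST}".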

With all these -vvv outputs I hope I have everything ready to run the same commands behind the proxy, compare the results, and see whether something was altered or omitted.
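One way such a comparison might be set up is to save the response headers of each run with curl -D and diff them after normalizing. This is only a sketch: normalize_headers and same_headers are hypothetical helper names, the list of ignored headers (date and the x-ratelimit-* family) is an assumption, and it can only cover what comes back to the client, not what the proxy sent on to the registry.

```shell
# normalize_headers FILE -> header lines from a `curl -D FILE` dump with names
# lowercased, CRs stripped, volatile headers dropped, sorted for comparison.
normalize_headers() {
    tr -d '\r' < "$1" \
        | grep ':' \
        | awk -F': *' '{ name = tolower($1); sub(/^[^:]*: */, ""); print name ": " $0 }' \
        | grep -v -E '^(date|x-ratelimit-[a-z]+):' \
        | sort
}

# same_headers FILE_A FILE_B -> succeeds if both dumps normalize identically
same_headers() {
    [ "$(normalize_headers "$1")" = "$(normalize_headers "$2")" ]
}
```

The two dumps could come from the same request run once with --noproxy '*' and once through the proxy, each with -D pointing at a different file.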

Would that be it?

And thank you very much for your help

mtrmac commented 2 years ago

I meant: if I run that curl command without the proxy and then with the proxy and compare the results, will it be clear what has been changed by the proxy?

I don’t see how that could be possible, nor how that question differs from the question I answered.

borazem commented 2 years ago

OK then, but thanks anyway. :-)

rhatdan commented 2 years ago

Where are we with this, is this still an issue or can it be closed?

borazem commented 2 years ago

We can say for sure that it is not a bug. @mtrmac shared the information about the Docker registry protocol, so it is possible to determine which HTTP headers are needed. I tried to gather further information through the curl steps above, which helped me identify what HTTP content is exchanged during some of the operations.

However, it would be nice if the debug log provided the kind of additional information you can get with curl -vvv. That would be more of an enhancement request, though. What do you think about adding further debugging info?

But in any case we can close the issue.

mtrmac commented 2 years ago

However, it would be nice if the debug log provided the kind of additional information you can get with curl -vvv.

Eventually we’ll hopefully get https://github.com/containers/image/pull/201 over the finish line. Of course that would still not be able to show how the proxy modified the request when passing it on, if it did.