kubernetes/kubernetes

API proxy strips URL query params when proxying websocket connections #77142

Open · thediveo opened this issue 5 years ago

thediveo commented 5 years ago

Tested on Kubernetes 1.14 (minikube 1.0). To reproduce:

  1. deploy a pod with a websocket server running:

    kubectl run wsserver --generator=run-pod/v1 --rm -i --tty --image ubuntu:disco -- bash -c \
      "apt-get update && apt-get install -y wget && \
      wget https://github.com/vi/websocat/releases/download/v1.4.0/websocat_1.4.0_ssl1.1_amd64.deb && \
      dpkg -i webso*.deb && \
      websocat -vv -s 0.0.0.0:8000"
  2. in a host shell outside minikube, run the following websocat command to connect to the websocket server in your wsserver pod:

    websocat --binary --ws-c-uri=wss://192.168.99.100:8443/api/v1/namespaces/default/pods/wsserver:8000/proxy/test?foo=what - ws-c:cmd:'socat - ssl:192.168.99.100:8443,verify=0,cafile=$HOME/.minikube/ca.crt,cert=$HOME/.minikube/client.crt,key=$HOME/.minikube/client.key'
  3. check the output of the server-side websocat and notice that the query params are missing: [DEBUG websocat::ws_server_peer] Incoming { version: Http11, subject: (Get, AbsolutePath("/test")), headers: ...

  4. exec into the pod running the websocket server and issue a local websocket connection:

    kubectl exec wsserver -- websocat 'ws://localhost:8000/test?foo=what'
  5. check the output of the server-side websocat; now the query params are present, as expected: [DEBUG websocat::ws_server_peer] Incoming { version: Http11, subject: (Get, AbsolutePath("/test?foo=what")), headers: ...

(Note: the instructions have since been updated to be much simpler; see also the self-contained Go probe sketched below.)
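If you'd rather not depend on websocat, a self-contained diagnostic server works just as well. Below is a minimal sketch in Go, assuming the third-party github.com/gorilla/websocket package (the port and logging are illustrative, matching the steps above, and not part of any Kubernetes API). Run it in the pod in place of websocat and watch which query string actually arrives with the upgrade request; through the API proxy it shows up empty.

    // probe.go: a minimal websocket endpoint that logs exactly what the
    // proxy forwards. Sketch only; assumes github.com/gorilla/websocket.
    package main

    import (
        "log"
        "net/http"

        "github.com/gorilla/websocket"
    )

    var upgrader = websocket.Upgrader{
        // Diagnostic server only: accept upgrades from any origin.
        CheckOrigin: func(r *http.Request) bool { return true },
    }

    func main() {
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            // Observation point: via the API proxy, the query arrives empty.
            log.Printf("path=%q query=%q", r.URL.Path, r.URL.RawQuery)
            if c, err := upgrader.Upgrade(w, r, nil); err == nil {
                c.Close()
            }
        })
        log.Fatal(http.ListenAndServe(":8000", nil))
    }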

thediveo commented 5 years ago

/sig api-machinery

fedebongio commented 5 years ago

/assign @cheftako

fejta-bot commented 5 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle stale

fejta-bot commented 5 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.

/lifecycle rotten

thediveo commented 5 years ago

/remove-lifecycle rotten

fejta-bot commented 4 years ago

/lifecycle stale

thediveo commented 4 years ago

/remove-lifecycle stale

This is still an issue even with 1.16.

fejta-bot commented 4 years ago

/lifecycle stale

thediveo commented 4 years ago

/remove-lifecycle stale

Still an issue.

fejta-bot commented 4 years ago

/lifecycle stale

thediveo commented 4 years ago

/remove-lifecycle stale

tarokkk commented 4 years ago

Is there anything we can do to help with this issue?

fejta-bot commented 4 years ago

/lifecycle stale

thediveo commented 4 years ago

/remove-lifecycle stale

fejta-bot commented 3 years ago

/lifecycle stale

thediveo commented 3 years ago

/remove-lifecycle stale

fejta-bot commented 3 years ago

/lifecycle stale

thediveo commented 3 years ago

/remove-lifecycle stale

kwiesmueller commented 3 years ago

This is still an issue, and I ran into it as well today. @cheftako, you're assigned to this; are you aware of it, or working on it?

fejta-bot commented 3 years ago

/lifecycle stale

thediveo commented 3 years ago

/remove-lifecycle stale

k8s-triage-robot commented 3 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

thediveo commented 3 years ago

/remove-lifecycle stale

mertyildiran commented 3 years ago

@TheDiveO I'm having the exact same issue. I reproduced it in isolation with this repository. My private image shows the same issue with the exact same proxy code.

The query string disappears within the tryUpgrade method.

I'm fairly certain that the Connection: Upgrade header is causing the query string to disappear.
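For reference, here is a condensed, runnable paraphrase of how tryUpgrade in k8s.io/apimachinery/pkg/util/proxy/upgradeaware.go chooses the backend location (a sketch, not the verbatim source; details vary by Kubernetes version). The point to notice: req.URL.RawQuery is never copied over.

    package main

    import (
        "fmt"
        "net/http"
        "net/url"
        "strings"
    )

    // upgradeAwareHandler mirrors just the fields of the real
    // UpgradeAwareHandler that matter here.
    type upgradeAwareHandler struct {
        Location           *url.URL // preconfigured backend URL, query-less
        UseRequestLocation bool
        AppendLocationPath bool
    }

    func singleJoiningSlash(a, b string) string {
        return strings.TrimSuffix(a, "/") + "/" + strings.TrimPrefix(b, "/")
    }

    // dialLocation paraphrases the location handling at the top of tryUpgrade.
    func (h *upgradeAwareHandler) dialLocation(req *http.Request) *url.URL {
        location := *h.Location
        if h.UseRequestLocation {
            location = *req.URL
            location.Scheme = h.Location.Scheme
            location.Host = h.Location.Host
            if h.AppendLocationPath {
                location.Path = singleJoiningSlash(h.Location.Path, location.Path)
            }
        }
        // Missing here: location.RawQuery = req.URL.RawQuery. With
        // UseRequestLocation=false, the incoming ?foo=what never
        // reaches the backend.
        return &location
    }

    func main() {
        backend, _ := url.Parse("ws://10.1.2.3:8000/test")
        req, _ := http.NewRequest("GET", "http://apiserver/proxy/test?foo=what", nil)
        h := &upgradeAwareHandler{Location: backend}
        fmt.Println(h.dialLocation(req)) // ws://10.1.2.3:8000/test -- query gone
    }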

k8s-triage-robot commented 2 years ago

/lifecycle stale

thediveo commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 2 years ago

/lifecycle stale

thediveo commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 2 years ago

/lifecycle stale

thediveo commented 2 years ago

/remove-lifecycle stale

k8s-triage-robot commented 2 years ago

/lifecycle stale

thediveo commented 2 years ago

/remove-lifecycle stale

aojea commented 2 years ago

Is this still present in current stable versions? (See https://github.com/kubernetes/kubernetes/issues/105105#issuecomment-922365380.) Can you reproduce it at current HEAD?

thediveo commented 2 years ago

@aojea It is still present in k8s v1.25.0 (deployed through minikube). The query params still get stripped.

Do you have a concrete reason to believe that this was somehow fixed after v1.25.0, without referencing this open issue?

thediveo commented 2 years ago

To reproduce with a recent version of Minikube/k8s/Ubuntu, here are the updated instructions:

    kubectl run wsserver --rm -i --tty --image ubuntu:jammy -- bash -c \
      "apt-get update && apt-get install -y wget && \
      wget https://github.com/vi/websocat/releases/download/v1.9.0/websocat_linux64 && \
      chmod u+x ./websocat_linux64 && ./websocat_linux64 -vv -s 0.0.0.0:8000"

See how the websocat server doesn't get the query params passed:

    ./websocat --binary --ws-c-uri=wss://192.168.49.2:8443/api/v1/namespaces/default/pods/wsserver:8000/proxy/test?foo=what - ws-c:cmd:'socat - ssl:192.168.49.2:8443,verify=0,cafile=$HOME/.minikube/ca.crt,cert=$HOME/.minikube/profiles/minikube/client.crt,key=$HOME/.minikube/profiles/minikube/client.key'

k8s-triage-robot commented 1 year ago

/lifecycle stale

thediveo commented 1 year ago

/remove-lifecycle stale

k8s-triage-robot commented 1 year ago

/lifecycle stale

thediveo commented 1 year ago

/remove-lifecycle stale

k8s-triage-robot commented 1 year ago

/lifecycle stale

thediveo commented 1 year ago

/remove-lifecycle stale

Seriously, this bug is not going to vanish into thin air just because some bot bots around.

k8s-triage-robot commented 8 months ago

/lifecycle stale

thediveo commented 8 months ago

/remove-lifecycle stale

k8s-triage-robot commented 5 months ago

/lifecycle stale

thediveo commented 5 months ago

/remove-lifecycle stale

aojea commented 5 months ago

One naive question: why are the query params relevant if, AFAIK, they are not used anywhere?

thediveo commented 5 months ago

Finally I can give more background information, since Siemens has opened up its Edgeshark project for capturing container traffic from stand-alone container hosts as well as k8s nodes.

Captured packets are streamed via a websocket pcapng stream. In order to establish the correct stream, some parameters need to be passed to our packet capture streaming service ("packetflix") on the host where the target container is located. In the case of k8s, the plugin connecting Wireshark with the node's streaming service goes through the k8s API, using its built-in proxy functionality.

And as we found out some years ago and reported here, the k8s API proxy verb strips URL query params instead of correctly forwarding them to the service.

aojea commented 5 months ago

OK, so this only happens with websocket URLs 🤔; normal requests work fine:

    kubectl run webapp --command --image=registry.k8s.io/e2e-test-images/agnhost:2.47 -- /agnhost netexec --http-port=8080

    kubectl get --raw "/api/v1/namespaces/default/pods/webapp:8080/proxy/echo?msg=echoed_msg" -v7
    I0422 13:17:39.333422 3767678 loader.go:373] Config loaded from file:  /usr/local/google/home/aojea/.kube/config
    I0422 13:17:39.333989 3767678 round_trippers.go:463] GET https://127.0.0.1:34563/api/v1/namespaces/default/pods/webapp:8080/proxy/echo?msg=echoed_msg
    I0422 13:17:39.334006 3767678 round_trippers.go:469] Request Headers:
    I0422 13:17:39.334015 3767678 round_trippers.go:473]     Accept: application/json, */*
    I0422 13:17:39.334023 3767678 round_trippers.go:473]     User-Agent: kubectl/v1.26.15 (linux/amd64) kubernetes/aae4c29
    I0422 13:17:39.336562 3767678 round_trippers.go:574] Response Status: 200 OK in 2 milliseconds
    echoed_msg

Based on https://github.com/kubernetes/kubernetes/pull/120290, it seems the query gets lost during the upgrade.

Is the apiserver doing the upgrade to websockets, or is it passing it through?

cc: @liggitt @seans3

liggitt commented 5 months ago

The proxy handlers are set up here:

Those only preserve the path from the incoming request, construct the target location within the kubelet handler, and construct an UpgradeAwareHandler with AppendLocationPath and UseRequestLocation set to false.

Handling for non-upgrade requests looks like this and explicitly copies the RawQuery from the request:

Handling for upgrade requests looks like this and does not use any part of the request location unless UseRequestLocation is true:

That's why query params get dropped for upgraded requests when proxying through the API server.
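For contrast, a paraphrase of the non-upgrade branch (reusing the sketch type from the comment further up; again not the verbatim source): a single assignment preserves the query, and that assignment has no counterpart in tryUpgrade.

    // Paraphrase of the non-upgrade path in ServeHTTP: the backend location
    // is cloned and the incoming query is copied over explicitly.
    func (h *upgradeAwareHandler) serveLocation(req *http.Request) *url.URL {
        loc := *h.Location
        loc.RawQuery = req.URL.RawQuery // the line the upgrade path lacks
        return &loc
    }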

liggitt commented 5 months ago

So... now that we know why the issue occurs ... what can we do about it?

It would be easy enough to just modify the upgradeaware.go upgrade-handling path to copy over RawQuery the same way, but we'd have to be exceptionally careful about the following things:

Another possibility would be to copy over the query from the request to the location before constructing the upgradeawareproxy (in the core//rest/ handlers). That appeals to me more, since we can reason about those and know none of them are setting query params, and we'd avoid changing low-level upgradeaware.go behavior for all callers.
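A minimal sketch of that second option, assuming a handler that already has the incoming request and the computed backend location in hand (serveProxy and its parameter names are hypothetical; NewUpgradeAwareHandler and ErrorResponder are the real names in k8s.io/apimachinery/pkg/util/proxy):

    import (
        "net/http"
        "net/url"

        "k8s.io/apimachinery/pkg/util/proxy"
    )

    // serveProxy (hypothetical) stamps the query onto the already-computed
    // backend location before the UpgradeAwareHandler is built, so both the
    // plain-HTTP and the upgrade path forward it to the backend.
    func serveProxy(w http.ResponseWriter, req *http.Request, location *url.URL,
        transport http.RoundTripper, responder proxy.ErrorResponder) {
        location.RawQuery = req.URL.RawQuery // e.g. "foo=what"
        handler := proxy.NewUpgradeAwareHandler(location, transport, false, false, responder)
        handler.ServeHTTP(w, req)
    }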