rancher / rke

Rancher Kubernetes Engine (RKE) is an extremely simple, lightning-fast Kubernetes distribution that runs entirely within containers.
Apache License 2.0

Option to Disable Kube-proxy Deploy #1432

Closed: bensallen closed this issue 2 years ago

bensallen commented 5 years ago

Requesting a configuration option to disable deploying kube-proxy. This would allow alternative options to be configured.

An example is kube-router's built-in service proxy functionality which makes use of ipvs: https://www.kube-router.io/docs/how-it-works/#service-proxy-and-load-balancing.

Ref in code: https://github.com/rancher/rke/blob/master/cluster/plan.go#L68

Related to rancher/rke#312.
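
For illustration only, a hypothetical cluster.yml sketch of what such an option could look like; the "disabled" field below is invented for this request and is not an existing RKE setting.

services:
  kubeproxy:
    # hypothetical flag, not implemented in RKE; shown only to illustrate the request
    disabled: true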

travisghansen commented 4 years ago

I'm also interested in this for giving cilium a try.

iwilltry42 commented 4 years ago

@travisghansen https://github.com/rancher/rke/pull/1831 I did the same thing for cilium as well. I didn't push a PR, though, because it builds on top of the one I linked.

travisghansen commented 4 years ago

@iwilltry42 awesome! How did cilium go?

iwilltry42 commented 4 years ago

@travisghansen both kube-router and cilium worked just fine after being deployed with RKE. We went with kube-router in the end, because we're running on an old kernel version which didn't play well with all the cilium features.

travisghansen commented 4 years ago

Just a follow-up for anyone following along. I wanted to use/trial all the fancy features of cilium, like replacing kube-proxy and using DSR mode, etc. I also wanted to use rke, so I approached the situation slightly differently: I ended up creating a wrapper kube-proxy script, copying it to all host nodes, and bind-mounting it into the container. Alternatively, a 'noop' image could be used to completely replace/override the kube-proxy container image. In either case, rke runs health checks against kube-proxy, so the replacement needs to answer them in some shape or form until this ticket is implemented.

cluster.yml

services:
  kubeproxy:
    # image:
    extra_env:
      - "KUBE_PROXY_NOOP=1"
      - "PATH=/opt/custom/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
    extra_binds:
      - "/opt/custom/bin:/opt/custom/bin"

kube-proxy script as placed in /opt/custom/bin

#!/bin/bash

export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin

# exit cleanly when the container is stopped (note: SIGKILL cannot be trapped)
trap "exit 0" SIGTERM SIGINT SIGHUP SIGQUIT

if [[ "${KUBE_PROXY_NOOP:-0}" -eq 1 ]]; then
  # install netcat for the fake healthz listener
  clean-install netcat

  # mimic the kube-proxy healthz server on its default port (10256)
  while true; do
    DATE="$(date -u "+%F %T.%N %z %Z m=+")$(perl -w -MTime::HiRes=clock_gettime,CLOCK_MONOTONIC -E 'say clock_gettime(CLOCK_MONOTONIC)')"
    CONTENT="{\"lastUpdated\": \"${DATE}\",\"currentTime\": \"${DATE}\"}"
    # serve a single HTTP 200 response per connection, then loop
    cat << EOF | perl -pe 'chomp if eof' | nc -s 127.0.0.1 -lp 10256 -q 1
HTTP/1.1 200 OK$(printf "\r")
Content-Type: application/json$(printf "\r")
X-Content-Type-Options: nosniff$(printf "\r")
Date: $(date -u)$(printf "\r")
Content-Length: $((${#CONTENT}+0))$(printf "\r")
$(printf "\r")
${CONTENT}
EOF

  done
else
  # not in noop mode: exec the real kube-proxy binary (/usr/local/bin/kube-proxy)
  exec kube-proxy "$@"
fi
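
For the 'noop' image alternative mentioned above, a rough cluster.yml sketch is shown below; the image name is a placeholder, and whatever image is used would still need to answer the kube-proxy healthz probe on port 10256 so rke's health checks pass.

services:
  kubeproxy:
    # placeholder image name; the replacement image must still serve
    # an HTTP 200 on 127.0.0.1:10256 or rke's health check will fail
    image: registry.example.com/noop-kube-proxy:latest
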
ulm0 commented 4 years ago

This should be more like "Add cilium as a CNI plugin", because cilium can run alongside kube-proxy as well. Sure, most of us want a kube-proxy-less k8s, but we could have different behaviors for the cilium deployment within the options block.

e.g.

...
network:
  plugin: cilium
  options:
    kube-proxy-replacement: "probe" # Deploys kube-proxy in ipvs mode along with cilium
    ...
...

...
network:
  plugin: cilium
  options:
    kube-proxy-replacement: "strict" # Deploys cilium only
    ...
...
travisghansen commented 4 years ago

@ulm0 I've been working with the cilium folks pretty heavily on the kube-proxy replacement. It would be tricky to have the cilium config (in the theoretical config you've provided) try and impact how kube-proxy is deployed with rke (cilium doesn't actually manage kube-proxy so the comment in your yaml would likely be confusing). But perhaps the rke folks would be interested in a bit deeper integration with cilium.

In any case certainly having the ability to disable kube-proxy regardless of cilium would be helpful. If you want to chat about the kube-proxy replacement with cilium hit me up on the cilium slack channel and I can tell you what bugs we've found and are fixing.

ulm0 commented 4 years ago

@travisghansen I was talking about a deeper integration of rke with cilium; that block of configuration was a cluster.yml showing how it could be deployed based on the configuration for the network plugin.

In any case certainly having the ability to disable kube-proxy regardless of cilium would be helpful

yup, pretty much

I'm using rke + cilium (with kube-proxy in ipvs mode) in some clusters now, and kube-proxy-less in local clusters.

Your wrapper looks pretty neat, will definitely give it a try.
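
For reference, the ipvs mode mentioned above is usually enabled in an RKE cluster.yml via kube-proxy extra_args, roughly like the sketch below (this assumes the required IPVS kernel modules are available on the nodes):

services:
  kubeproxy:
    extra_args:
      # run kube-proxy in ipvs mode instead of the default iptables mode
      proxy-mode: ipvs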

travisghansen commented 4 years ago

Yeah, I understood your comment and the desire. I just don’t see the rke team building out that kind of cross logic in the app, but perhaps they will. In any case it’s great to see more people using rke + cilium! I’m pretty excited to get a couple of these bugs worked out and then really put it through some stress testing.

stale[bot] commented 4 years ago

This issue/PR has been automatically marked as stale because it has not had activity (commit/comment/label) for 60 days. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

travisghansen commented 4 years ago

Not stale.

2muchgit commented 3 years ago

Are there any updates on this? I am also very interested in this option. For multi-cluster deployments it would be very nasty to deploy a kube-proxy hack.

gioppoluca commented 2 years ago

Any news on how to disable kube-proxy in favor of a "strict" cilium configuration?

github-actions[bot] commented 2 years ago

This repository uses an automated workflow to automatically label issues which have not had any activity (commit/comment/label) for 60 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the workflow can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the workflow will automatically close the issue in 14 days. Thank you for your contributions.

benley commented 2 years ago

(not stale)

github-actions[bot] commented 2 years ago

This repository uses an automated workflow to automatically label issues which have not had any activity (commit/comment/label) for 60 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the workflow can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the workflow will automatically close the issue in 14 days. Thank you for your contributions.

benley commented 2 years ago

still not stale

github-actions[bot] commented 2 years ago

This repository uses an automated workflow to automatically label issues which have not had any activity (commit/comment/label) for 60 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the workflow can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the workflow will automatically close the issue in 14 days. Thank you for your contributions.

benley commented 2 years ago

Is this actually implemented, or did it just get closed as stale?

PhilipSchmid commented 1 year ago

FYI: If you are a Cilium user who would like to use its Kube-Proxy-Replacement (KPR) mode configured to strict (Helm value: kubeProxyReplacement: strict), there's no urgent need to remove Kube-Proxy from your RKE1 cluster. Cilium KPR (strict) can co-exist with Kube-Proxy on the same cluster and simply overrules all Kube-Proxy functionality. However, please be aware that this could lead to potential communication issues on existing nodes which already had workloads running on them, as both mechanisms operate independently of each other. Nevertheless, a node restart should be enough to solve this. Statement from the docs:

Careful: When deploying the eBPF kube-proxy replacement under co-existence with kube-proxy on the system, be aware that both mechanisms operate independent of each other. Meaning, if the eBPF kube-proxy replacement is added or removed on an already running cluster in order to delegate operation from respectively back to kube-proxy, then it must be expected that existing connections will break since, for example, both NAT tables are not aware of each other. If deployed in co-existence on a newly spawned up node/cluster which does not yet serve user traffic, then this is not an issue.

Regards, Philip
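
For reference, the Helm values Philip mentions look roughly like the sketch below; the value names follow the Cilium chart versions that accept the "strict" string, and the API server host/port are placeholders for your own control-plane endpoint.

# Cilium Helm values (sketch)
kubeProxyReplacement: strict
# commonly set alongside KPR so the agent can reach the API server
# without depending on kube-proxy; placeholders below
k8sServiceHost: 10.0.0.10
k8sServicePort: 6443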

MrBlaise commented 1 week ago

Can this be reopened?