Working nodePort example service, tested on K8s 1.18, 1.19 and 1.20:
apiVersion: v1
kind: Service
metadata:
  name: consul-client
  namespace: consul
spec:
  type: NodePort
  selector:
    component: client
  ports:
    - name: http
      protocol: TCP
      port: 8501
      nodePort: 8501
      targetPort: 8501
    - name: grpc
      protocol: TCP
      port: 8502
      nodePort: 8502
      targetPort: 8502
  topologyKeys:
    - "kubernetes.io/hostname"
Steps to get it running on Kubernetes v1.21.1 using TopologyAwareHints:

1. Activate the TopologyAwareHints FeatureGate on kube-apiserver, kube-controller-manager & kube-proxy (a minimal sketch of this follows the list).
2. Run kubectl label node <nodename> topology.kubernetes.io/zone=nodeX (the zone name can be chosen freely, but every node needs to be a separate zone).
3. Create a nodePort service for the Consul clients with the annotation service.kubernetes.io/topology-aware-hints: auto.
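A minimal sketch of step 1; how the flag is applied depends on the distribution, and these are the standard Kubernetes feature-gate settings rather than values copied from the original report:

# kube-apiserver / kube-controller-manager / kube-proxy flag
--feature-gates=TopologyAwareHints=true

# alternatively for kube-proxy, via its configuration file
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
featureGates:
  TopologyAwareHints: true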
Working nodePort example service, tested on K8s (Rancher) 1.21.1:
apiVersion: v1
kind: Service
metadata:
  name: consul-client
  namespace: consul
  annotations:
    service.kubernetes.io/topology-aware-hints: auto
spec:
  type: NodePort
  selector:
    component: client
  ports:
    - name: http
      protocol: TCP
      port: 8501
      nodePort: 8501
      targetPort: 8501
    - name: grpc
      protocol: TCP
      port: 8502
      nodePort: 8502
      targetPort: 8502
Issues we encountered:
We had also activated the FeatureGate on kubelet & scheduler; this resulted in the EndpointSlices not getting any hints and routing not working properly. Removing the FeatureGate from those components fixed the issue.
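A quick sanity check for whether hints are being populated (a hypothetical one-liner; the label selector assumes the consul-client service defined above):

kubectl -n consul get endpointslices -l kubernetes.io/service-name=consul-client -o yaml | grep -B1 -A2 forZones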
When everything is working, you should get similar output from kubectl -n consul get endpointslices consul-client-xxxxx -o yaml:
addressType: IPv4
apiVersion: discovery.k8s.io/v1
endpoints:
- addresses:
  - 10.42.2.12
  conditions:
    ready: true
  hints:
    forZones:
    - name: node2
  nodeName: node02
  targetRef:
    kind: Pod
    name: consul-ffl7f
    namespace: consul
    resourceVersion: "26388"
    uid: 8982f868-09e7-4b7f-93da-a80cfd1d5563
  zone: node2
- addresses:
  - 10.42.0.8
  conditions:
    ready: true
  hints:
    forZones:
    - name: node1
  nodeName: node01
  targetRef:
    kind: Pod
    name: consul-dqr9z
    namespace: consul
    resourceVersion: "12690"
    uid: d7bf8662-3449-4129-a00c-b5daaafefbf9
  zone: node1
- addresses:
  - 10.42.1.12
  conditions:
    ready: true
  hints:
    forZones:
    - name: node3
  nodeName: node03
  targetRef:
    kind: Pod
    name: consul-w69wt
    namespace: consul
    resourceVersion: "12814"
    uid: 0e8585a1-153a-4be3-9a30-d91df71e1fae
  zone: node3
kind: EndpointSlice
metadata:
  creationTimestamp: "2021-06-22T09:21:33Z"
  generateName: consul-client-
  generation: 95
  labels:
    endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io
    kubernetes.io/service-name: consul-client
  name: consul-client-8jwqn
  namespace: consul
  ownerReferences:
  - apiVersion: v1
    blockOwnerDeletion: true
    controller: true
    kind: Service
    name: consul-client
    uid: 923fcf4e-9d21-40da-a331-9ea7a6bc63aa
  resourceVersion: "26396"
  uid: 6a1d9abd-0a0c-4c46-b984-4b0f89e8c092
ports:
- name: grpc
  port: 8502
  protocol: TCP
- name: http
  port: 8501
  protocol: TCP
Hi @karras and @pree, we are monitoring this issue and PR, so we just wanted to let you know. We were hoping to chat with you about these changes in a meeting, though, and I've asked the EMEA team to reach out to set up a call. Just wanted to acknowledge that we see this and see that you are making some progress here.
Much appreciated! We're currently working on polishing the PRs by adding missing unit tests, etc.
Hi @karras @pree
Thank you for your contribution. At a high level, we generally do want to remove our requirement for using a HostPort for Consul Kubernetes, and are looking to work with the Consul Core team to build an architecture that will no longer rely on a Consul client agent. This will likely take some time and is not something that we have targeted for our next major release.
At a glance, this seems like a good approach for accommodating Kubernetes environments that cannot support hostPort due to security concerns or because their CNI does not allow it. As far as the security concern with host ports goes, exposing the agent API should be fine and within the threat model so long as TLS and ACLs are used.
Because of the large surface area of testing that needs to be in place due to this change, we are not prepared to accept this PR. This would introduce a large amount of testing for all of our features against different configurations introduced by this change. Based on the other work that we currently have within our Kubernetes roadmap, we don’t think we can currently accept the PRs to support this and provide the ongoing maintenance of such a feature if it were merged.
We understand that TKGI 1.11 does support host ports, and we currently recommend that for production scenarios you leverage host ports, since this is the most definitive way we can direct traffic from pods to agents that are local to that host. We will leave this branch here, though, to see if others find this PR useful, since we are not sure how applicable this is to other folks not on TKGI. We do not intend to merge or support the PRs at this time; however, we will provide some feedback next week as well on the PRs that were linked in this issue.
Thanks for the review and additional inputs, much appreciated! As discussed it can't be merged. Thus I'm closing this and the other related PRs and issues to prevent cluttering and prepare for the upcoming repo consolidation into consul-k8s.
Feature Description
Currently the Consul deployment on K8s relies on Kubernetes hostPorts to provide connectivity to the Consul clients as well as to ensure that each app pod talks to its "K8s worker-local" Consul client. The design decision is well described in the Consul docs below:

This causes various problems in organizations and environments where hostPort is not available, for example when running TKGI (previously known as PKS) < 1.11 or simply when security requirements do not allow you to use it.

Therefore we (@flavioaiello, @pree & @karras) investigated various scenarios for how the Consul deployment could be adjusted to avoid the hostPort option. This would not only solve the problem but also address security-related requirements in certain environments.

We managed to successfully make the switch to nodePorts via 2-3 additional modifications:

1. Extend the --service-node-port-range option to include the ports 8501 and 8502 (e.g. 8500-32767); see the flag sketch after the note below.
2. Enable the ServiceTopology feature gate (or TopologyAwareHints on K8s 1.21+).
3. Create a nodePort service for the Consul clients with topologyKeys configured to avoid traffic load balancing and route everything to the local worker node (or EndpointSlice hints on K8s 1.21+).

Note: The adjustment of the service-node-port-range is only required due to the hardcoded ports in consul-k8s and consul-helm. We'll track this in separate PRs. See also: https://github.com/hashicorp/consul-k8s/issues/544
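A minimal sketch of the flag change from the first modification, not taken from the original setup; the exact range and how the flag is applied depend on how kube-apiserver is run:

# added to the kube-apiserver invocation (or its static pod manifest)
--service-node-port-range=8500-32767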
Most likely, making nodePorts the default deployment option would be desirable, but I guess it's not possible because it would mean that every organization would need to enable the K8s feature gates. These are still in an alpha state and also disabled by default in vanilla K8s.

Additional references:
Use Case(s)
Environments where hostPorts are not an option.

Contributions
We anticipate opening a PR soon with a new consul-helm flag/variable to allow this kind of deployment option.