k8snetworkplumbingwg / multus-cni

A CNI meta-plugin for multi-homed pods in Kubernetes
Apache License 2.0
2.29k stars · 577 forks

Kubernetes service for multus interface #466

Closed s1061123 closed 2 years ago

s1061123 commented 4 years ago

What would you like to be added:

This issue tracks updates to the Multus service abstraction; its implementation will live under the Kubernetes Network Plumbing Working Group repo.

Currently, Multus network interfaces (i.e. network attachments) do not support Kubernetes Service functionality. This proposal therefore adds a Kubernetes controller and proxy (multus-proxy) to provide Kubernetes Services for Multus network interfaces.
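
For illustration, a minimal sketch of how such a Service might be expressed. The service-network annotation below is hypothetical (an illustration, not a committed API); the service.kubernetes.io/service-proxy-name label is the existing well-known Kubernetes label that tells kube-proxy to skip a Service so a custom proxy can handle it:

apiVersion: v1
kind: Service
metadata:
  name: myapp-multus-service
  labels:
    # kube-proxy ignores Services carrying this existing well-known label,
    # leaving them to a custom proxy such as multus-proxy
    service.kubernetes.io/service-proxy-name: multus-proxy
  annotations:
    # hypothetical: selects the network attachment the Service is exposed on
    k8s.v1.cni.cncf.io/service-network: macvlan-net
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      protocol: TCP

Whether such selection lives on the Service, the NAD, or both is exactly the API question discussed below.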

NEWS:

Roadmaps:

s1061123 commented 4 years ago

Currently I'm proposing a Kubernetes Enhancement Proposal (KEP). This KEP provides a more flexible endpoint controller (i.e. a third-party controller could manage endpoints alongside the Kubernetes endpoint controller). This is not strictly necessary, but it is a 'good to have' feature.

https://github.com/kubernetes/enhancements/pull/1561

In this review, I need your help collecting use cases for the KEP. If you're interested, please let me know. Thanks.
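
For context, the EndpointSlice API that Kubernetes later shipped illustrates the kind of mechanism this KEP argues for: a managed-by label lets a third-party controller own endpoint objects without colliding with the built-in controller. A minimal sketch (the service and controller names are illustrative):

apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: my-service-secondary
  labels:
    # associates this slice with a Service by name
    kubernetes.io/service-name: my-service
    # marks the slice as owned by a third-party controller, so the
    # built-in endpoint controller leaves it alone
    endpointslice.kubernetes.io/managed-by: my-custom-controller
addressType: IPv4
ports:
  - port: 80
    protocol: TCP
endpoints:
  - addresses:
      - 192.0.2.10   # pod IP on a secondary network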

JanScheurich commented 4 years ago

In the context of Telco CNFs we see the following major use cases for generalizing K8s Services to Multus secondary network attachments:

  1. It shall be possible to host K8s service endpoints on secondary pod network attachments, to securely separate networks and to support overlapping IP addresses in those VPNs. The network separation shall be end-to-end, i.e. separate external networks and separation inside the K8s cluster, up to the secondary pod interfaces.

  2. The K8s Service shall be made flexible enough to handle any TCP-, UDP- or SCTP-based protocol. In particular, it shall be possible to deploy NAT-less service load-balancers for protocols that do not work properly with NAT and/or to provide better performance.

  3. Secondary network attachments are often consumed by userspace applications for performance reasons, e.g. as DPDK network attachments (SR-IOV VF, virtio-user, memif, SmartNIC VF). The standard kube-proxy service implementation cannot provide adequate performance there; special-purpose service implementations (e.g. DPDK-based or using HW offload) are needed for such secondary networks.

The KEP should make sure that the K8s APIs become flexible enough to express all three aspects: the creation of service endpoints on a secondary network, and the assignment of a dedicated/separated/specialized service forwarder adequate for the required protocol and the type of secondary network attachment.

s1061123 commented 4 years ago

@JanScheurich Thank you for your comments!

Currently SIG-Network has no interest in adding multiple-interface support yet. At the SIG-Network presentation at the last KubeCon NA, someone asked Tim Hockin about supporting Services on secondary interfaces, but he rejected it because it is not required. Hence the KEP does not mention Services on secondary interfaces.

Currently I'm planning the Multus service abstraction around a custom controller, and my KEP would cover the interworking between Kubernetes components and that custom controller.

Levovar commented 4 years ago

Same as a year ago, feel free to re-use ideas/code: https://github.com/nokia/danm/tree/master/pkg/svccontrol :) If you would like to re-use parts but aren't okay with the APIs (visibility- or dependency-wise), just let me know.

Otherwise, a suggestion: even if you really think proxy-based service routing is useful in a TelCo environment (which I personally don't), I would still strongly suggest solving only service discovery / endpoint management at the beginning.

JanScheurich commented 4 years ago

@Levovar: I fully agree. Let's focus on a clean API solution for service discovery and endpoint management for secondary network attachments. It should be designed so that, through annotation of the Service and/or NAD, a custom "proxy" can be inserted to load-balance the traffic if wanted/needed.
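
As a sketch of that idea: the NetworkAttachmentDefinition kind below is the existing NPWG CRD, but the proxy annotation is purely hypothetical, just to illustrate where such a knob could live:

apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-net
  annotations:
    # hypothetical knob: names the service forwarder responsible for
    # Services on this network (kernel, DPDK, SmartNIC, ...)
    k8s.v1.cni.cncf.io/service-proxy: kernel-netns-proxy
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "macvlan",
    "master": "eth1",
    "ipam": { "type": "host-local", "subnet": "10.100.100.0/24" }
  }'

A knob like this would let one network keep kube-proxy-style kernel forwarding while another uses a DPDK or SmartNIC implementation.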

We have looked at (and like) DANM's service control, but we are not convinced that headless services alone are suitable for our use cases, which mostly rely on NAT-less VIP addressing with direct server return.

But I would not exclude the possibility that there are services on secondary networks that are perfectly happy with kube-proxy and just require the network separation.

JanScheurich commented 4 years ago

@s1061123: I'm still not sure I understand what problem annotating the Pod IPs with network labels would solve. I think we need to express, per service, 1) which network it lives on, so that an endpoint controller can populate the endpoints for the service with the right IP addresses, and 2) how to proxy the service in the secondary-network data plane (e.g. NAT/no-NAT; kernel/DPDK/SmartNIC forwarding plane).

The standard K8s service controller, endpoint controller and kube-proxy should ignore services tagged this way (not to say annotated), and custom Multus service and endpoint controllers could do the service/endpoint discovery instead.

The actual data-plane realization should be left to one or more proxy implementations. These will surely depend on the CNI and the underlying networking backend, and should be fully pluggable in this scheme.

For example, a solution built on the OVS CNI would likely use an OVS datapath, while for the MacVlan CNI a kernel-based proxy (e.g. kube-proxy!) executing in a dedicated network namespace per secondary network might be the natural choice.

s1061123 commented 4 years ago

@Levovar Thank you for your comment. For now, I am thinking of mainly using the Kubernetes service controller. My design focuses on using Kubernetes resources/components as much as possible.

@JanScheurich I don't claim it is a perfect solution, of course. From a transport-technology point of view, I suppose there is no perfect solution (one satisfying NAT, NAT-less, DPDK-accelerated, eBPF, SmartNIC and so on). In addition, Multus is just a reference plugin, different from DANM (as far as I know, DANM focuses on ipvlan/SR-IOV), hence our scope is diverse and includes macvlan.

So my goal at present is to provide the minimum requirements for generic Multus interfaces (macvlan, SR-IOV and so on), not a perfect solution. If you are talking about something that fulfills all the requirements above (and you come up with a good idea), let's discuss it in the Kubernetes NPWG meeting.

For specific cases, like OVN/OvS, I agree that it should not be iptables-based; I suppose it should be implemented in OVN/OvS itself (because they're good at OvS, of course).

Levovar commented 4 years ago

@JanScheurich I just want to highlight that this "pluggable, application/protocol-dependent load-balancer framework" was kind of the original goal of NSM all those years ago, and in my opinion it failed very quickly. The reason it "failed" (obviously not NSM itself, just that draft approach :) ) is that IMHO it is impossible to write a generic load balancer capable of serving all our TelCo protocols. In fact, it is sometimes not even possible to write an application-specific LB (I had my fair share of struggles with LDAP in the past :) ). DIAMETER, LDAP, MAP, SIP, exotic IMS interfaces towards PLMN, etc. all have their own standards and their own ways of establishing and maintaining connections; so assuming that TelCo load-balancing can happen at the infrastructure level, and not in the application, is kind of far-fetched to me.

TL;DR version: load-balancing needs depend on the application traffic itself, not on the CNI solution used to configure the network interfaces. I can run both HTTPS and DIAMETER over IPVLAN or MACVLAN, but how I need to manage the two types of traffic differs vastly.

Just my two cents, and an advance warning :) Historically speaking, this is why we never bothered with the service-routing part: most TelCo applications will LB their own traffic anyway. Hence my advice not to over-extend yourself here by pushing too many responsibilities into the CaaS network-manager layer; concentrate only on the parts which 1) are useful today and 2) can be re-used by higher-level application functions to implement their own protocol-specific handling. And if somebody is not using any exotic protocols... they will just use a mesh anyway, right?

@s1061123 Little correction: DANM contains a meta-plugin part which is just like Multus; you can use it with any CNI. IP route management, IP management, Service watching, etc. are all generally supported features which work regardless of which CNI created the network interface. Besides that, it also provides some extra features for dynamically integrated CNIs on top of the generic ones (VLAN/VxLAN management and connections), and that list also contains MACVLAN in addition to the ones you mentioned :)

JanScheurich commented 4 years ago

@s1061123: I am not proposing to actually build the perfect solution that does everything. What I'm after is that we design a minimal API framework that provides enough flexibility to express the use cases and makes the data plane pluggable (in a similar sense to how Multus is pluggable for CNIs). It should not preclude anyone from building the perfect service load-balancer for a particular CNI and data plane.

JanScheurich commented 4 years ago

@Levovar: True, there won't ever be a one-size-fits-all load-balancer for every kind of legacy TelCo protocol. That's not what we are after. Many applications have their own built-in load-balancing/traffic-steering mechanisms. But they still need to attract traffic for their service VIPs on secondary networks, and they want the ingress traffic "sprayed" over a number of front-end pods for scalability. A simple NAT-less, stateless first load-balancer stage will often do. In the simplest case this might even be off-loaded to the DC-GW using ECMP (along the lines of MetalLB).

But, as I said, I'm sure there is a wealth of use cases out there that may need more, including a stateful NAT load-balancer for secondary networks.

raoufkh commented 3 years ago

Hello!

As mentioned here: https://github.com/kubernetes/enhancements/pull/1561#pullrequestreview-362111612

I created a pod with a second interface (macvlan) using Multus, then created a service to expose the server bound to this interface:

apiVersion: v1
kind: Service
metadata:
  name: myapp-service
  labels:
    app: myapp
spec:
  type: NodePort
  ports:
    - name: n2
      port: 38412
      protocol: SCTP   # I'm using Kubernetes v1.20.0 so SCTP is supported by default

The service doesn't include the selector field, and I created an Endpoints object with the same name as the service, with the IP address of the second interface as the target:

apiVersion: v1
kind: Endpoints
metadata:
  name: myapp-service
subsets:
  - addresses:
      - ip: 10.100.100.12    # The IP address of the second interface
    ports:
      - port: 38412

When I try to reach the server bound to the second interface via myapp-service:38412, it fails, and the logs in myapp don't show any incoming traffic.

If I make the server bind to 0.0.0.0 (all interfaces) and create the service with selectors that match the target pod, it works, but I want the server to listen only on the second interface.

If I make the server listen on 10.100.100.12 and access it from another pod directly by that address, it works, but I want to use Services so I get load balancing and service discovery when multiple replicas of my pod are running.

Do you have any advice for me?

Thank you in advance

s1061123 commented 3 years ago

As noted above, Multus does not support Kubernetes Services for now. If you want that, you need to implement it yourself (and not only in Multus, I suppose). Pull requests are welcome!

raoufkh commented 3 years ago

Ah, I'm not so good at coding, but I'll see if I can find a solution to that.

Thank you for your response!

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

leseb commented 3 years ago

Please keep this issue alive.

github-actions[bot] commented 3 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 7 days.

s1061123 commented 3 years ago

update

lsoica commented 2 years ago

Hi!

Without this capability, what is the canonical way to achieve service discovery and load balancing for secondary interfaces, in-cluster, when running multiple replicas?

s1061123 commented 2 years ago

As far as I know, there is none. You can, of course, achieve it by implementing it yourself: Kubernetes supports pluggable schemas, CRDs and controllers/operators, so you can implement whatever you need.
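
For illustration, a minimal sketch of what such a hypothetical CRD could look like (all names are illustrative; a custom controller/operator would then reconcile these objects into per-network load-balancing rules):

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: multusservices.example.org
spec:
  group: example.org
  scope: Namespaced
  names:
    kind: MultusService
    singular: multusservice
    plural: multusservices
  versions:
    - name: v1alpha1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                network:      # NetworkAttachmentDefinition to serve on
                  type: string
                selector:     # pod label selector, as in a core Service
                  type: object
                  additionalProperties:
                    type: string
                port:
                  type: integer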

But that does not mean there is no solution anywhere in the world. I hope someone who has already achieved this will reply to the thread and share their solution...

p.s. Please let me know if you implement the functionality you want to have ;)

dougbtv commented 2 years ago

@lsoica I think Tomo is right on here. Basically, what it comes down to is that secondary interfaces are "sidecars": k8s itself doesn't really know about them, so there's no way for it to handle them.

There's a long history to this (which I won't delve into now), but basically what the k8s community, and especially sig-network, decided about secondary interfaces is that they are something for CNI to handle, outside of Kubernetes. So there's no built-into-k8s way to handle services on secondary networks.

On top of that, secondary networks are extremely free-form and user-configurable. This is both the power of the technology and a limitation. It gives users lots of freedom to decide what they do with these additional networks and what connectivity they have. Kubernetes itself promises pod-to-pod connectivity over the default interface. For secondary networks there's no inherent promise about what connectivity is afforded -- and this is a feature, as secondary networks are often used for isolation.

So the answer is: there is no existing, canonical way. This proposal is the path forward; it is limited by the type of connectivity, and it addresses those limitations and plans for extensibility depending on how the secondary interfaces are used.

uablrek commented 2 years ago

@s1061123 Please update this issue with the achievements in multus-service.

s1061123 commented 2 years ago

As @uablrek mentioned, the multus-service repo addresses this issue and aims to solve it, so this issue is closed.