antrea-io / antrea

Kubernetes networking based on Open vSwitch
https://antrea.io
Apache License 2.0

Support port mirror for assigned Pods #3008

Closed leonstack closed 1 year ago

leonstack commented 3 years ago

Describe the problem/challenge you have

Sometimes a remote server needs to analyze the packets sent to or from certain Pods. We need to copy these packets to that server without disrupting the Pods' original traffic.

Describe the solution you'd like

  1. Create an OVS bridge named br-mirror and connect it to br-int with a pair of patch ports, for example br-int--br-mirror (on br-int) and br-mirror--br-int (on br-mirror).
  2. Create a mirror on the selected Pod's port that sends the packets from/to that port to the patch port br-int--br-mirror. For example, for coredns:
    ovs-vsctl -- --id=@coredns--4c0d8d get port coredns--4c0d8d \
    -- --id=@br-int--br-mirror  get port br-int--br-mirror  \
    -- --id=@m create mirror name=m0 select-src-port=@coredns--4c0d8d output-port=@br-int--br-mirror \
    -- set bridge br-int mirrors=@m
  3. Create a VTEP port on br-mirror for the remote server:
    ovs-vsctl add-port br-mirror tun -- set interface tun type=vxlan options:remote_ip=172.18.123.12
  4. Add OpenFlow rules on br-mirror to filter the mirrored packets. We can specify a destination port and protocol to select the packets for the remote server and drop all other packets, then use actions=output:tun to send the selected packets to the remote server.
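
Steps 1 and 4 above are described without commands, so here is a minimal dry-run sketch of what they might look like. It only prints the ovs-vsctl and ovs-ofctl commands rather than running them. The function names, the flow priorities, and the choice of TCP port 80 are illustrative assumptions and not part of the original proposal.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# Bridge and port names follow the proposal; priorities are assumed.

# Step 1: create br-mirror and connect it to br-int with a patch-port pair.
mirror_bridge_cmds() {
    echo "ovs-vsctl add-br br-mirror"
    echo "ovs-vsctl add-port br-int br-int--br-mirror -- set interface br-int--br-mirror type=patch options:peer=br-mirror--br-int"
    echo "ovs-vsctl add-port br-mirror br-mirror--br-int -- set interface br-mirror--br-int type=patch options:peer=br-int--br-mirror"
}

# Step 4: forward only the wanted protocol/port to the tunnel, drop the rest.
# $1 = protocol (tcp or udp), $2 = destination port
filter_flow_cmds() {
    echo "ovs-ofctl add-flow br-mirror \"priority=100,$1,tp_dst=$2,actions=output:tun\""
    echo "ovs-ofctl add-flow br-mirror \"priority=0,actions=drop\""
}

mirror_bridge_cmds
filter_flow_cmds tcp 80
```

Piping the output to `sh` on a node with OVS installed would apply the commands; reviewing the printed commands first is the point of the dry run.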

Anything else you would like to add?

  1. We can create a new CRD named Mirror for this feature, with the fields podSelector, remoteIP, destinationPort, protocol, and direction.
  2. Steps 3 and 4 can also use OpenFlow to set the remote IP through NXM_NX_TUN_IPV4_DST[].
  3. The remote servers need to handle the VXLAN-encapsulated packets themselves.
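
Item 2 above could be sketched as follows: a single VXLAN port is created with options:remote_ip=flow, and the flow itself loads the destination into NXM_NX_TUN_IPV4_DST[]. This is a dry-run sketch that only prints commands. The helper names and the flow priority are assumptions made for illustration.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.

# Convert a dotted IPv4 address to the hex value used by a load action.
ip_to_hex() {
    # Split the address on dots into positional parameters.
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    printf '0x%02x%02x%02x%02x\n' "$1" "$2" "$3" "$4"
}

# One tunnel port whose destination is taken from the flow (remote_ip=flow).
# $1 = remote server IPv4 address
flow_tunnel_cmds() {
    remote_hex=$(ip_to_hex "$1")
    echo "ovs-vsctl add-port br-mirror tun -- set interface tun type=vxlan options:remote_ip=flow"
    echo "ovs-ofctl add-flow br-mirror \"priority=100,actions=load:${remote_hex}->NXM_NX_TUN_IPV4_DST[],output:tun\""
}

flow_tunnel_cmds 172.18.123.12
```

With this layout, sending mirrored traffic to a different remote server only needs another flow, not another tunnel port.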

antoninbas commented 3 years ago

@leonstack Thanks for the detailed issue. Could you clarify why you propose to introduce a new bridge (br-mirror) to handle the mirrored traffic and send it to the sink? Why not create the tunnel port on br-int and use this port as the output-port when configuring the mirroring?

leonstack commented 3 years ago

Hi, the mirror function forwards packets directly to the output-port, so there is no chance to filter them on br-int. If we only want the packets with destinationPort=80, we need a pipeline after the mirror action that selects those packets and drops the rest.

jianjuns commented 3 years ago

So, the assumption is users can manually add flows to the mirror bridge to filter packets?

What do you think about creating a DaemonSet, or adding a separate container to the antrea-agent DaemonSet, to create the mirror bridge and enable mirroring (e.g. using the ovs-vsctl command)? That way, we would not need to change Antrea code.

And what do you think about ERSPAN for this use case?

jianjuns commented 3 years ago

I guess you want to programmatically control which Pods should be mirrored (with a CRD?), so the DaemonSet approach would not work. @leonstack

leonstack commented 3 years ago

@jianjuns Thanks for your reply. Yes, the ovs-vsctl command can do this, but it is not automatic enough: if a Pod fails and is recreated on another host, the flows created on the old host no longer work. With a CRD, we can watch the Pods' lifecycle and keep the flows bound to the host where the Pod runs, even when the Pod is recreated on another host.

I'm not very familiar with ERSPAN. It seems to be based on a GRE tunnel. I think we can let users choose the tunnel type to suit any remote server (of course, we also need to take care of the Pod's MTU).

tnqn commented 3 years ago

@leonstack I was thinking about adding something similar to Antrea to support several use cases. Maybe it can meet your requirement too. The proposal is to add a CRD that declares a Traffic Control configuration, similar to tc-mirred. The CR will select traffic based on from/to Pods, direction, L4 protocol, and port, and apply a mirror or redirect action to it. An example is:

apiVersion: crd.antrea.io/v1alpha1
kind: TrafficControl
metadata:
  name: mirror-web-traffic
spec:
  podSelector:
    matchLabels:
      app: web
  namespaceSelector: {}
  direction: In
  action: Mirror
  device: analyzer0

The capability itself is generic and could be used in many scenarios. In your use case, I think you could just create an OVS tunnel device on each node via a separate job, then create a CR to mirror specific traffic to that device, without touching the OpenFlow pipeline and OVS configuration much. Do you think it could work?

BTW, I didn't plan to use an extra bridge and a mirror object to mirror and filter the traffic, because that mechanism doesn't work for the redirect action and it still makes a local copy of unwanted traffic. I plan to have some OpenFlow rules that mark the traffic based on the criteria and then output the traffic twice: once to the actual destination and once to the mirror receiver device. Do you see any downsides to this?
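
To illustrate the "output twice" idea in its simplest form, the dry-run sketch below prints a single flow whose action list outputs matching traffic to both the original destination and the mirror receiver. It is not Antrea's actual pipeline, which is not described in this thread. The function name, the priority, the match, and the port name pod0 are hypothetical; analyzer0 is taken from the example CR above.

```shell
#!/bin/sh
# Dry-run sketch: prints one flow instead of installing it.
# $1 = match fields, $2 = original destination port, $3 = mirror receiver port
mirror_flow_cmd() {
    # Two output actions in one flow: the packet reaches its real
    # destination and a copy goes to the mirror receiver.
    echo "ovs-ofctl add-flow br-int \"priority=200,$1,actions=output:$2,output:$3\""
}

# Hypothetical match and destination port.
mirror_flow_cmd "tcp,nw_dst=10.10.0.5,tp_dst=80" pod0 analyzer0
```

Because only matching packets hit this flow, unwanted traffic is never copied, which is the advantage over a mirror object that the comment above describes.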

leonstack commented 2 years ago

Yes, it can meet my requirement. After some consideration, filtering the traffic seems unnecessary, because a Pod mostly runs a single process bound to a single port, so please ignore this requirement.

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

tnqn commented 2 years ago

@leonstack Antrea v1.7.0 introduced a TrafficControl API, as mentioned in https://github.com/antrea-io/antrea/issues/3008#issuecomment-982703409. Could you check whether it meets your requirement? See https://github.com/antrea-io/antrea/blob/main/docs/traffic-control.md
