kata-containers / kata-containers

Kata Containers is an open source project and community working to build a standard implementation of lightweight Virtual Machines (VMs) that feel and perform like containers, but provide the workload isolation and security advantages of VMs. https://katacontainers.io/
Apache License 2.0

Add kubernetes CRI Port-Forward support #1693

Open gnawux opened 6 years ago

gnawux commented 6 years ago

The CRI spec defines a PortForward method, which forwards a user request stream to a port of the container. (ref: the original CRI streaming design document)

In brief, the implementation of this API for runC is nsenter + socat. For hyper runV, we once did this by exec'ing socat in the sandbox itself rather than in any user container, as follows:

                         +-------------------------------------------+
                         | sandbox                                   |
                         |              +-----------+ +-----------+  |
                         |              | container | | container |  |
                         |              +-----+-----+ +-----------+  |
                         |                    ^                      |
          exec           |  +-------+         |                      |
stream +------------------> | socat +---------+                      |
                         |  +-------+                                |
                         +-------------------------------------------+

For Kata Containers, we have at least two options to implement this:

Add a new agent API to support this behavior

The agent could just do the stream termination and forward traffic to the port of the sandbox network.

                         +-------------------------------------------+
                         | sandbox                                   |
                         |              +-----------+ +-----------+  |
                         |              | container | | container |  |
                         |              +-----+-----+ +-----------+  |
                         |                    ^                      |
          PortForward()  |  +-------+         |                      |
stream +------------------> | agent +---------+                      |
                         |  +-------+                                |
                         +-------------------------------------------+

Intercept the network traffic of the containers from outside of the sandbox

I think this may require a different approach for each CNI implementation.

Any ideas, guys?

jodh-intel commented 6 years ago

I think we need @mcastelino to comment on these options.

gnawux commented 6 years ago

I'd like to say actually this is mostly not a networking issue 😃

On Tue, Sep 4, 2018, 20:21 James O. D. Hunt notifications@github.com wrote:

I think we need @mcastelino https://github.com/mcastelino to comment on these options.


mcastelino commented 6 years ago

@gnawux in the current Kata network implementation, the socat+nsenter method does not really work, as there is no network connectivity between the namespace in which QEMU is running and the network stack within QEMU.

Do you have network connectivity with the tc mirroring approach implemented in runV, i.e. network connectivity between the host-side namespace and the VM network stack?

The only other option I see is to use an approach similar to the shim, where we bridge stdin/stdout from the host into the VM.

So the equivalent of /usr/bin/nsenter -t <container pid> -n /usr/bin/socat - TCP4:localhost:<container port> would have to be done by a combination of shim and agent. This means our agent will need to play the role that socat plays in the current solution. Otherwise we have to include and use socat within the VM, with the agent/shim only managing stdio.
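For readers less familiar with that command, the following Go sketch builds the same nsenter + socat invocation as an `exec.Cmd`. The function names are made up for illustration and the PID and port in `main` are placeholder values; this is not the cri-o or containerd code itself. It also shows why the approach depends on the host-side network namespace: `localhost` is resolved inside whatever namespace the given PID lives in.

```go
package main

import (
	"fmt"
	"os/exec"
	"strconv"
)

// portForwardArgs builds the nsenter argument list quoted in the thread.
// pid is a process living in the target network namespace.
func portForwardArgs(pid int, port int32) []string {
	return []string{
		"-t", strconv.Itoa(pid), // target process whose namespace is entered
		"-n", // enter only the network namespace
		"/usr/bin/socat", "-", fmt.Sprintf("TCP4:localhost:%d", port),
	}
}

// portForwardCmd wraps the arguments in an exec.Cmd. A caller would attach
// the CRI stream to cmd.Stdin and cmd.Stdout before calling cmd.Run.
func portForwardCmd(pid int, port int32) *exec.Cmd {
	return exec.Command("/usr/bin/nsenter", portForwardArgs(pid, port)...)
}

func main() {
	cmd := portForwardCmd(1234, 8080) // placeholder PID and port
	fmt.Println(cmd.Args)
}
```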

/cc @amshinde

mcastelino commented 6 years ago

@gnawux It also looks like the current CRI-O implementation is very namespace specific.

https://github.com/kubernetes-incubator/cri-o/blob/f9ae39e395880507d52295ca58e3683f22524777/server/container_portforward.go#L35

which assumes that the host-side namespace has network connectivity into the VM.

amshinde commented 6 years ago

The same is done in containerd/cri as well: https://github.com/containerd/cri/blob/0e42438e7a157f6aec41cd808a85bc883d646ff3/pkg/server/sandbox_portforward.go#L51

Both cri-o and the cri shim assume a network namespace, and then do a /usr/bin/nsenter -t <container pid> -n /usr/bin/socat - TCP4:localhost:<container port> . This should really be handled as part of the shimv2 API so that the OCI runtime can handle the port forwarding. In our case we can do this by having an agent API that runs socat within the VM.

sboeuf commented 6 years ago

How to do the detection?

The first problem with this CRI feature is that it does not translate into any OCI call, which means the current CRI implementations don't have any way to notify the container through the runtime. Instead, they enter the container namespaces and start a socat process directly inside the container. So how are we supposed to detect that something happens, from a runtime perspective? The way things are implemented, there is no solution for now...

Now, could we propose some changes to the CRI implementations so that we could actually be notified?

We can still provide the codepath for the Kata API so that Frakti (as it relies directly on the Kata API) could simply ask the runtime (and hence the agent) to run socat inside the VM, and to create a new virtio-serial-pci socket in order to pipe the stream out of the VM to the host (or reuse the existing virtio-vsock-pci in case vsock is supported). But I don't think we'll implement this feature for the kata-runtime CLI.

Brighter future with v2 interfaces

Now, if we want to be optimistic, with the new containerd-shim-v2 interface and the equivalent that should come up for CRI-O, we could simply ask for a new function to be added to the API. This way, there is no implicit way of implementing port forwarding but, instead, a clear path to call into this new feature. And of course, this would use the same codepath that we would have implemented for Frakti.

@gnawux @bergwolf @mcastelino @amshinde

gnawux commented 6 years ago

we have the following options

sboeuf commented 6 years ago

@gnawux Yes, good summary of the options, but in any case, they need to be discussed with the maintainers of the CRI implementations, as they will be the ones accepting those changes.

bergwolf commented 6 years ago

cons: we must have a container image that has socat inside, and the image must exist on the host (but we couldn't guarantee this with code).

We can build a socat container image into the guest rootfs. Then the runtime is always free to set up a proper container spec to create a socat sidecar.

gnawux commented 6 years ago

Will propose this to Kubernetes sig-node.

gnawux commented 6 years ago

Just proposed some thoughts in the Kubernetes sig-node meeting (slides).

PatrickLang commented 6 years ago

The slides say that 4 conformance tests are failing. Which test cases are they?

gnawux commented 6 years ago

@PatrickLang here is the full test output

PatrickLang commented 6 years ago

I just read over these, and I thought they were using exec and stdout redirection, not port-forward. Am I misunderstanding their intent?

[Fail] [sig-network] Networking Granular Checks: Pods [It] should function for intra-pod communication: http [NodeConformance] [Conformance] 

/root/code/kubernetes/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/networking_utils.go:226

[Fail] [sig-network] Networking Granular Checks: Pods [It] should function for node-pod communication: udp [NodeConformance] [Conformance] 

/root/code/kubernetes/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/networking_utils.go:337

[Fail] [sig-network] Networking Granular Checks: Pods [It] should function for node-pod communication: http [NodeConformance] [Conformance] 

/root/code/kubernetes/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/networking_utils.go:337

[Fail] [sig-network] Networking Granular Checks: Pods [It] should function for intra-pod communication: udp [NodeConformance] [Conformance] 

/root/code/kubernetes/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/networking_utils.go:226

gnawux commented 6 years ago

@PatrickLang Thanks for pointing that out. @lifupan re-investigated these failures and found that the root cause is host-network mode, which is disabled in Kata Containers. He adjusted the test cases to avoid using the host network, and they passed. Sorry for the misleading information.

gnawux commented 4 years ago

Just updated the broken slide link in https://github.com/kata-containers/runtime/issues/697#issuecomment-420372791

egernst commented 1 year ago

Draft implementation: https://github.com/kata-containers/kata-containers/pull/5979

mkulke commented 3 weeks ago

A while ago I wrote kube-relay, which automates deploying a container with socat and creating a port-forward to it. This addresses cases in which you want to connect to cluster-accessible destinations from a local development machine. @bpradipt mentioned that this might be helpful as a temporary mitigation for this issue, and we could consider adding the tool to the kata repo. If people find this useful, I'm open to doing that.

bpradipt commented 3 weeks ago

cc @fidencio @gkurz