I think we need @mcastelino to comment on these options.
I'd like to say actually this is mostly not a networking issue 😃
@gnawux In the current Kata network implementation, the socat+nsenter method does not really work, as there is no network connectivity between the namespace in which QEMU is running and the network stack within QEMU.
Do you have network connectivity with the tc mirroring approach implemented in runV? i.e. network connectivity between the host side namespace and the VM network stack.
The only other option I see is to use an approach similar to the shim, where we bridge stdin/stdout from the host into the VM? So the equivalent of `/usr/bin/nsenter -t <container pid> -n /usr/bin/socat - TCP4:localhost:<container port>` would have to be done by a combination of shim and agent. This means our agent will need to play the role that socat plays in the current solution, or we have to include and use socat within the VM while the agent/shim only manages stdio.
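In Go terms, the guest-agent half of that could be a small splice loop. This is only a sketch of the idea; the package and function names are made up and this is not the actual kata-agent code:

```go
// Sketch of the guest side playing socat's role: take the stream the
// shim hands us and splice it into the container port inside the VM.
// All names here are illustrative, not the real kata-agent code.
package forward

import (
	"fmt"
	"io"
	"net"
)

// spliceToPort copies bytes in both directions between the shim stream
// and tcp4://localhost:<port>, returning when either side closes.
func spliceToPort(stream io.ReadWriteCloser, port uint32) error {
	conn, err := net.Dial("tcp4", fmt.Sprintf("localhost:%d", port))
	if err != nil {
		return err
	}
	defer conn.Close()
	defer stream.Close()

	errCh := make(chan error, 2)
	go func() { _, e := io.Copy(conn, stream); errCh <- e }()
	go func() { _, e := io.Copy(stream, conn); errCh <- e }()
	return <-errCh // first direction to finish ends the session
}
```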
/cc @amshinde
@gnawux It also looks like the current CRI-O implementation is very namespace specific, assuming the host side namespace has network connectivity into the VM.
The same is done in containerd-cri as well: https://github.com/containerd/cri/blob/0e42438e7a157f6aec41cd808a85bc883d646ff3/pkg/server/sandbox_portforward.go#L51
Both cri-o and the cri shim assume a network namespace, and then do a `/usr/bin/nsenter -t <container pid> -n /usr/bin/socat - TCP4:localhost:<container port>`. This should really be handled as part of the shimv2 API so that the OCI runtime can handle the port forwarding. In our case we can do this by having an agent API that runs socat within the VM.
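For illustration, the API surface could be as small as one call. The names below are hypothetical and are not part of the actual shim v2 or agent protocol:

```go
// Hypothetical API shape for runtime-level port forwarding; nothing
// here is from the real shim v2 or kata-agent protocol.
package forward

import (
	"context"
	"io"
)

// PortForwarder is what a CRI implementation would call instead of
// doing nsenter+socat itself: it hands the runtime the user stream and
// the target port, and the runtime terminates the stream in the VM.
type PortForwarder interface {
	PortForward(ctx context.Context, sandboxID string, port uint32, stream io.ReadWriteCloser) error
}
```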
How to do the detection?
The first problem with this CRI feature is that it does not translate into any OCI call, which means the current CRI implementations don't have any way to notify the container through the runtime. Instead, they enter the container namespaces and start a `socat` process directly inside the container.
So how are we supposed to detect that something happens, from a runtime perspective? The way things are implemented, there is no solution for now...
Now, could we propose some changes to the CRI implementations so that we could actually be notified?
We can still provide the codepath for the Kata API so that Frakti (as it relies directly on the Kata API) could simply ask the runtime (and hence the agent) to run `socat` inside the VM, and to create a new `virtio-serial-pci` socket in order to pipe the stream out of the VM to the host (or reuse the existing `virtio-vsock-pci` in case vsock is supported).
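The host half of that piping could look roughly like the sketch below, a mirror of the guest-side loop above, assuming QEMU exposes the new serial port as a unix socket on the host (the path is invented for the example):

```go
// Rough host-side sketch: bridge the user-facing stream into the unix
// socket backing the virtio-serial port (or a vsock connection, when
// available). The socket path is an assumption for illustration.
package forward

import (
	"io"
	"net"
)

func pipeToGuest(stream io.ReadWriteCloser, serialSock string) error {
	// e.g. serialSock = "/run/vc/vm/<sandbox-id>/fwd.sock" (hypothetical)
	conn, err := net.Dial("unix", serialSock)
	if err != nil {
		return err
	}
	defer conn.Close()
	defer stream.Close()

	errCh := make(chan error, 2)
	go func() { _, e := io.Copy(conn, stream); errCh <- e }()
	go func() { _, e := io.Copy(stream, conn); errCh <- e }()
	return <-errCh
}
```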
But I don't think we'll implement this feature for the `kata-runtime` CLI.
Brighter future with v2 interfaces
Now, if we want to be optimistic, with the new `containerd-shim-v2` interface and the equivalent that should come up for CRI-O, we could simply ask for a new function to be added to the API. This way, there is no implicit way of implementing port forwarding, but instead a clear path to call into this new feature. And of course, this would use the same codepath that we would have implemented for Frakti.
@gnawux @bergwolf @mcastelino @amshinde
we have the following options:

1. Put `socat` in the VM rootfs, and enable the agent to exec it in the VM (not in any container) when port forward is called.
2. Launch a `socat` container when port forward is called.
   - cons: we must have a container image with socat inside, and the image must exist on the host (but we couldn't guarantee this with code).

@gnawux Yes, good summary of the options, but in any case, they need to be discussed with the maintainers of the CRI implementations as they will be the ones accepting those changes.
We can build a socat container image into the guest rootfs. Then the runtime is always free to set up a proper container spec to create a socat sidecar.
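As a sketch of that option, using the runtime-spec Go types (the image/rootfs plumbing is omitted and everything here is illustrative):

```go
// Illustrative only: the process portion of an OCI spec for a socat
// sidecar that the runtime could start in the sandbox on PortForward,
// assuming the guest rootfs ships a socat image.
package forward

import (
	"fmt"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

func socatSidecarProcess(port uint32) *specs.Process {
	return &specs.Process{
		Cwd: "/",
		Args: []string{
			"/usr/bin/socat",
			"-", // stdio end, wired to the user's port-forward stream
			fmt.Sprintf("TCP4:localhost:%d", port), // container end
		},
	}
}
```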
Will propose this to Kubernetes sig-node.
Just proposed some thoughts in kubernetes sig-node meeting (slides).
The slides say that 4 conformance tests are failing. Which test cases are they?
@PatrickLang here is the full test output
I just read over these, and I thought they were using exec and stdout redirection, not port-forward. Am I misunderstanding their intent?
[Fail] [sig-network] Networking Granular Checks: Pods [It] should function for intra-pod communication: http [NodeConformance] [Conformance]
/root/code/kubernetes/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/networking_utils.go:226
[Fail] [sig-network] Networking Granular Checks: Pods [It] should function for node-pod communication: udp [NodeConformance] [Conformance]
/root/code/kubernetes/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/networking_utils.go:337
[Fail] [sig-network] Networking Granular Checks: Pods [It] should function for node-pod communication: http [NodeConformance] [Conformance]
/root/code/kubernetes/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/networking_utils.go:337
[Fail] [sig-network] Networking Granular Checks: Pods [It] should function for intra-pod communication: udp [NodeConformance] [Conformance]
/root/code/kubernetes/src/k8s.io/kubernetes/_output/local/go/src/k8s.io/kubernetes/test/e2e/framework/networking_utils.go:226
@PatrickLang Thanks for pointing that out. @lifupan re-did some investigation on them, and finally he found the root cause is host-network mode, which is disabled in Kata Containers. He adjusted the cases to avoid using host network, and they passed. Sorry for the confusion.
just updated the broken slide link in https://github.com/kata-containers/runtime/issues/697#issuecomment-420372791
Draft implementation: https://github.com/kata-containers/kata-containers/pull/5979
A while ago I wrote kube-relay, which automates deploying a container with socat and creating a port-forward to it. This addresses cases in which you want to connect to cluster-accessible destinations from a local development machine. @bpradipt mentioned that this might be helpful as a temporary mitigation for this issue, and we could consider adding the tool to the kata repo. If people find this useful, I'm open to doing that.
cc @fidencio @gkurz
In the CRI spec, there is a PortForward method, which forwards a user request stream to a port of the container. (ref: the original CRI streaming design document)

The implementation of this API in runC is `nsenter` + `socat`, in brief. For hyper runV, we once did this by `exec`-ing a `socat` in the sandbox but not in any user containers. (as follows)

For kata containers, we have at least two options to implement this:

1. Add a new agent API to support this behavior. The agent could just do the stream termination and forward traffic to the port of the sandbox network (see the sketch after this list).
2. Intercept the network traffic of the containers from outside of the sandbox. I think this may lead to different ways for different CNI implementations.
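For option 1, a runV-style sketch of the agent simply exec-ing socat in the sandbox and wiring its stdio to the user stream (stdlib only, names invented):

```go
// Sketch of option 1 in the runV style: exec socat inside the sandbox
// (not in any user container) and bridge its stdio to the stream from
// the CRI shim. Purely illustrative.
package forward

import (
	"fmt"
	"io"
	"os/exec"
)

func execSocat(stream io.ReadWriter, port uint32) error {
	cmd := exec.Command("socat", "-", fmt.Sprintf("TCP4:localhost:%d", port))
	cmd.Stdin = stream  // user -> container bytes
	cmd.Stdout = stream // container -> user bytes
	return cmd.Run()
}
```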
Any ideas, guys?