weaveworks / scope

Monitoring, visualisation & management for Docker & Kubernetes
https://www.weave.works/oss/scope/
Apache License 2.0

CRI: Gather network information in the CRI probe #3291

Open lilic opened 6 years ago

lilic commented 6 years ago

Currently the CRI probe lacks network connections between containers. We should decide which approach to take: CRI (though at first glance there is no way to request the needed information, e.g. the IPs), CNI, or maybe the best approach would be to just go through the k8s API?

cc @marccarre @bboreham

bboreham commented 6 years ago

Take a look at #3207, which is under the docker tree but basically only needs a list of PIDs.

lilic commented 6 years ago

@bboreham Thanks for the link! Don't think we can get that via CRI API. Is there a way to somehow use CNI?

rade commented 6 years ago

Don't think we can get that via CRI API

What is 'that'?

AFAICT all that code needs is a PID. Or netns.

lilic commented 6 years ago

@rade

What is 'that'?

The PID. We cannot get that via the CRI API: it is only set as part of container creation, and there is no list option for it, just the configuration. AFAICT.

Which is why I was asking @bboreham if we can get that info somewhere else instead?

bboreham commented 6 years ago

I think you can get it via CRI:

ListContainers gets you a []Container, from which you get the PodSandboxId; PodSandboxStatus() then gets you a PodSandboxStatusResponse, which has a PodSandboxNetworkStatus, which has the IP.

dlespiau commented 6 years ago

We now have IPs, I think the next step is adding the docker_container_id latest key to processes, which is what the tagger (tagger.go) in the docker probe does.

// ContainerRenderer is a Renderer which produces a renderable container
// graph by merging the process graph and the container topology.
// NB We only want processes in container _or_ processes with network connections
// but we need to be careful to ensure we only include each edge once, by only
// including the ProcessRenderer once.
var ContainerRenderer = Memoise(MakeFilter(
    func(n report.Node) bool {
        // Drop deleted containers
        state, ok := n.Latest.Lookup(docker.ContainerState)
        return !ok || state != docker.StateDeleted
    },
    MakeReduce(
        MakeMap(
            MapProcess2Container,
            ProcessRenderer,
        ),
        ConnectionJoin(MapContainer2IP, report.Container),
    ),
))

and MapProcess2Container looks at the docker_container_id to do the job.

The docker tagger adds two things to the process topology: the "docker_container_id" latest key, and two parents, "container" and "container_image":

  "launcher-tests;3278": {
        "id": "launcher-tests;3278",
        "topology": "process",
        "counters": null,
        "sets": null,
        "latest": {
          "cmdline": {
            "timestamp": "2018-08-22T11:20:33.908720218Z",
            "value": "kube-scheduler --address=127.0.0.1 --leader-elect=true --kubeconfig=/etc/kubernetes/scheduler.conf "
          },
          "docker_container_id": {
            "timestamp": "2018-08-22T11:20:33.921932431Z",
            "value": "085cf25caf229292927e2d7ce9cbd2c2403ac56e1b180c94baefc70eeb7c9026"
    ...
        "parents": {
          "host": [
            "launcher-tests;\u003chost\u003e"
          ],
          "container": [
            "085cf25caf229292927e2d7ce9cbd2c2403ac56e1b180c94baefc70eeb7c9026;\u003ccontainer\u003e"
          ],
          "container_image": [
            "k8s.gcr.io/kube-scheduler-amd64;\u003ccontainer_image\u003e"
          ]
        },
dlespiau commented 6 years ago

I also noticed a couple more things, and started editing the original issue to include a TODO list we can maintain.

dlespiau commented 6 years ago

The problem we are now trying to solve is: find which host pid maps to pid 1 of the various containers.

The immediate thing that comes to mind is to ask CRI for the container pid, just as we do in the docker probe. CRI has a provision for that in ContainerStatus:

message ContainerStatusResponse {
    // Status of the container.
    ContainerStatus status = 1;
    // Info is extra information of the Container. The key could be arbitrary string, and
    // value should be in json format. The information could include anything useful for
    // debug, e.g. pid for linux container based container runtime.
    // It should only be returned non-empty when Verbose is true.
    map<string, string> info = 2;
}

Unfortunately, using this with cri-o leads to an empty info map and indeed, there's no code to populate that map: https://github.com/kubernetes-incubator/cri-o/blob/master/server/container_status.go#L33

bboreham commented 6 years ago

Looks like https://github.com/kubernetes-incubator/cri-o/blob/master/server/inspect.go#L113 will get it - ContainerInfo will serialise a pid member.

dlespiau commented 6 years ago

I checked the containerd implementation and they do return the pid already: https://github.com/containerd/cri/blob/master/pkg/server/container_status.go#L102.

A reasonable course of action, to me, is to specify the CRI interface a bit more: define a few keys in that map that implementations should populate, taking what containerd does as the reference. We can then open an issue and/or implement that support in cri-o.

Unfortunately minikube+containerd doesn't start for me atm.

dlespiau commented 6 years ago

@bboreham that /info/container/:id endpoint is served over http and not the CRI interface, we currently have no way to discover the HTTP endpoint unfortunately. I guess it could be an extra input parameter though.

We could also use it as a base to implement the CRI info map in ContainerStatusResponse.

dlespiau commented 6 years ago

Opened a cri-o issue so they are at least aware of something we'd love to have: https://github.com/kubernetes-incubator/cri-o/issues/1752.

dlespiau commented 6 years ago

Oh! I was wrong, the http endpoints (well, non-gRPC calls) are available through the CRI socket! There's a client implementation of the container info in https://github.com/kubernetes-incubator/cri-o/blob/master/client/client.go.