weaveworks / scope

Monitoring, visualisation & management for Docker & Kubernetes
https://www.weave.works/oss/scope/
Apache License 2.0

Scope no longer works on Rancher 1.2 (uses CNI networking) #2164

Open johnrengelman opened 7 years ago

johnrengelman commented 7 years ago

With Rancher 1.2, Rancher has moved to CNI for networking, so the container IP is no longer available via docker inspect. This causes the graph to lose connection edges.

I'm curious what the solution for this is. I'm not familiar with the weave scope code, so would it require a custom Rancher plugin for probe to discover that information?

I'd be willing to take a shot at implementing this.

2opremio commented 7 years ago

Thanks for reporting this @johnrengelman !

> the container IP is no longer available via docker inspect

Do you happen to know the reason for this? (I have never personally used Rancher)

I am asking because Scope works with Kubernetes (and CNI plugins), and the containers still show their IP with docker inspect

More importantly, do you know how we could obtain the IPs of the containers?

> would it require a custom Rancher plugin for probe

This is a core piece of the Scope functionality, so I wouldn't put it in a plugin, but directly in Scope.

I am happy to guide you once we figure out how to obtain the IPs (for normal containers this is done here, and the networks from Docker plugins are obtained here)

johnrengelman commented 7 years ago

Rancher is launching the containers with --net=none, so there's no networking information in docker inspect. I'm gonna end up quickly out of my realm so I'll tag some Rancher folks at the end and perhaps they can offer insight.

In Rancher: image

Inspecting that container:

$ docker inspect 41e077596904
[
    {
        "Id": "41e077596904c536724058bbd50ebd087e2acdfef343ee151d3eb3c08ef36a12",
        "Created": "2017-01-27T18:27:16.093204072Z",
        "Path": "/.r/r",
        "Args": [
            "/rancher-entrypoint.sh",
            "/tini",
            "--",
            "healthcheck"
        ],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 8694,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2017-01-27T18:27:16.523591188Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:491349141109e422c7ff9c1c568d3f6d6563e2f5e3667d9171c4407ff9713fd1",
        "ResolvConfPath": "/var/lib/docker/containers/41e077596904c536724058bbd50ebd087e2acdfef343ee151d3eb3c08ef36a12/resolv.conf",
        "HostnamePath": "/var/lib/docker/containers/41e077596904c536724058bbd50ebd087e2acdfef343ee151d3eb3c08ef36a12/hostname",
        "HostsPath": "/var/lib/docker/containers/41e077596904c536724058bbd50ebd087e2acdfef343ee151d3eb3c08ef36a12/hosts",
        "LogPath": "/var/lib/docker/containers/41e077596904c536724058bbd50ebd087e2acdfef343ee151d3eb3c08ef36a12/41e077596904c536724058bbd50ebd087e2acdfef343ee151d3eb3c08ef36a12-json.log",
        "Name": "/r-healthcheck-healthcheck-1-25d1232c",
        "RestartCount": 0,
        "Driver": "overlay2",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [
                "/var/lib/rancher/etc:/var/lib/rancher/etc:ro",
                "rancher-cni:/.r:ro"
            ],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "json-file",
                "Config": {
                    "max-file": "2",
                    "max-size": "25m"
                }
            },
            "NetworkMode": "none",
            "PortBindings": {},
            "RestartPolicy": {
                "Name": "",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "Dns": [
                "169.254.169.250"
            ],
            "DnsOptions": null,
            "DnsSearch": [
                "healthcheck.rancher.internal",
                "healthcheck.healthcheck.rancher.internal",
                "rancher.internal"
            ],
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 0,
            "PidMode": "",
            "Privileged": false,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": null,
            "UTSMode": "",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 2,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [],
            "DiskQuota": 0,
            "KernelMemory": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": -1,
            "OomKillDisable": false,
            "PidsLimit": 0,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0
        },
        "GraphDriver": {
            "Name": "overlay2",
            "Data": {
                "LowerDir": "/var/lib/docker/overlay2/001433bf0ae48aef5d006ebdaa9bf658ee6ed68c2ef70b01b7c9a8fded7c3b84-init/diff:/var/lib/docker/overlay2/68ba11bc79f38b76cb02d62b0748514a39b171bc001b8390a03b936b0fd35d89/diff:/var/lib/docker/overlay2/71a8e59b6ebb3f33e729521765762acd23896ea89ce36a2fe3713fa860183d51/diff:/var/lib/docker/overlay2/d9e4dc65209eaa82ce5b308c82b730c89743e092bbc5a001ec8645964c66cc6f/diff:/var/lib/docker/overlay2/20c32fea6882243aff1617710f592a213c2690f6dba5448491f49922dd7ed6bc/diff:/var/lib/docker/overlay2/a93e47f26a3348f8da1929a405b709f5384a1e6f686a8b6b495bdc9601b73b55/diff:/var/lib/docker/overlay2/bfd693279421982eaae386ba9d29e52d3d67e9adef97560f444b91407e64185f/diff:/var/lib/docker/overlay2/384811119a03eef3a34bf9e67a966bb98386e6c809a5fba3a1094d7be6258f7c/diff:/var/lib/docker/overlay2/1b841b62b96b96155c675bfcd4f4b403029ee101d2d7cb888bebdd04405cbe2f/diff:/var/lib/docker/overlay2/74a98d8e859e4d56acef79d2c3decbb1128e74968e9cd219dcd18c1535513018/diff:/var/lib/docker/overlay2/edf955cad19a5368b293583d965abcc87ade88c15a560f3af9793e8ecaab3836/diff:/var/lib/docker/overlay2/ff7b93537e7531e6a5dbc31b75a7d2e0e1f8366ebcca974072b4f9e02e7df1c9/diff:/var/lib/docker/overlay2/b9b4caa59c1a2904d7967c70bdf322940991960a57db83209e5a9b081e5db64f/diff:/var/lib/docker/overlay2/d112ece2cb05755aaaeb434d359797ffb5ccdd9b76bf800ae03dbae7439a0e3e/diff",
                "MergedDir": "/var/lib/docker/overlay2/001433bf0ae48aef5d006ebdaa9bf658ee6ed68c2ef70b01b7c9a8fded7c3b84/merged",
                "UpperDir": "/var/lib/docker/overlay2/001433bf0ae48aef5d006ebdaa9bf658ee6ed68c2ef70b01b7c9a8fded7c3b84/diff",
                "WorkDir": "/var/lib/docker/overlay2/001433bf0ae48aef5d006ebdaa9bf658ee6ed68c2ef70b01b7c9a8fded7c3b84/work"
            }
        },
        "Mounts": [
            {
                "Type": "bind",
                "Source": "/var/lib/rancher/etc",
                "Destination": "/var/lib/rancher/etc",
                "Mode": "ro",
                "RW": false,
                "Propagation": ""
            },
            {
                "Type": "volume",
                "Name": "rancher-cni",
                "Source": "/var/lib/docker/volumes/rancher-cni/_data",
                "Destination": "/.r",
                "Driver": "local",
                "Mode": "ro",
                "RW": false,
                "Propagation": ""
            }
        ],
        "Config": {
            "Hostname": "41e077596904",
            "Domainname": "",
            "User": "",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [
                "CATTLE_AGENT_INSTANCE_AUTH=Basic OTI3REJCRjMyQUZCRTFGRUY5NEI6bTVyTFo0RmNudjMyN1lIcWFvWVNWZVRvUjNaNFZ6WlNueTc4TUhYTQ==",
                "CATTLE_SECRET_KEY=m5rLZ4Fcnv327YHqaoYSVeToR3Z4VzZSny78MHXM",
                "CATTLE_ACCESS_KEY=927DBBF32AFBE1FEF94B",
                "no_proxy=*.local, 169.254/16",
                "CATTLE_CONFIG_URL=http://192.168.184.155:8080/v1",
                "CATTLE_URL=http://192.168.184.155:8080/v1",
                "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
                "SSL_SCRIPT_COMMIT=98660ada3d800f653fc1f105771b5173f9d1a019",
                "TINI_VERSION=v0.10.0"
            ],
            "Cmd": [
                "healthcheck"
            ],
            "Healthcheck": {},
            "Image": "rancher/healthcheck:v0.2.3",
            "Volumes": {
                "/var/lib/rancher/etc": {}
            },
            "WorkingDir": "",
            "Entrypoint": [
                "/.r/r",
                "/rancher-entrypoint.sh",
                "/tini",
                "--"
            ],
            "OnBuild": null,
            "Labels": {
                "io.rancher.cni.network": "ipsec",
                "io.rancher.cni.wait": "true",
                "io.rancher.container.agent_id": "3",
                "io.rancher.container.create_agent": "true",
                "io.rancher.container.ip": "10.42.140.196/16",
                "io.rancher.container.mac_address": "02:60:71:03:ca:3b",
                "io.rancher.container.name": "healthcheck-healthcheck-1",
                "io.rancher.container.uuid": "25d1232c-6876-41ff-ba02-9a929793382a",
                "io.rancher.project.name": "healthcheck",
                "io.rancher.project_service.name": "healthcheck/healthcheck",
                "io.rancher.scheduler.global": "true",
                "io.rancher.service.deployment.unit": "40c8fd57-3125-4ef1-8d97-b5d443021af5",
                "io.rancher.service.hash": "d0a8fd4061d3b2a8c5782f5563db6df0b25655cb",
                "io.rancher.service.launch.config": "io.rancher.service.primary.launch.config",
                "io.rancher.service.requested.host.id": "1",
                "io.rancher.stack.name": "healthcheck",
                "io.rancher.stack_service.name": "healthcheck/healthcheck"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "904b79ee06edde2a7f07c5a36b202c4349f4d61f55331f6290979e12dd620150",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": {},
            "SandboxKey": "/var/run/docker/netns/904b79ee06ed",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {
                "none": {
                    "IPAMConfig": null,
                    "Links": null,
                    "Aliases": null,
                    "NetworkID": "c6c6b466dc11b6ddd2c85fd9224128f27e2d1208e50e47b0d4decd8d3db197e7",
                    "EndpointID": "1ac346f1b5dcbab62069157cecd6e48374bdfccccfd134f7095521c92468a145",
                    "Gateway": "",
                    "IPAddress": "",
                    "IPPrefixLen": 0,
                    "IPv6Gateway": "",
                    "GlobalIPv6Address": "",
                    "GlobalIPv6PrefixLen": 0,
                    "MacAddress": ""
                }
            }
        }
    }
]

Checking Docker Networks:

$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
2d56e3eb6a6d        bridge              bridge              local
52aba9e00221        host                host                local
c6c6b466dc11        none                null                local

Notice the last one corresponds to the NetworkID from docker inspect.

$ docker network inspect c6c6b466dc11
[
    {
        "Name": "none",
        "Id": "c6c6b466dc11b6ddd2c85fd9224128f27e2d1208e50e47b0d4decd8d3db197e7",
        "Created": "2017-01-26T15:17:22.370980701Z",
        "Scope": "local",
        "Driver": "null",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": []
        },
        "Internal": false,
        "Attachable": false,
        "Containers": {
            "41e077596904c536724058bbd50ebd087e2acdfef343ee151d3eb3c08ef36a12": {
                "Name": "r-healthcheck-healthcheck-1-25d1232c",
                "EndpointID": "1ac346f1b5dcbab62069157cecd6e48374bdfccccfd134f7095521c92468a145",
                "MacAddress": "",
                "IPv4Address": "",
                "IPv6Address": ""
            },
            "4325ee9993ff277188a29d560aa43074814fcf1f23eed33da1a987db8c9bfb4f": {
                "Name": "r-scheduler-scheduler-1-789b7587",
                "EndpointID": "db40a6e645d28c6890b1c286e9de746945b6fdd586d0cb94dfcb554741370d72",
                "MacAddress": "",
                "IPv4Address": "",
                "IPv6Address": ""
            },
            "857a424c2332930858c75df8c39cdd3dc061d5ad886c370906dd30926ee3876a": {
                "Name": "r-ipsec-ipsec-1-5d4973c7",
                "EndpointID": "1b383dfd02e8601e375d066f2204f922a6e5c1e263bd2e844262cd6ea49ea307",
                "MacAddress": "",
                "IPv4Address": "",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]

So, I'm not sure what else I can do from that point. There's an ipsec-cni-driver container (from Rancher) running, which I'm guessing does something here.

/cc @vincent99 @ibuildthecloud

2opremio commented 7 years ago

Related: https://github.com/weaveworks/scope/issues/1550

Answering myself:

> I am asking because Scope works with Kubernetes (and CNI plugins), and the containers still show their IP with docker inspect

I think we've only ever tested the weave CNI plugin and we use other means to obtain the IPs in that case.

2opremio commented 7 years ago

Aha! Lacking a generic way to obtain the container IPs and Networks from CNI, this should be good enough:

$ docker inspect 41e077596904
[...]
                "io.rancher.container.ip": "10.42.140.196/16",

2opremio commented 7 years ago

@johnrengelman If you want to give it a try, in order to add the rancher IPs and Networks you would have to:

  1. Extend NetworkingInfo() to add the IP from the io.rancher.container.ip label (10.42.140.196 in the example above)
  2. Extend overlayTopology() to incorporate the networks found in all the Rancher containers (transforming 10.42.140.196/16 from CIDR notation leads to a canonical network IP 10.42.0.0 with netmask 255.255.0.0).

EDIT: You will also need to incorporate a new prefix for Rancher (report.RancherOverlayPeerPrefix) and extend the parsing in ParseOverlayNodeID().

2opremio commented 7 years ago

An alternative approach (suggested by @bboreham), which could more generically cover CNI plugins on all platforms, would be to exec into all the networking namespaces. Quoting him:

> No, there is no 'GET' function for CNI [...] We could extract roughly the code that does that; it just walks through every namespace. You could write code for the common cases - Flannel, calico, whatever - to look for their characteristics. In 95% of cases what you're looking for is a veth called eth0.

2opremio commented 7 years ago

@johnrengelman Want to give it a try?

johnrengelman commented 7 years ago

@2opremio Yeah, I'm going to put it on my list for taking a crack at this week.

2opremio commented 7 years ago

Fantastic! Let me know if you need any help.
