monitoringartist / zabbix-docker-monitoring

:whale: Docker/Kubernetes/Mesos/Marathon/Chronos/LXC/LXD/Swarm container monitoring - Docker image, Zabbix template and C module
https://hub.docker.com/r/monitoringartist/zabbix-agent-xxl-limited/
GNU General Public License v2.0

Metric file paths in Kubernetes pod containers are not found. #94

Closed miaofenk closed 6 years ago

miaofenk commented 6 years ago

Hi,

I created a pod with a zabbix-agent container in a Kubernetes environment. In the container, I followed the instructions to load zabbix_module_docker.so, but when I checked the log, I found the following error:

    59:20180403:024305.142 Requested [docker.cpu[62f9ae6a06325a47436c150221be6968a43ce43998be2b1d237b9ce13b7b6358,total]]
    59:20180403:024305.142 In zbx_module_docker_cpu()
    59:20180403:024305.142 In zbx_module_docker_get_fci()
    59:20180403:024305.142 Original full container id will be used
    59:20180403:024305.142 Metric source file: /sys/fs/cgroup/cpu,cpuacct/system.slice/docker-62f9ae6a06325a47436c150221be6968a43ce43998be2b1d237b9ce13b7b6358.scope/cpuacct.stat
    59:20180403:024305.142 Cannot open metric file: '/sys/fs/cgroup/cpu,cpuacct/system.slice/docker-62f9ae6a06325a47436c150221be6968a43ce43998be2b1d237b9ce13b7b6358.scope/cpuacct.stat'
    59:20180403:024305.143 Sending back [ZBX_NOTSUPPORTED: Cannot open cpuacct.stat file]

When I searched under /sys/fs/cgroup, I found files for the same container ID under kubepods.slice:

    ./cpu,cpuacct/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podd6d5290c_364b_11e8_8545_fa163ed7124a.slice/docker-62f9ae6a06325a47436c150221be6968a43ce43998be2b1d237b9ce13b7b6358.scope/cpu.stat
    ./cpu,cpuacct/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podd6d5290c_364b_11e8_8545_fa163ed7124a.slice/docker-62f9ae6a06325a47436c150221be6968a43ce43998be2b1d237b9ce13b7b6358.scope/cpu.cfs_period_us

...

How can I modify the module so that it supports a Kubernetes environment?

Thanks a lot!

jangaraj commented 6 years ago

It looks like your Docker uses --cgroup-parent configuration. See #85.

miaofenk commented 6 years ago

Jangaraj,

It's very similar, but our issue is more complicated.

As you can see from the full file path, the "besteffort" in kubepods-besteffort.slice/ comes from the pod's QoS class, and kubepods-besteffort-podd6d5290c_364b_11e8_8545_fa163ed7124a.slice/ contains the pod UID.

These values cannot be known before the pod is created.
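
Since these path components only exist once the pod is running, one possible workaround is to look the container up in the cgroup tree at runtime instead of building the path in advance. The following is only a minimal sketch in Python (the module itself is written in C, and this is not its code); the function name find_container_cgroup is hypothetical, and the docker-<id>.scope directory naming is assumed from the paths shown in this issue.

```python
import os


def find_container_cgroup(controller_root, container_id):
    """Return the cgroup directory of a container, or None if it is absent.

    Assumption: the container's cgroup directory is named
    "docker-<full container id>.scope", as in the paths quoted in this issue.
    """
    target = f"docker-{container_id}.scope"
    # Walk the controller hierarchy top-down and stop at the first match.
    for dirpath, dirnames, _filenames in os.walk(controller_root):
        if target in dirnames:
            return os.path.join(dirpath, target)
    return None
```

For example, find_container_cgroup("/sys/fs/cgroup/cpu,cpuacct", full_container_id) would return the matching docker-....scope directory wherever it sits in the hierarchy, at the cost of a directory walk on every lookup unless the result is cached.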

jangaraj commented 6 years ago

Could you please provide the docker inspect output for your container in the pod?

miaofenk commented 6 years ago

I recreated the pod.

[
    {
        "Id": "a433198674c9220d436fb62b9cb32834d447fa2f54643a481fa0c43850aa972b",
        "Created": "2018-04-03T09:49:48.894165411Z",
        "Path": "/usr/bin/entry.sh",
        "Args": [
            "/usr/bin/supervisord"
        ],
        "State": {
            "Status": "running",
            "Running": true,
            "Paused": false,
            "Restarting": false,
            "OOMKilled": false,
            "Dead": false,
            "Pid": 968,
            "ExitCode": 0,
            "Error": "",
            "StartedAt": "2018-04-03T09:49:49.027749798Z",
            "FinishedAt": "0001-01-01T00:00:00Z"
        },
        "Image": "sha256:aa29dc85ca8f9939c3367a7596ba53c03550b52bec63e1b5c56ab48adb1c5ae1",
        "ResolvConfPath": "/data0/docker/containers/2ebf32dce8d0da2d59d4054393157e6c61149182f9696b1e21ffbbf67472bf6d/resolv.conf",
        "HostnamePath": "/data0/docker/containers/2ebf32dce8d0da2d59d4054393157e6c61149182f9696b1e21ffbbf67472bf6d/hostname",
        "HostsPath": "/var/lib/kubelet/pods/58d7c41b-3724-11e8-8545-fa163ed7124a/etc-hosts",
        "LogPath": "",
        "Name": "/k8s_zabbix-agent_zabbix-agent-deployment-tv27r_default_58d7c41b-3724-11e8-8545-fa163ed7124a_0",
        "RestartCount": 0,
        "Driver": "overlay2",
        "MountLabel": "",
        "ProcessLabel": "",
        "AppArmorProfile": "",
        "ExecIDs": null,
        "HostConfig": {
            "Binds": [
                "/sys:/sys",
                "/var/lib/kubelet/pods/58d7c41b-3724-11e8-8545-fa163ed7124a/volumes/kubernetes.io~secret/default-token-cf2tf:/var/run/secrets/kubernetes.io/serviceaccount:ro,Z",
                "/var/lib/kubelet/pods/58d7c41b-3724-11e8-8545-fa163ed7124a/etc-hosts:/etc/hosts:Z",
                "/var/lib/kubelet/pods/58d7c41b-3724-11e8-8545-fa163ed7124a/containers/zabbix-agent/c7b11e02:/dev/termination-log:Z"
            ],
            "ContainerIDFile": "",
            "LogConfig": {
                "Type": "journald",
                "Config": {}
            },
            "NetworkMode": "container:2ebf32dce8d0da2d59d4054393157e6c61149182f9696b1e21ffbbf67472bf6d",
            "PortBindings": null,
            "RestartPolicy": {
                "Name": "",
                "MaximumRetryCount": 0
            },
            "AutoRemove": false,
            "VolumeDriver": "",
            "VolumesFrom": null,
            "CapAdd": null,
            "CapDrop": null,
            "Dns": null,
            "DnsOptions": null,
            "DnsSearch": null,
            "ExtraHosts": null,
            "GroupAdd": null,
            "IpcMode": "container:2ebf32dce8d0da2d59d4054393157e6c61149182f9696b1e21ffbbf67472bf6d",
            "Cgroup": "",
            "Links": null,
            "OomScoreAdj": 1000,
            "PidMode": "",
            "Privileged": true,
            "PublishAllPorts": false,
            "ReadonlyRootfs": false,
            "SecurityOpt": [
                "seccomp=unconfined",
                "label=disable"
            ],
            "UTSMode": "host",
            "UsernsMode": "",
            "ShmSize": 67108864,
            "Runtime": "docker-runc",
            "ConsoleSize": [
                0,
                0
            ],
            "Isolation": "",
            "CpuShares": 2,
            "Memory": 0,
            "NanoCpus": 0,
            "CgroupParent": "kubepods-besteffort-pod58d7c41b_3724_11e8_8545_fa163ed7124a.slice",
            "BlkioWeight": 0,
            "BlkioWeightDevice": null,
            "BlkioDeviceReadBps": null,
            "BlkioDeviceWriteBps": null,
            "BlkioDeviceReadIOps": null,
            "BlkioDeviceWriteIOps": null,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpuRealtimePeriod": 0,
            "CpuRealtimeRuntime": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "Devices": [],
            "DiskQuota": 0,
            "KernelMemory": 0,
            "MemoryReservation": 0,
            "MemorySwap": 0,
            "MemorySwappiness": -1,
            "OomKillDisable": false,
            "PidsLimit": 0,
            "Ulimits": null,
            "CpuCount": 0,
            "CpuPercent": 0,
            "IOMaximumIOps": 0,
            "IOMaximumBandwidth": 0
        },
        "GraphDriver": {
            "Name": "overlay2",
            "Data": {
                "LowerDir": "/data0/docker/overlay2/f84bd667b6e71d8b866583156e5e446c4adfb30153892652b255bdd805f7d0cd-init/diff:/data0/docker/overlay2/a05389739dbe9b4c8ac6b97d277cdccf16884e5ce57a10cad72da4d9d59edeb2/diff:/data0/docker/overlay2/63f342c8c2aa861a5954f70cfb545252fac3e88fab02d734a6ff08708c6a7b31/diff:/data0/docker/overlay2/afb6b353587ff82d6399134db0b2980ffcb748b432d423e3d2149010e9148022/diff:/data0/docker/overlay2/6683d7eb823e43a91dcb3d22c0d34faf335d0cc0d2af01bee23344d8d8862d87/diff:/data0/docker/overlay2/d0f075545ab0f35f5dabe05a0bcee165b48a2509e188d9b84cc944f004273ec5/diff:/data0/docker/overlay2/3ad06662ac181ad6e857566b7a4d81540fab6f205fa6ba5de133e900da3a3449/diff:/data0/docker/overlay2/b9bec8b86d004860bb9dd68d6daa405e8c7238a98b80e0b416c456fe498bc39e/diff:/data0/docker/overlay2/1d65be77a9663b379edcd034de2628a9f29e973a50e3c91d44ccb6e6496555cb/diff:/data0/docker/overlay2/106a00cc16dc1f0b40ea3169c2f31622fdf18db9146112b99c33f1cd06c2d974/diff:/data0/docker/overlay2/940d3fe72009164e162dfa1823356ef6bd09e83b27d30f7137710610e0d88842/diff:/data0/docker/overlay2/3a864d37812447c1d2230cddf97f066788c09f3064f4f5a4f181e7205b501a6a/diff:/data0/docker/overlay2/cff7c7de314ee898c0edeac51974b5af29e18ed9c15f41a81c206468f80fad17/diff:/data0/docker/overlay2/c11998ed50dfc26421b6242361ae59c643ee5bac6ce2ce0f38d208bc72fd34e5/diff",
                "MergedDir": "/data0/docker/overlay2/f84bd667b6e71d8b866583156e5e446c4adfb30153892652b255bdd805f7d0cd/merged",
                "UpperDir": "/data0/docker/overlay2/f84bd667b6e71d8b866583156e5e446c4adfb30153892652b255bdd805f7d0cd/diff",
                "WorkDir": "/data0/docker/overlay2/f84bd667b6e71d8b866583156e5e446c4adfb30153892652b255bdd805f7d0cd/work"
            }
        },
        "Mounts": [
            {
                "Type": "bind",
                "Source": "/sys",
                "Destination": "/sys",
                "Mode": "",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Type": "bind",
                "Source": "/var/lib/kubelet/pods/58d7c41b-3724-11e8-8545-fa163ed7124a/volumes/kubernetes.io~secret/default-token-cf2tf",
                "Destination": "/var/run/secrets/kubernetes.io/serviceaccount",
                "Mode": "ro,Z",
                "RW": false,
                "Propagation": "rprivate"
            },
            {
                "Type": "bind",
                "Source": "/var/lib/kubelet/pods/58d7c41b-3724-11e8-8545-fa163ed7124a/etc-hosts",
                "Destination": "/etc/hosts",
                "Mode": "Z",
                "RW": true,
                "Propagation": "rprivate"
            },
            {
                "Type": "bind",
                "Source": "/var/lib/kubelet/pods/58d7c41b-3724-11e8-8545-fa163ed7124a/containers/zabbix-agent/c7b11e02",
                "Destination": "/dev/termination-log",
                "Mode": "Z",
                "RW": true,
                "Propagation": "rprivate"
            }
        ],
        "Config": {
            "Hostname": "shenyonh-bcmt-worker-01",
            "Domainname": "",
            "User": "0",
            "AttachStdin": false,
            "AttachStdout": false,
            "AttachStderr": false,
            "Tty": false,
            "OpenStdin": false,
            "StdinOnce": false,
            "Env": [

            ],
            "Cmd": [
                "/usr/bin/entry.sh",
                "/usr/bin/supervisord"
            ],
            "Healthcheck": {
                "Test": [
                    "NONE"
                ]
            },
            "ArgsEscaped": true,
            "Image": "172.16.1.107:5000/zabbix-agent@sha256:ab4526b246cc964273e03d6ed8a317f6ced2f2c1c53e18be7b8eb40def0da53e",
            "Volumes": null,
            "WorkingDir": "",
            "Entrypoint": null,
            "OnBuild": null,
            "Labels": {
                "annotation.io.kubernetes.container.hash": "5d084df1",
                "annotation.io.kubernetes.container.ports": "[{\"hostPort\":10050,\"containerPort\":10050,\"protocol\":\"TCP\"}]",
                "annotation.io.kubernetes.container.restartCount": "0",
                "annotation.io.kubernetes.container.terminationMessagePath": "/dev/termination-log",
                "annotation.io.kubernetes.container.terminationMessagePolicy": "File",
                "annotation.io.kubernetes.pod.terminationGracePeriod": "30",
                "io.kubernetes.container.logpath": "/var/log/pods/58d7c41b-3724-11e8-8545-fa163ed7124a/zabbix-agent_0.log",
                "io.kubernetes.container.name": "zabbix-agent",
                "io.kubernetes.docker.type": "container",
                "io.kubernetes.pod.name": "zabbix-agent-deployment-tv27r",
                "io.kubernetes.pod.namespace": "default",
                "io.kubernetes.pod.uid": "58d7c41b-3724-11e8-8545-fa163ed7124a",
                "io.kubernetes.sandbox.id": "2ebf32dce8d0da2d59d4054393157e6c61149182f9696b1e21ffbbf67472bf6d"
            }
        },
        "NetworkSettings": {
            "Bridge": "",
            "SandboxID": "",
            "HairpinMode": false,
            "LinkLocalIPv6Address": "",
            "LinkLocalIPv6PrefixLen": 0,
            "Ports": null,
            "SandboxKey": "",
            "SecondaryIPAddresses": null,
            "SecondaryIPv6Addresses": null,
            "EndpointID": "",
            "Gateway": "",
            "GlobalIPv6Address": "",
            "GlobalIPv6PrefixLen": 0,
            "IPAddress": "",
            "IPPrefixLen": 0,
            "IPv6Gateway": "",
            "MacAddress": "",
            "Networks": {}
        }
    }
]

jangaraj commented 6 years ago

Yes, CgroupParent is there:

"CgroupParent": "kubepods-besteffort-pod58d7c41b_3724_11e8_8545_fa163ed7124a.slice",

It's the same problem as #85. There is a WIP branch: https://github.com/monitoringartist/zabbix-docker-monitoring/tree/cgroup-parent. The module will read the CgroupParent value from the Docker API and use it to construct the correct cgroup file path. PRs are welcome.
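
To illustrate the idea of building the path from CgroupParent, here is a rough sketch in Python. It is not the code from the cgroup-parent branch (the module is written in C); the function names expand_slice and metric_path are hypothetical. It relies on the systemd convention that a slice named a-b.slice lives in a.slice/a-b.slice, and it assumes that an empty CgroupParent means system.slice, as the path in the original log suggests.

```python
import posixpath


def expand_slice(slice_name):
    """Expand a systemd slice name into its nested cgroup directory path.

    Each dash-separated prefix of the name is a parent slice, so
    "a-b-c.slice" becomes "a.slice/a-b.slice/a-b-c.slice".
    """
    suffix = ".slice"
    if not slice_name.endswith(suffix):
        raise ValueError(f"not a slice name: {slice_name!r}")
    parts = slice_name[: -len(suffix)].split("-")
    levels = ["-".join(parts[:i]) + suffix for i in range(1, len(parts) + 1)]
    return "/".join(levels)


def metric_path(controller_root, cgroup_parent, container_id, metric):
    """Build the metric file path for a container from its CgroupParent."""
    # Assumption: with no explicit parent, the container sits in system.slice.
    parent = cgroup_parent or "system.slice"
    return posixpath.join(
        controller_root,
        expand_slice(parent),
        f"docker-{container_id}.scope",
        metric,
    )
```

With the CgroupParent value reported by docker inspect, metric_path("/sys/fs/cgroup/cpu,cpuacct", cgroup_parent, container_id, "cpuacct.stat") yields a path under kubepods.slice/kubepods-besteffort.slice/ rather than under system.slice.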

miaofenk commented 6 years ago

@jangaraj,

Thanks a lot! I will try this version.

jangaraj commented 6 years ago

It's WIP (work in progress). At the moment it won't work without additional development and testing.

miaofenk commented 6 years ago

I see. I will check the code first and test if it works.

miaofenk commented 6 years ago

@jangaraj May I ask whether the cgroup-parent branch code targets a specific OS or Zabbix version? We are using CentOS 7 and Zabbix 3.0.13; do I need to do extra development to adapt it to our environment?

jangaraj commented 6 years ago

No and yes. No: the module code is independent of the OS and Zabbix version. Yes: there is currently a bug, https://github.com/monitoringartist/zabbix-docker-monitoring/issues/89, which you will hit with Zabbix 3.0.

miaofenk commented 6 years ago

All right. It seems the bug has not been fixed yet. Do you have any suggestions for fixing it before compilation?

jangaraj commented 6 years ago

nope