vivanov-dp / zabbix-docker-template

Zabbix 5.x docker template for agent ver. 1, with containers and images LLD

Key docker.api Unsupported item key #3

Closed asm19 closed 4 years ago

asm19 commented 4 years ago

Hello,

I can't figure out what the problem is. I configured everything as described in the Setup section, but the key docker.api[{$DOCKER.SOCKET},_ping] gives the error "Unsupported item key". I haven't changed the default value of the macro: /var/run/docker.sock

image

Procedure:

  1. cp docker_template.conf /etc/zabbix/zabbix_agentd.d/
  2. sudo usermod -a -G docker zabbix
  3. restart Zabbix Agent
  4. import docker_template_agent1.xml into Zabbix templates. This will create a template named Template App Docker - Agent 1
  5. attach the new template to hosts to start monitoring
  6. configure macros as needed. {$DOCKER.SOCKET} has to point to docker's unix socket (default is /var/run/docker.sock)

image

Can you help?

Thank you

vivanov-dp commented 4 years ago

Hi @asm19

This is really strange, since everything else seems to work. Can you tell me which version of docker you are using? Also try the following command on the host you are monitoring, as user zabbix if possible, and paste the result here: curl -s --unix-socket /var/run/docker.sock http://latest/_ping

asm19 commented 4 years ago

It's ok with user zabbix:

image

Docker version 19.03.13, build 4484c46d9d

My zabbix version: 5.0.4

vivanov-dp commented 4 years ago

Alright, the docker API call works, so the problem is somewhere within the Zabbix setup.

There are log files on both the client and the server, usually named /var/log/zabbix/zabbix_agentd.log and /var/log/zabbix/zabbix_server.log. Could you look for anything that mentions the item Docker: Ping, or a message like "became unsupported"? The reason should be given there.
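A minimal sketch of that log search, assuming the default server log path mentioned above and that state changes are logged with the wording "became not supported". The function name unsupported_items is made up for illustration.

```shell
# Hypothetical helper: list items and discovery rules that the Zabbix server
# marked as unsupported, together with the reason it logged.
# $1 is the server log path; the default is the usual location named above.
unsupported_items() {
    log="${1:-/var/log/zabbix/zabbix_server.log}"
    # Keep only the state-change lines, then strip the "pid:date:time " prefix.
    grep 'became not supported' "$log" | sed 's/^[0-9]*:[0-9]*:[0-9.]* //'
}
```

Calling `unsupported_items` with no argument reads the default path; pass another path if your server logs elsewhere.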

asm19 commented 4 years ago

Already checked, nothing "obvious":

Zabbix server, /var/log/zabbix/zabbix_server.log:

863410:20201007:204545.747 discovery rule "XXXXXXXXXX:docker.containers.list[{$DOCKER.SOCKET}]" became not supported: Unsupported item key.
863411:20201007:204545.747 discovery rule "XXXXXXXXXX:docker.api[{$DOCKER.SOCKET},images/json]" became not supported: Unsupported item key.
863249:20201007:204636.443 item "XXXXXXXXXX:docker.api[{$DOCKER.SOCKET},_ping]" became not supported: Unsupported item key.
863252:20201007:204640.458 item "XXXXXXXXXX:docker.api[{$DOCKER.SOCKET},info]" became not supported: Unsupported item key.
863250:20201007:204641.465 item "XXXXXXXXXX:docker.api[{$DOCKER.SOCKET},system/df]" became not supported: Unsupported item key.
863250:20201007:204641.465 item "XXXXXXXXXX:docker.api[{$DOCKER.SOCKET},version]" became not supported: Unsupported item key.

On the client, no log entries mention "docker".

asm19 commented 4 years ago

I uncommented this line in the agent configuration:

UnsafeUserParameters=1

Now it works, but there are other problems:

863251:20201007:205626.029 item "XXXXXXXXXX:docker.api[{$DOCKER.SOCKET},_ping]" became supported
863252:20201007:205639.755 item "XXXXXXXXXX:docker.api[{$DOCKER.SOCKET},info]" became supported
863252:20201007:205639.755 item "XXXXXXXXXX:docker.root_dir" became not supported: Preprocessing failed for:
863252:20201007:205639.755 item "XXXXXXXXXX:docker.nfd" became not supported: Preprocessing failed for:
863252:20201007:205639.755 item "XXXXXXXXXX:docker.containers.total" became not supported: Preprocessing failed for:
863252:20201007:205639.755 item "XXXXXXXXXX:docker.name" became not supported: Preprocessing failed for:
863252:20201007:205639.755 item "XXXXXXXXXX:docker.info.text" became not supported: Preprocessing failed for:
863252:20201007:205639.755 item "XXXXXXXXXX:docker.goroutines" became not supported: Preprocessing failed for:
863252:20201007:205639.755 item "XXXXXXXXXX:docker.debug.enabled" became not supported: Preprocessing failed for:
863252:20201007:205639.755 item "XXXXXXXXXX:docker.containers.stopped" became not supported: Preprocessing failed for:
863252:20201007:205639.755 item "XXXXXXXXXX:docker.containers.running" became not supported: Preprocessing failed for:
863252:20201007:205639.755 item "XXXXXXXXXX:docker.containers.paused" became not supported: Preprocessing failed for:
863250:20201007:205640.756 item "XXXXXXXXXX:docker.api[{$DOCKER.SOCKET},system/df]" became supported
863250:20201007:205640.756 item "XXXXXXXXXX:docker.volumes_size" became not supported: Preprocessing failed for:
863250:20201007:205640.756 item "XXXXXXXXXX:docker.images_size" became not supported: Preprocessing failed for:
863250:20201007:205640.756 item "XXXXXXXXXX:docker.containers_size" became not supported: Preprocessing failed for:
863250:20201007:205640.756 item "XXXXXXXXXX:docker.layers_size" became not supported: Preprocessing failed for:
863250:20201007:205641.766 item "XXXXXXXXXX:docker.info.version_text" became not supported: Preprocessing failed for:
863250:20201007:205641.766 item "XXXXXXXXXX:docker.version" became not supported: Preprocessing failed for:
863250:20201007:205641.766 item "XXXXXXXXXX:docker.api_version" became not supported: Preprocessing failed for:
863250:20201007:205641.766 item "XXXXXXXXXX:docker.api[{$DOCKER.SOCKET},version]" became supported

Example: image

vivanov-dp commented 4 years ago

Hmm .. yes, UnsafeUserParameters=1 might be required - I'm running with it switched on and haven't considered it at all, since I always check any parameters and mods that I add to the zabbix client from external sources.
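One way to confirm that the option is really active (set and not commented out) is a small check like the sketch below. The default path /etc/zabbix/zabbix_agentd.conf and the function name are assumptions; adjust them to your installation.

```shell
# Hypothetical check: succeeds if UnsafeUserParameters=1 appears on an
# uncommented line of the given agent config file.
# The default path is an assumption, not something confirmed in this thread.
unsafe_params_enabled() {
    conf="${1:-/etc/zabbix/zabbix_agentd.conf}"
    grep -Eq '^[[:space:]]*UnsafeUserParameters[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$conf"
}
```

For example, `unsafe_params_enabled && echo enabled`. The function looks at a single file only, so a setting placed in an included file would need a separate call. The agent reads its configuration at startup, so restart it after changing the option.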

The message you get now means that Zabbix is not receiving proper JSON objects from the docker API. Can you get the JSON value from one of the misbehaving items, for example Template App Docker - Agent 1 - Gamesrv: Docker: Get info? Go to the item in the host configuration (not the template), click Test, then Get value.

items

Check whether the received value is valid JSON and contains the properties mentioned in the error messages. The Get info item should return something like this:

{
    "ID": "KVIT:YKKL:DBX7:YTM2:SSHB:BFLQ:GX5J:MP37:QXNY:SRRY:PCN2:6DQH",
    "Containers": 13,
    "ContainersRunning": 13,
    "ContainersPaused": 0,
    "ContainersStopped": 0,
    "Images": 6,
    "Driver": "overlay2",
    "DriverStatus": [
        [
            "Backing Filesystem",
            "extfs"
        ],
        [
            "Supports d_type",
            "true"
        ],
        [
            "Native Overlay Diff",
            "true"
        ]
    ],
    "SystemStatus": null,
    "Plugins": {
        "Volume": [
            "local"
        ],
        "Network": [
            "bridge",
            "host",
            "macvlan",
            "null",
            "overlay"
        ],
        "Authorization": null,
        "Log": [
            "awslogs",
            "fluentd",
            "gcplogs",
            "gelf",
            "journald",
            "json-file",
            "local",
            "logentries",
            "splunk",
            "syslog"
        ]
    },
    "MemoryLimit": true,
    "SwapLimit": true,
    "KernelMemory": true,
    "CpuCfsPeriod": true,
    "CpuCfsQuota": true,
    "CPUShares": true,
    "CPUSet": true,
    "IPv4Forwarding": true,
    "BridgeNfIptables": true,
    "BridgeNfIp6tables": true,
    "Debug": false,
    "NFd": 131,
    "OomKillDisable": true,
    "NGoroutines": 152,
    "SystemTime": "2020-10-09T08:27:40.106333171Z",
    "LoggingDriver": "json-file",
    "CgroupDriver": "cgroupfs",
    "NEventsListener": 1,
    "OSType": "linux",
    "Architecture": "x86_64",
    "IndexServerAddress": "https://index.docker.io/v1/",
    "RegistryConfig": {
        "AllowNondistributableArtifactsCIDRs": [],
        "AllowNondistributableArtifactsHostnames": [],
        "InsecureRegistryCIDRs": [
            "127.0.0.0/8"
        ],
        "IndexConfigs": {
            "docker.io": {
                "Name": "docker.io",
                "Mirrors": [],
                "Secure": true,
                "Official": true
            }
        },
        "Mirrors": []
    },
    "NCPU": 2,
    "MemTotal": 8141873152,
    "GenericResources": null,
    "DockerRootDir": "/var/lib/docker",
    "HttpProxy": "",
    "HttpsProxy": "",
    "NoProxy": "",
    "Name": "ip-10-10-3-95.ec2.internal",
    "Labels": [],
    "ExperimentalBuild": false,
    "ServerVersion": "18.09.9-ce",
    "ClusterStore": "",
    "ClusterAdvertise": "",
    "Runtimes": {
        "runc": {
            "path": "runc"
        }
    },
    "DefaultRuntime": "runc",
    "Swarm": {
        "NodeID": "",
        "NodeAddr": "",
        "LocalNodeState": "inactive",
        "ControlAvailable": false,
        "Error": "",
        "RemoteManagers": null
    },
    "LiveRestoreEnabled": false,
    "Isolation": "",
    "InitBinary": "docker-init",
    "ContainerdCommit": {
        "ID": "894b81a4b802e4eb2a91d1ce216b8817763c29fb",
        "Expected": "894b81a4b802e4eb2a91d1ce216b8817763c29fb"
    },
    "RuncCommit": {
        "ID": "2b18fe1d885ee5083ef9f0838fee39b62d653e30",
        "Expected": "2b18fe1d885ee5083ef9f0838fee39b62d653e30"
    },
    "InitCommit": {
        "ID": "fec3683",
        "Expected": "fec3683"
    },
    "SecurityOptions": [
        "name=seccomp,profile=default"
    ],
    "Warnings": null
}

You can check it vs the docker API directly by using curl: curl -s --unix-socket /var/run/docker.sock http://latest/info for example, but I suspect that works just fine and the problem is with something in the Zabbix configuration.
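That comparison can be sketched as a small filter. It only tests for an empty reply and for the presence of a few property names taken from the sample output above; it does not parse JSON. Which properties the template's dependent items actually read is an assumption here, and check_info_reply is a made-up name.

```shell
# Hypothetical helper: read an API reply on stdin and report whether it is
# empty or lacks properties that appear in the sample "info" output above.
# Assumption: the listed names are the ones the dependent items need.
check_info_reply() {
    reply=$(cat)
    if [ -z "$reply" ]; then
        echo "empty reply"
        return 1
    fi
    status=0
    for key in DockerRootDir NFd Containers Name NGoroutines Debug; do
        # A plain text match on "key": is enough for a quick sanity check.
        if ! printf '%s' "$reply" | grep -q "\"$key\"[[:space:]]*:"; then
            echo "missing: $key"
            status=1
        fi
    done
    return $status
}
```

Usage: `curl -s --unix-socket /var/run/docker.sock http://latest/info | check_info_reply`. No output and a zero exit status mean the reply was non-empty and every listed name was found.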

asm19 commented 4 years ago

Hello @vivanov-dp,

In Discovery rules, if I run one, for example "Template App Docker - Agent 1: Containers discovery", I get these errors:

image

In the Items, for example for "Template App Docker - Agent 1: Docker: Get info: Docker: Name":

image

It seems that Zabbix cannot generate the JSON. If I run curl -s --unix-socket /var/run/docker.sock http://latest/info on the client host, it works without problems.

image

vivanov-dp commented 4 years ago

Can you try it with zabbix-get ? On your server (or the proxy, if you use one): zabbix_get -s IP.IP.IP.IP -k docker.api[/var/run/docker.sock,info]

asm19 commented 4 years ago

No errors, but no result either:

image

vivanov-dp commented 4 years ago

This means that the Zabbix agent doesn't return the value at all - it should look exactly the same as when you call curl directly. Try to find what is wrong with the agent .. I will try to think of something to help later, if you don't find a fix in the meantime

asm19 commented 4 years ago

It's strange, because I have the latest version:

image

Passive mode.

Scandinav21 commented 4 years ago

I have a similar problem. All items are unsupported, and ping reports down.

zabbix_get -s IP.IP.IP.IP -k docker.api[/var/run/docker.sock,info] returns nothing

zabbix agent is the latest version

asm19 commented 4 years ago

Yes, I have a similar problem.

vivanov-dp commented 4 years ago

Hi @Scandinav21, @asm19. It's some kind of configuration problem on the host that's running the Zabbix agent. Are you sure you have added user zabbix to the docker group and that /var/run/docker.sock is readable by the group?

Test with sudo -u zabbix curl -s --unix-socket /var/run/docker.sock http://latest/info ( @asm19 : curl -u zabbix ... does something different - it authenticates with user zabbix to the server it is contacting (docker API in this case), but what you want to do is run the curl process as user zabbix, sorry for not catching that earlier )

If this returns a result, zabbix_get should also return it
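The group membership part of that check can be sketched as follows. in_group is a made-up helper; the user and group names in the usage line come from the setup steps.

```shell
# Hypothetical helper: succeed if the space-separated group list in $1
# contains the group named in $2 as a whole word.
in_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Usage: `in_group "$(id -nG zabbix)" docker && echo "zabbix is in the docker group"`, and `ls -l /var/run/docker.sock` to see the socket's group and mode. A running process keeps the group list it started with, which is presumably why the setup steps restart the agent after usermod.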

asm19 commented 4 years ago

@vivanov-dp

image

With zabbix_get -s IP.IP.IP.IP -k docker.api[/var/run/docker.sock,info] from zabbix server I don't have any result.

vivanov-dp commented 4 years ago

Is it possible that zabbix-agent runs as a different user on your system? I'm unable to replicate your problem, and if that's not it, I'm out of ideas ...

asm19 commented 4 years ago

Run as root on the Zabbix server:

image

On the Docker server:

image

I can't understand the problem either.

Scandinav21 commented 4 years ago

Acknowledged, @asm19. sudo -u zabbix curl -s --unix-socket /var/run/docker.sock http://latest/info returns data, so that part is OK, and zabbix_agent is running as the zabbix user. So I'm out of ideas too.

@asm19 Which OS are you running? I'm using CentOS 7 and 8.

asm19 commented 4 years ago

@Scandinav21 CentOS Linux release 8.2.2004 (Core)

Scandinav21 commented 4 years ago

@asm19 I've found that when zabbix_agent runs as root, the issue goes away. So maybe the zabbix user lacks some permission.

UPD. It looks like I've found a workaround:

  1. start zabbix_agent as root
  2. undo the previous change and restart zabbix_agent as the zabbix user, and it works,

but maybe somebody can find the root cause.

asm19 commented 4 years ago

@Scandinav21 Did you change it on the Zabbix server or on the Docker server? I first tried it on the Docker server, then on the Zabbix server, both times without result.

Scandinav21 commented 4 years ago

@asm19 I've found the solution. In my case it was SELinux. I found this in the audit logs: "SELinux is preventing /usr/bin/curl from connectto access on the unix_stream_socket /run/docker.sock". Then I ran sealert -a /var/log/audit/audit.log and found the solution:

ausearch -c 'curl' --raw | audit2allow -M my-curl
semodule -i my-curl.pp

And it works for me.
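Before generating a policy module like this, it can help to look at which denials would go into it. A rough sketch, assuming the audit log path used above and the usual "avc: denied" wording of audit records; curl_denials is a hypothetical name.

```shell
# Hypothetical helper: print AVC denial records for the curl command from an
# audit log, so they can be reviewed before building a policy module.
# $1 is the log path; the default is the one passed to sealert above.
curl_denials() {
    log="${1:-/var/log/audit/audit.log}"
    grep 'avc:[[:space:]]*denied' "$log" | grep 'comm="curl"'
}
```

Run it as root, since the audit log is normally not readable by other users. `getenforce` shows whether SELinux is currently enforcing, and the my-curl.te file that audit2allow -M writes next to my-curl.pp lists the rules the module will allow.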

asm19 commented 4 years ago

In zabbix server, right? I will check

Scandinav21 commented 4 years ago

In zabbix server, right?

On the server where docker and zabbix_agent are running.

asm19 commented 4 years ago

You are right, the problem is in SELinux. I solved the problem with your suggestion.

vivanov-dp commented 4 years ago

@Scandinav21 Great work! Thanks for sharing your solution

vivanov-dp commented 4 years ago

I'm closing this, since the problem seems to have been solved. I'm sorry I can't confirm it on my side, but all hosts I have are running without SELinux. I have left a link to this issue in the documentation, so people can find it.