agent status
Getting the status from the agent.
===============
Agent (v7.18.1)
===============
Status date: 2020-04-27 18:55:55.977281 UTC
Agent start: 2020-04-27 18:32:05.470769 UTC
Pid: 370
Go Version: go1.12.9
Python Version: 3.8.1
Build arch: amd64
Check Runners: 4
Log Level: info
Paths
=====
Config File: /etc/datadog-agent/datadog.yaml
conf.d: /etc/datadog-agent/conf.d
checks.d: /etc/datadog-agent/checks.d
Clocks
======
System UTC time: 2020-04-27 18:55:55.977281 UTC
Host Info
=========
bootTime: 2020-02-05 22:21:02.000000 UTC
kernelVersion: 5.5.0-1.el7.elrepo.x86_64
os: linux
platform: debian
platformFamily: debian
platformVersion: bullseye/sid
procs: 73
uptime: 1964h11m9s
virtualizationRole: guest
virtualizationSystem: docker
Hostnames
=========
host_aliases: [den-iac-opstest-kube-node01.ops-test.clh-int.com]
hostname: den-iac-opstest-kube-node01.ops-test.clh-int.com
socket-fqdn: 7b78fcc56df2
socket-hostname: 7b78fcc56df2
host tags:
environment:ops-test
owner:iac
agent_type:node
docker_swarm_node_role:manager
hostname provider: container
unused hostname providers:
aws: not retrieving hostname from AWS: the host is not an ECS instance, and other providers already retrieve non-default hostnames
configuration/environment: hostname is empty
gce: unable to retrieve hostname from GCE: Get http://169.254.169.254/computeMetadata/v1/instance/hostname: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
Metadata
========
hostname_source: container
=========
Collector
=========
Running Checks
==============
cpu
---
Instance ID: cpu [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/cpu.d/conf.yaml.default
Total Runs: 95
Metric Samples: Last Run: 6, Total: 564
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2020-04-27 18:55:41.000000 UTC
Last Successful Execution Date : 2020-04-27 18:55:41.000000 UTC
disk (2.7.0)
------------
Instance ID: disk:e5dffb8bef24336f [ERROR]
Configuration Source: file:/etc/datadog-agent/conf.d/disk.d/conf.yaml.default
Total Runs: 95
Metric Samples: Last Run: 640, Total: 60,800
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 249ms
Last Execution Date : 2020-04-27 18:55:48.000000 UTC
Last Successful Execution Date : Never
Error: not sure how to interpret line ' 8 0 sda 45066 9000 10004532 512126 138429438 193746244 3432370168 440317261 0 127455390 368574478 0 0 0 0 0 0\n'
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py", line 713, in run
self.check(instance)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/disk/disk.py", line 121, in check
self.collect_latency_metrics()
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/disk/disk.py", line 244, in collect_latency_metrics
for disk_name, disk in iteritems(psutil.disk_io_counters(True)):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/psutil/__init__.py", line 2168, in disk_io_counters
rawdict = _psplatform.disk_io_counters(**kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/psutil/_pslinux.py", line 1125, in disk_io_counters
for entry in gen:
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/psutil/_pslinux.py", line 1098, in read_procfs
raise ValueError("not sure how to interpret line %r" % line)
ValueError: not sure how to interpret line ' 8 0 sda 45066 9000 10004532 512126 138429438 193746244 3432370168 440317261 0 127455390 368574478 0 0 0 0 0 0\n'
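For context, the ValueError above is a /proc/diskstats format issue rather than anything specific to the upgrade: the quoted line has 20 fields, because kernel 5.5 appends discard and flush counters, while the psutil bundled with this Agent build appears to accept only 14- or 18-field lines. A minimal sketch of a parser that tolerates the extra trailing counters (field layout per the kernel's iostats documentation; the helper itself is hypothetical, not the Agent's code):

# Hypothetical sketch: parse a 20-field kernel 5.5 /proc/diskstats line by
# ignoring the trailing discard/flush counters instead of rejecting the line.
line = (" 8 0 sda 45066 9000 10004532 512126 138429438 193746244 "
        "3432370168 440317261 0 127455390 368574478 0 0 0 0 0 0\n")

def parse_diskstats_line(line):
    fields = line.split()
    if len(fields) < 14:
        raise ValueError("not sure how to interpret line %r" % line)
    name = fields[2]
    # Columns 4-14 per Documentation/admin-guide/iostats.rst; newer kernels
    # append discard (4) and flush (2) counters, which are simply ignored here.
    (reads, reads_merged, rsect, rtime_ms, writes, writes_merged,
     wsect, wtime_ms, in_flight, io_time_ms, weighted_io_ms) = map(int, fields[3:14])
    return {"name": name, "reads": reads, "writes": writes,
            "read_sectors": rsect, "write_sectors": wsect,
            "read_time_ms": rtime_ms, "write_time_ms": wtime_ms}

print(parse_diskstats_line(line))  # parses despite the six extra trailing fields

Newer psutil releases accept these longer lines, so the disk failure is likely independent of the proxy problem affecting the kubelet check below.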
docker
------
Instance ID: docker [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/docker.d/conf.yaml.default
Total Runs: 95
Metric Samples: Last Run: 355, Total: 33,685
Events: Last Run: 0, Total: 3
Service Checks: Last Run: 1, Total: 95
Average Execution Time : 166ms
Last Execution Date : 2020-04-27 18:55:55.000000 UTC
Last Successful Execution Date : 2020-04-27 18:55:55.000000 UTC
file_handle
-----------
Instance ID: file_handle [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/file_handle.d/conf.yaml.default
Total Runs: 95
Metric Samples: Last Run: 5, Total: 475
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2020-04-27 18:55:47.000000 UTC
Last Successful Execution Date : 2020-04-27 18:55:47.000000 UTC
io
--
Instance ID: io [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/io.d/conf.yaml.default
Total Runs: 95
Metric Samples: Last Run: 52, Total: 4,904
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2020-04-27 18:55:54.000000 UTC
Last Successful Execution Date : 2020-04-27 18:55:54.000000 UTC
kubelet (3.6.0)
---------------
Instance ID: kubelet:d884b5186b651429 [ERROR]
Configuration Source: file:/etc/datadog-agent/conf.d/kubelet.d/conf.yaml.default
Total Runs: 95
Metric Samples: Last Run: 19, Total: 1,805
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 4, Total: 380
Average Execution Time : 7.356s
Last Execution Date : 2020-04-27 18:55:53.000000 UTC
Last Successful Execution Date : Never
Error: HTTPSConnectionPool(host='169.254.1.1', port=10250): Max retries exceeded with url: /metrics/cadvisor (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 503 Service Unavailable')))
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 662, in urlopen
self._prepare_proxy(conn)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 948, in _prepare_proxy
conn.connect()
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connection.py", line 308, in connect
self._tunnel()
File "/opt/datadog-agent/embedded/lib/python3.8/http/client.py", line 898, in _tunnel
raise OSError("Tunnel connection failed: %d %s" % (code,
OSError: Tunnel connection failed: 503 Service Unavailable
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
resp = conn.urlopen(
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 719, in urlopen
retries = retries.increment(
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/retry.py", line 436, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='169.254.1.1', port=10250): Max retries exceeded with url: /metrics/cadvisor (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 503 Service Unavailable')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py", line 713, in run
self.check(instance)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/kubelet/kubelet.py", line 349, in check
self.process(self.cadvisor_scraper_config, metric_transformers=self.transformers)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/openmetrics/mixins.py", line 443, in process
for metric in self.scrape_metrics(scraper_config):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/openmetrics/mixins.py", line 401, in scrape_metrics
response = self.poll(scraper_config)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/openmetrics/mixins.py", line 605, in poll
response = self.send_request(endpoint, scraper_config, headers)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/openmetrics/mixins.py", line 631, in send_request
return http_handler.get(endpoint, stream=True, **kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 277, in get
return self._request('get', url, options)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 319, in _request
return getattr(requests, method)(url, **new_options)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py", line 75, in get
return request('get', url, params=params, **kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py", line 510, in send
raise ProxyError(e, request=request)
requests.exceptions.ProxyError: HTTPSConnectionPool(host='169.254.1.1', port=10250): Max retries exceeded with url: /metrics/cadvisor (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 503 Service Unavailable')))
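The kubelet failure above is a different problem: the request to the kubelet at 169.254.1.1:10250 is being routed through an HTTP proxy, and the proxy's CONNECT tunnel answers 503, so the check never reaches /metrics/cadvisor. Assuming the proxy comes from the Agent's proxy settings (an assumption here; it could equally come from HTTPS_PROXY in the container environment), a sketch of excluding the kubelet's link-local address in datadog.yaml, with placeholder values:

# datadog.yaml (sketch; <PROXY_HOST>:<PORT> are placeholders)
proxy:
  https: "http://<PROXY_HOST>:<PORT>"
  http: "http://<PROXY_HOST>:<PORT>"
  no_proxy:
    # Talk to the kubelet directly instead of tunnelling through the proxy.
    - 169.254.1.1

If the proxy is injected via environment variables instead, adding the same address to NO_PROXY should have the equivalent effect.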
load
----
Instance ID: load [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/load.d/conf.yaml.default
Total Runs: 95
Metric Samples: Last Run: 6, Total: 570
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2020-04-27 18:55:53.000000 UTC
Last Successful Execution Date : 2020-04-27 18:55:53.000000 UTC
memory
------
Instance ID: memory [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/memory.d/conf.yaml.default
Total Runs: 95
Metric Samples: Last Run: 17, Total: 1,615
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2020-04-27 18:55:45.000000 UTC
Last Successful Execution Date : 2020-04-27 18:55:45.000000 UTC
network (1.14.0)
----------------
Instance ID: network:e0204ad63d43c949 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/network.d/conf.yaml.default
Total Runs: 95
Metric Samples: Last Run: 37, Total: 3,329
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 7ms
Last Execution Date : 2020-04-27 18:55:52.000000 UTC
Last Successful Execution Date : 2020-04-27 18:55:52.000000 UTC
ntp
---
Instance ID: ntp:d884b5186b651429 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/ntp.d/conf.yaml.default
Total Runs: 2
Metric Samples: Last Run: 0, Total: 0
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 2
Average Execution Time : 20.026s
Last Execution Date : 2020-04-27 18:47:31.000000 UTC
Last Successful Execution Date : 2020-04-27 18:47:31.000000 UTC
uptime
------
Instance ID: uptime [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/uptime.d/conf.yaml.default
Total Runs: 95
Metric Samples: Last Run: 1, Total: 95
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2020-04-27 18:55:44.000000 UTC
Last Successful Execution Date : 2020-04-27 18:55:44.000000 UTC
========
JMXFetch
========
Initialized checks
==================
no checks
Failed checks
=============
no checks
=========
Forwarder
=========
Transactions
============
CheckRunsV1: 95
Dropped: 0
DroppedOnInput: 0
Events: 0
HostMetadata: 0
IntakeV1: 12
Metadata: 0
Requeued: 0
Retried: 0
RetryQueueSize: 0
Series: 0
ServiceChecks: 0
SketchSeries: 0
Success: 202
TimeseriesV1: 95
API Keys status
===============
API key ending with e724a: API Key valid
==========
Endpoints
==========
https://app.datadoghq.com - API Key ending with:
- e724a
==========
Logs Agent
==========
Logs Agent is not running
=========
Aggregator
=========
Checks Metric Sample: 109,745
Dogstatsd Metric Sample: 6,786
Event: 4
Events Flushed: 4
Number Of Flushes: 95
Series Flushed: 54,308
Service Check: 1,429
Service Checks Flushed: 1,519
=========
DogStatsD
=========
Event Packets: 0
Event Parse Errors: 0
Metric Packets: 6,785
Metric Parse Errors: 0
Service Check Packets: 0
Service Check Parse Errors: 0
Udp Bytes: 435,340
Udp Packet Reading Errors: 0
Udp Packets: 6,786
Uds Bytes: 0
Uds Origin Detection Errors: 0
Uds Packet Reading Errors: 0
Uds Packets: 0
=====================
Datadog Cluster Agent
=====================
- Datadog Cluster Agent endpoint detected: https://datadog-cluster-agent.ops-test.kube.ch.int
Successfully connected to the Datadog Cluster Agent.
- Running: 1.5.2+commit.60ee741
Describe what happened:
When I upgraded my Datadog Agent from 6.13.0 to 7.18.1, the kubelet and kubernetes_state checks started failing with 503 Service Unavailable Python errors. When I revert back to 6.13.0, the checks work fine. I also see the disk check failing, but that one is broken in 6.13.0 too.
Describe what you expected:
The kubelet, disk, and kubernetes_state checks to be OK.
Steps to reproduce the issue:
Configure the kubelet and kubernetes_state checks on Agent 7.18.1.
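For reference, a minimal sketch of the configuration assumed in these steps: the kubelet check here runs from the bundled conf.yaml.default (see the Configuration Source lines above), and the kubernetes_state check only needs a kube-state-metrics endpoint, shown with a placeholder:

# conf.d/kubernetes_state.d/conf.yaml (sketch; <KSM_SERVICE> is a placeholder
# for the kube-state-metrics service reachable from this node)
init_config:

instances:
  - kube_state_url: "http://<KSM_SERVICE>:8080/metrics"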
Output of the info page (if this is a bug):
See the agent status output at the top of this report.
Additional environment details (Operating System, Cloud provider, etc):