`sudo datadog-agent status` output
```text
Getting the status from the agent.
===============
Agent (v7.45.0)
===============
Status date: 2023-06-11 19:15:26.58 UTC (1686510926580)
Agent start: 2023-06-11 19:08:45.735 UTC (1686510525735)
Pid: 4671
Go Version: go1.19.9
Python Version: 3.8.16
Build arch: amd64
Agent flavor: agent
Check Runners: 4
Log Level: info
Paths
=====
Config File: /etc/datadog-agent/datadog.yaml
conf.d: /etc/datadog-agent/conf.d
checks.d: /etc/datadog-agent/checks.d
Clocks
======
NTP offset: 52.850731s
System time: 2023-06-11 19:15:26.58 UTC (1686510926580)
Host Info
=========
bootTime: 2021-10-19 16:39:37 UTC (1634661577000)
hostId: 4a1fa099-0726-4010-92eb-a6c5169d705f
kernelArch: x86_64
kernelVersion: 3.10.0-1160.11.1.el7.x86_64
os: linux
platform: centos
platformFamily: rhel
platformVersion: 7.9.2009
procs: 460
uptime: 14402h29m9s
Hostnames
=========
ec2-hostname: ip-10-100-10-17.us-east-2.compute.internal
host_aliases: [i-XXXXXXXXXXXXXXXXX]
hostname: server1
instance-id: i-XXXXXXXXXXXXXXXXX
socket-fqdn: server1.local.
socket-hostname: server1
hostname provider: os
unused hostname providers:
'hostname' configuration/environment: hostname is empty
'hostname_file' configuration/environment: 'hostname_file' configuration is not enabled
aws: not retrieving hostname from AWS: the host is not an ECS instance and other providers already retrieve non-default hostnames
azure: azure_hostname_style is set to 'os'
container: the agent is not containerized
fargate: agent is not runnning on Fargate
fqdn: 'hostname_fqdn' configuration is not enabled
gce: unable to retrieve hostname from GCE: GCE metadata API error: status code 404 trying to GET http://169.254.169.254/computeMetadata/v1/instance/hostname
Metadata
========
agent_version: 7.45.0
cloud_provider: AWS
config_apm_dd_url:
config_dd_url:
config_logs_dd_url:
config_logs_socks5_proxy_address:
config_no_proxy: [169.254.169.254 100.100.100.200]
config_process_dd_url:
config_proxy_http:
config_proxy_https:
config_site:
feature_apm_enabled: true
feature_cspm_enabled: false
feature_cws_enabled: false
feature_dynamic_instrumentation_enabled: false
feature_enable_http_stats_by_status_code: false
feature_logs_enabled: true
feature_networks_enabled: false
feature_networks_http_enabled: false
feature_networks_https_enabled: false
feature_otlp_enabled: false
feature_process_enabled: true
feature_processes_container_enabled: true
feature_remote_configuration_enabled: false
feature_usm_go_tls_enabled: false
feature_usm_http2_enabled: false
feature_usm_java_tls_enabled: false
feature_usm_kafka_enabled: false
flavor: agent
hostname_source: os
install_method_installer_version: datadog_formula-3.5
install_method_tool: saltstack
install_method_tool_version: saltstack-3005
logs_transport: HTTP
=========
Collector
=========
Running Checks
==============
consul (2.2.0)
--------------
Instance ID: consul:dd70a33f647dcf20 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/consul.d/conf.yaml
Total Runs: 27
Metric Samples: Last Run: 1, Total: 27
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 2, Total: 56
Average Execution Time : 11ms
Last Execution Date : 2023-06-11 19:15:17 UTC (1686510917000)
Last Successful Execution Date : 2023-06-11 19:15:17 UTC (1686510917000)
metadata:
version.major: 1
version.minor: 9
version.patch: 1
version.raw: 1.9.1
version.scheme: semver
cpu
---
Instance ID: cpu [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/cpu.d/conf.yaml.default
Total Runs: 27
Metric Samples: Last Run: 9, Total: 236
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2023-06-11 19:15:24 UTC (1686510924000)
Last Successful Execution Date : 2023-06-11 19:15:24 UTC (1686510924000)
disk (4.9.0)
------------
Instance ID: disk:67cc0574430a16ba [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/disk.d/conf.yaml.default
Total Runs: 26
Metric Samples: Last Run: 276, Total: 7,176
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 30ms
Last Execution Date : 2023-06-11 19:15:16 UTC (1686510916000)
Last Successful Execution Date : 2023-06-11 19:15:16 UTC (1686510916000)
file_handle
-----------
Instance ID: file_handle [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/file_handle.d/conf.yaml.default
Total Runs: 27
Metric Samples: Last Run: 5, Total: 135
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2023-06-11 19:15:23 UTC (1686510923000)
Last Successful Execution Date : 2023-06-11 19:15:23 UTC (1686510923000)
io
--
Instance ID: io [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/io.d/conf.yaml.default
Total Runs: 26
Metric Samples: Last Run: 197, Total: 4,987
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2023-06-11 19:15:15 UTC (1686510915000)
Last Successful Execution Date : 2023-06-11 19:15:15 UTC (1686510915000)
load
----
Instance ID: load [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/load.d/conf.yaml.default
Total Runs: 27
Metric Samples: Last Run: 6, Total: 162
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2023-06-11 19:15:22 UTC (1686510922000)
Last Successful Execution Date : 2023-06-11 19:15:22 UTC (1686510922000)
memory
------
Instance ID: memory [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/memory.d/conf.yaml.default
Total Runs: 26
Metric Samples: Last Run: 20, Total: 520
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2023-06-11 19:15:14 UTC (1686510914000)
Last Successful Execution Date : 2023-06-11 19:15:14 UTC (1686510914000)
network (2.9.4)
---------------
Instance ID: network:4b0649b7e11f0772 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/network.d/conf.yaml.default
Total Runs: 27
Metric Samples: Last Run: 81, Total: 2,187
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 2ms
Last Execution Date : 2023-06-11 19:15:21 UTC (1686510921000)
Last Successful Execution Date : 2023-06-11 19:15:21 UTC (1686510921000)
ntp
---
Instance ID: ntp:3c427a42a70bbf8 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/ntp.d/conf.yaml.default
Total Runs: 1
Metric Samples: Last Run: 1, Total: 1
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 1
Average Execution Time : 0s
Last Execution Date : 2023-06-11 19:08:47 UTC (1686510527000)
Last Successful Execution Date : 2023-06-11 19:08:47 UTC (1686510527000)
uptime
------
Instance ID: uptime [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/uptime.d/conf.yaml.default
Total Runs: 26
Metric Samples: Last Run: 1, Total: 26
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 0, Total: 0
Average Execution Time : 0s
Last Execution Date : 2023-06-11 19:15:13 UTC (1686510913000)
Last Successful Execution Date : 2023-06-11 19:15:13 UTC (1686510913000)
========
JMXFetch
========
Information
==================
Initialized checks
==================
no checks
Failed checks
=============
no checks
=========
Forwarder
=========
Transactions
============
Cluster: 0
ClusterRole: 0
ClusterRoleBinding: 0
CronJob: 0
CustomResource: 0
CustomResourceDefinition: 0
DaemonSet: 0
Deployment: 0
Dropped: 0
HighPriorityQueueFull: 0
Ingress: 0
Job: 0
Namespace: 0
Node: 0
OrchestratorManifest: 0
PersistentVolume: 0
PersistentVolumeClaim: 0
Pod: 0
ReplicaSet: 0
Requeued: 0
Retried: 0
RetryQueueSize: 0
Role: 0
RoleBinding: 0
Service: 0
ServiceAccount: 0
StatefulSet: 0
VerticalPodAutoscaler: 0
Transaction Successes
=====================
Total number: 56
Successes By Endpoint:
check_run_v1: 26
intake: 3
metadata_v1: 1
series_v2: 26
On-disk storage
===============
On-disk storage is disabled. Configure `forwarder_storage_max_size_in_bytes` to enable it.
API Keys status
===============
API key ending with d7289: API Key valid
==========
Endpoints
==========
https://app.datadoghq.com - API Key ending with:
- d7289
==========
Logs Agent
==========
Reliable: Sending compressed logs in HTTPS to agent-http-intake.logs.datadoghq.com on port 443
BytesSent: 357
EncodedBytesSent: 282
LogsProcessed: 1
LogsSent: 1
CoreAgentProcessOpenFiles: 24
OSFileLimit: 4096
consul
------
- Type: file
Path: /var/log/consul/*.log
Service: consul
Source: consul
Status: OK
1 files tailed out of 1 files matching
Inputs:
/var/log/consul/consul-1686503969895345311.log
Bytes Read: 160
Pipeline Latency:
Average Latency (ms): 0
24h Average Latency (ms): 0
Peak Latency (ms): 0
24h Peak Latency (ms): 0
=============
Process Agent
=============
Version: 7.45.0
Status date: 2023-06-11 19:15:26.893 UTC (1686510926893)
Process Agent Start: 2023-06-11 19:08:45.856 UTC (1686510525856)
Pid: 4673
Go Version: go1.19.9
Build arch: amd64
Log Level: info
Enabled Checks: [process rtprocess]
Allocated Memory: 20,552,296 bytes
Hostname: server1
System Probe Process Module Status: Not running
=================
Process Endpoints
=================
https://process.datadoghq.com - API Key ending with:
- d7289
=========
Collector
=========
Last collection time: 2023-06-11 19:15:17
Docker socket:
Number of processes: 325
Number of containers: 0
Process Queue length: 0
RTProcess Queue length: 0
Connections Queue length: 0
Event Queue length: 0
Pod Queue length: 0
Process Bytes enqueued: 0
RTProcess Bytes enqueued: 0
Connections Bytes enqueued: 0
Event Bytes enqueued: 0
Pod Bytes enqueued: 0
Drop Check Payloads: []
=========
APM Agent
=========
Status: Running
Pid: 4674
Uptime: 401 seconds
Mem alloc: 9,314,488 bytes
Hostname: server1
Receiver: localhost:8126
Endpoints:
https://trace.agent.datadoghq.com
Receiver (previous minute)
==========================
No traces received in the previous minute.
Writer (previous minute)
========================
Traces: 0 payloads, 0 traces, 0 events, 0 bytes
Stats: 0 payloads, 0 stats buckets, 0 bytes
==========
Aggregator
==========
Checks Metric Sample: 15,963
Dogstatsd Metric Sample: 4,253
Event: 1
Events Flushed: 1
Number Of Flushes: 26
Series Flushed: 14,020
Service Check: 297
Service Checks Flushed: 315
=========
DogStatsD
=========
Event Packets: 0
Event Parse Errors: 0
Metric Packets: 4,252
Metric Parse Errors: 0
Service Check Packets: 0
Service Check Parse Errors: 0
Udp Bytes: 376,444
Udp Packet Reading Errors: 0
Udp Packets: 2,470
Uds Bytes: 0
Uds Origin Detection Errors: 0
Uds Packet Reading Errors: 0
Uds Packets: 0
Unterminated Metric Errors: 0
====
OTLP
====
Status: Not enabled
Collector status: Not running
```
Additional environment details (Operating System, Cloud provider, etc):
This is reproduced on a CentOS 7 box in AWS with a Consul cluster. The actual behavior described is likely not specific to any of those details.
Steps to reproduce the issue:
1. Point a check at an HTTPS server that (a) is not on port 443 and (b) does not have an SSL certificate that immediately validates against the system-default SSL CAs.
   - In this case, the check is the consul check, which runs over HTTPS against port 8501, using Consul's internal CA and automatic certificate distribution.
2. POSSIBLY OPTIONAL: specify a (not useful) client certificate.
   - In this case, the client certificate is set in `conf.d/consul.d/conf.yaml` with `tls_cert` set to the Consul service mesh CA certificate (this was a configuration error, but the error itself was not the cause of the issue; see the additional details).
3. Restart the agent and observe the logs.
Describe the results you received:
The datadog agent log contains the error:
2023-06-11 18:21:08 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:123 in LogMessage) | consul:7cfafeeca0bbeaaa | (http.py:464) | Error occurred while connecting to socket to discover intermediate certificates: [Errno 111] Connection refused
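which, upon investigation, appears to be because `fetch_intermediate_certs` always connects to port 443 (7.45.x, currently at 044247efccff3bcdf0ae19b5481879c151f87814): https://github.com/DataDog/integrations-core/blob/044247efccff3bcdf0ae19b5481879c151f87814/datadog_checks_base/datadog_checks/base/utils/http.py#L457-L466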
In a situation where fetching intermediate certificates would have been effective but the server in question is on a nonstandard HTTPS port, this would fail.
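The `[Errno 111] Connection refused` part of that log line is what any client sees on Linux when it connects to a TCP port that nothing is listening on, which is the case for port 443 on a host that serves HTTPS only on 8501. A minimal sketch that reproduces just that symptom outside the Agent; `try_connect` and `unused_local_port` are hypothetical helpers for illustration, not part of `datadog_checks_base`:

```python
import errno
import socket


def try_connect(host, port, timeout=2.0):
    """Attempt a plain TCP connection; return None on success, else the OSError raised."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return None
    except OSError as exc:
        return exc


def unused_local_port():
    """Ask the OS for a free port, then release it so nothing is listening there."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


if __name__ == "__main__":
    # Stand-in for "port 443 on a host that only serves HTTPS on 8501".
    err = try_connect("127.0.0.1", unused_local_port())
    # errno.ECONNREFUSED is 111 on Linux, matching the Agent log line.
    print(type(err).__name__, err.errno == errno.ECONNREFUSED)
```

The error says nothing about certificates: the TCP connection never gets far enough for a TLS handshake to start.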
Describe the results you expected:
No `Error occurred while connecting to socket` message should show up in the logs when it is not relevant to the situation.
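In this situation, it looks to me like `fetch_intermediate_certs` should support an optional port, and at least its immediate use in `make_request_aia_chasing` in the same file (7.45.x, currently at 044247efccff3bcdf0ae19b5481879c151f87814: https://github.com/DataDog/integrations-core/blob/044247efccff3bcdf0ae19b5481879c151f87814/datadog_checks_base/datadog_checks/base/utils/http.py#L423-L430) should pass in the port for the URL in question.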
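A minimal sketch of the kind of change suggested here, assuming the port can be taken from the request URL with the standard library's `urlparse`. The names `host_and_port` and `fetch_leaf_cert_der` are hypothetical, and this is not the actual `fetch_intermediate_certs` implementation:

```python
import socket
import ssl
from urllib.parse import urlparse


def host_and_port(url, default_port=443):
    """Return (hostname, port) for a URL, falling back to default_port."""
    parsed = urlparse(url)
    # parsed.port is None when the URL carries no explicit port.
    return parsed.hostname, parsed.port or default_port


def fetch_leaf_cert_der(url, timeout=5.0):
    """Hypothetical helper: connect to the URL's own port and return the
    server's leaf certificate in DER form."""
    host, port = host_and_port(url)
    context = ssl.create_default_context()
    # Verification is disabled here only so the certificate can be inspected;
    # check_hostname must be turned off before verify_mode is relaxed.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls_sock:
            return tls_sock.getpeercert(binary_form=True)


if __name__ == "__main__":
    print(host_and_port("https://localhost:8501/v1/agent/self"))
    print(host_and_port("https://example.com/"))
```

With the Consul URL from this report, `host_and_port` yields port 8501 instead of 443, so the follow-up connection would reach the same listener the original request used.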
Additional information you deem important (e.g. issue happens only occasionally):
This bug was discovered due to a config file typo: we had intended to specify `tls_ca_cert` but had specified `tls_cert` instead. The typo is only relevant inasmuch as it revealed the bug; once we changed the config file to specify `tls_ca_cert`, everything functioned as expected. However, it made our configuration issue more confusing to track down, since the errors were
ssl.SSLError: [SSL] PEM lib (_ssl.c:4067)
```text
2023-06-11 18:21:08 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:123 in LogMessage) | consul:7cfafeeca0bbeaaa | (consul.py:154) | Consul request to https://localhost:8501/v1/agent/self failed
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
conn.connect()
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connection.py", line 414, in connect
self.sock = ssl_wrap_socket(
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 418, in ssl_wrap_socket
context.load_cert_chain(certfile, keyfile)
ssl.SSLError: [SSL] PEM lib (_ssl.c:4067)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py", line 489, in send
resp = conn.urlopen(
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='localhost', port=8501): Max retries exceeded with url: /v1/agent/self (Caused by SSLError(SSLError(9, '[SSL] PEM lib (_ssl.c:4067)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py", line 141, in consul_request
resp = self.http.get(url)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 355, in get
return self._request('get', url, options)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 419, in _request
response = self.make_request_aia_chasing(request_method, method, url, new_options, persist)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 432, in make_request_aia_chasing
raise e
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 425, in make_request_aia_chasing
response = request_method(url, **new_options)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py", line 563, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='localhost', port=8501): Max retries exceeded with url: /v1/agent/self (Caused by SSLError(SSLError(9, '[SSL] PEM lib (_ssl.c:4067)')))
```
as opposed to the much more transparent error message if tls_cert is not specified:
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)
```text
2023-06-11 19:26:37 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:130 in LogMessage) | consul:c4992031b651f3c8 | (http.py:464) | Error occurred while connecting to socket to discover intermediate certificates: [Errno 111] Connection refused
2023-06-11 19:26:37 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:130 in LogMessage) | consul:c4992031b651f3c8 | (consul.py:154) | Consul request to https://127.0.0.1:8501/v1/agent/self failed
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 714, in urlopen
httplib_response = self._make_request(
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 403, in _make_request
self._validate_conn(conn)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 1053, in _validate_conn
conn.connect()
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connection.py", line 419, in connect
self.sock = ssl_wrap_socket(
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 453, in ssl_wrap_socket
ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 495, in _ssl_wrap_socket_impl
return ssl_context.wrap_socket(sock)
File "/opt/datadog-agent/embedded/lib/python3.8/ssl.py", line 500, in wrap_socket
return self.sslsocket_class._create(
File "/opt/datadog-agent/embedded/lib/python3.8/ssl.py", line 1040, in _create
self.do_handshake()
File "/opt/datadog-agent/embedded/lib/python3.8/ssl.py", line 1309, in do_handshake
self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py", line 489, in send
resp = conn.urlopen(
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 798, in urlopen
retries = retries.increment(
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='127.0.0.1', port=8501): Max retries exceeded with url: /v1/agent/self (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)')))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py", line 141, in consul_request
resp = self.http.get(url)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 355, in get
return self._request('get', url, options)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 419, in _request
response = self.make_request_aia_chasing(request_method, method, url, new_options, persist)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 432, in make_request_aia_chasing
raise e
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 425, in make_request_aia_chasing
response = request_method(url, **new_options)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py", line 587, in request
resp = self.send(prep, **send_kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py", line 701, in send
r = adapter.send(request, **kwargs)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py", line 563, in send
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='127.0.0.1', port=8501): Max retries exceeded with url: /v1/agent/self (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)')))
```
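The `PEM lib` error in the first traceback is raised by `context.load_cert_chain(certfile, keyfile)`, that is, while loading the file given as `tls_cert`, before any connection is made. A pre-flight check along the following lines could surface that kind of misconfiguration with a clearer message. This is a sketch only; `check_client_cert` is a hypothetical helper, not something the Agent or `datadog_checks_base` provides:

```python
import ssl


def check_client_cert(certfile, keyfile=None):
    """Return None if the client cert/key pair loads, else a readable error string."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    try:
        # Same call that fails in the traceback; it needs a certificate AND
        # its private key (in certfile itself or in keyfile).
        context.load_cert_chain(certfile, keyfile)
    except OSError as exc:
        # ssl.SSLError and FileNotFoundError are both OSError subclasses.
        return f"{certfile!r} is not usable as a client certificate: {exc}"
    return None


if __name__ == "__main__":
    import sys

    for path in sys.argv[1:]:
        print(check_client_cert(path) or f"{path!r} loads as a client certificate")
```

A CA certificate without a matching private key, as in the typo described above, is expected to be reported by this check rather than accepted.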
Steps to reproduce the issue:
Configure `conf.d/consul.d/conf.yaml` with `tls_cert` set to the Consul service mesh CA certificate (this was a configuration error, but the configuration error itself was not the cause of the issue; see the additional details).
Describe the results you received:
The Datadog Agent log contains an `Error occurred while connecting to socket` error which, upon investigation, appears to be because `fetch_intermediate_certs` always connects to port 443 (7.45.x, currently at 044247efccff3bcdf0ae19b5481879c151f87814): https://github.com/DataDog/integrations-core/blob/044247efccff3bcdf0ae19b5481879c151f87814/datadog_checks_base/datadog_checks/base/utils/http.py#L457-L466
In a situation where fetching intermediate certificates would have been effective but the server in question is on a nonstandard HTTPS port, this would fail.
Describe the results you expected:
No `Error occurred while connecting to socket` should show up in the logs when that is not relevant to the situation.
In this situation, it looks to me like `fetch_intermediate_certs` should support an optional port, and at least its immediate use in `make_request_aia_chasing` in the same file should pass in the port for the URL in question (7.45.x, currently at 044247efccff3bcdf0ae19b5481879c151f87814): https://github.com/DataDog/integrations-core/blob/044247efccff3bcdf0ae19b5481879c151f87814/datadog_checks_base/datadog_checks/base/utils/http.py#L423-L430
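To illustrate the suggestion, here is a minimal sketch of how the host and port could be derived from the request URL before the intermediate certificates are fetched. It is not the integrations-core implementation: `host_and_port` and `DEFAULT_HTTPS_PORT` are hypothetical names introduced only for this example, and it uses nothing beyond the standard library's `urllib.parse`.

```python
from urllib.parse import urlsplit

DEFAULT_HTTPS_PORT = 443  # hypothetical constant: fallback when the URL names no port


def host_and_port(url, default_port=DEFAULT_HTTPS_PORT):
    """Return (hostname, port) for url.

    Hypothetical helper for illustration; it is not part of datadog_checks.
    """
    parts = urlsplit(url)
    # urlsplit reports port as None when the URL has no explicit port.
    port = parts.port if parts.port is not None else default_port
    return parts.hostname, port


# The Consul URL from the logs keeps its nonstandard port.
print(host_and_port("https://localhost:8501/v1/agent/self"))
# A URL without an explicit port falls back to 443.
print(host_and_port("https://app.datadoghq.com"))
```

With something along these lines, the caller could hand both values to the certificate-fetching code instead of only the hostname, so a server on port 8501 would be contacted on 8501 rather than 443.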
Additional information you deem important (e.g. issue happens only occasionally):
This bug was discovered due to a config file typo; we had intended to specify `tls_ca_cert` but had specified `tls_cert` instead. This config file typo is only relevant inasmuch as it revealed the bug; when we changed the config file to specify `tls_ca_cert`, everything functions as expected. However, it was more confusing to track down our configuration issue, since the errors were `ssl.SSLError: [SSL] PEM lib (_ssl.c:4067)`:
```text
2023-06-11 18:21:08 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:123 in LogMessage) | consul:7cfafeeca0bbeaaa | (consul.py:154) | Consul request to https://localhost:8501/v1/agent/self failed
Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 703, in urlopen
    httplib_response = self._make_request(
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 386, in _make_request
    self._validate_conn(conn)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
    conn.connect()
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connection.py", line 414, in connect
    self.sock = ssl_wrap_socket(
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 418, in ssl_wrap_socket
    context.load_cert_chain(certfile, keyfile)
ssl.SSLError: [SSL] PEM lib (_ssl.c:4067)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 787, in urlopen
    retries = retries.increment(
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='localhost', port=8501): Max retries exceeded with url: /v1/agent/self (Caused by SSLError(SSLError(9, '[SSL] PEM lib (_ssl.c:4067)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py", line 141, in consul_request
    resp = self.http.get(url)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 355, in get
    return self._request('get', url, options)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 419, in _request
    response = self.make_request_aia_chasing(request_method, method, url, new_options, persist)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 432, in make_request_aia_chasing
    raise e
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 425, in make_request_aia_chasing
    response = request_method(url, **new_options)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py", line 563, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='localhost', port=8501): Max retries exceeded with url: /v1/agent/self (Caused by SSLError(SSLError(9, '[SSL] PEM lib (_ssl.c:4067)')))
```
as opposed to the much more transparent error message if `tls_cert` is not specified, `ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)`:
```text
2023-06-11 19:26:37 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:130 in LogMessage) | consul:c4992031b651f3c8 | (http.py:464) | Error occurred while connecting to socket to discover intermediate certificates: [Errno 111] Connection refused
2023-06-11 19:26:37 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:130 in LogMessage) | consul:c4992031b651f3c8 | (consul.py:154) | Consul request to https://127.0.0.1:8501/v1/agent/self failed
Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 714, in urlopen
    httplib_response = self._make_request(
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 403, in _make_request
    self._validate_conn(conn)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 1053, in _validate_conn
    conn.connect()
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connection.py", line 419, in connect
    self.sock = ssl_wrap_socket(
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 453, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(sock, context, tls_in_tls)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 495, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock)
  File "/opt/datadog-agent/embedded/lib/python3.8/ssl.py", line 500, in wrap_socket
    return self.sslsocket_class._create(
  File "/opt/datadog-agent/embedded/lib/python3.8/ssl.py", line 1040, in _create
    self.do_handshake()
  File "/opt/datadog-agent/embedded/lib/python3.8/ssl.py", line 1309, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py", line 489, in send
    resp = conn.urlopen(
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 798, in urlopen
    retries = retries.increment(
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='127.0.0.1', port=8501): Max retries exceeded with url: /v1/agent/self (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)')))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/consul/consul.py", line 141, in consul_request
    resp = self.http.get(url)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 355, in get
    return self._request('get', url, options)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 419, in _request
    response = self.make_request_aia_chasing(request_method, method, url, new_options, persist)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 432, in make_request_aia_chasing
    raise e
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py", line 425, in make_request_aia_chasing
    response = request_method(url, **new_options)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py", line 701, in send
    r = adapter.send(request, **kwargs)
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py", line 563, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='127.0.0.1', port=8501): Max retries exceeded with url: /v1/agent/self (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1131)')))
```