Hi, thanks for taking a look! I believe there's an RBAC issue here and would appreciate any advice.
Output of the info page (if this is a bug)
===============
Agent (v7.33.0)
===============
Status date: 2022-02-10 18:23:16.534 UTC (1644517396534)
Agent start: 2022-02-10 00:11:36.865 UTC (1644451896865)
Pid: 1
Go Version: go1.16.7
Python Version: 3.8.11
Build arch: amd64
Agent flavor: agent
Check Runners: 4
Log Level: INFO
Paths
=====
Config File: /etc/datadog-agent/datadog.yaml
conf.d: /etc/datadog-agent/conf.d
checks.d: /etc/datadog-agent/checks.d
Clocks
======
System time: 2022-02-10 18:23:16.534 UTC (1644517396534)
Host Info
=========
bootTime: 2022-02-09 19:00:12 UTC (1644433212000)
kernelArch: x86_64
kernelVersion: 3.10.0-1160.15.2.el7.mcp.x86_64
os: linux
platform: ubuntu
platformFamily: debian
platformVersion: 21.10
procs: 701
uptime: 5h11m38s
Hostnames
=========
host_aliases: REDACTED
hostname: REDACTED
socket-fqdn: REDACTED
socket-hostname: REDACTED
host tags:
kube_node_role:control-plane
kube_node_role:master
hostname provider: container
Metadata
========
agent_version: 7.33.0
config_apm_dd_url:
config_dd_url:
config_logs_dd_url:
config_logs_socks5_proxy_address:
config_no_proxy: []
config_process_dd_url:
config_proxy_http: http://10.109.142.145:9881
config_proxy_https: http://10.109.142.145:9881
config_site:
feature_apm_enabled: false
feature_cspm_enabled: false
feature_cws_enabled: false
feature_logs_enabled: true
feature_networks_enabled: false
feature_process_enabled: false
flavor: agent
hostname_source: container
install_method_installer_version: datadog-2.30.5
install_method_tool: helm
install_method_tool_version: Helm
logs_transport: HTTP
=========
Collector
=========
Running Checks
==============
kubelet (7.1.0)
---------------
Instance ID: kubelet:5bbc63f3938c02f4 [ERROR]
Configuration Source: file:/etc/datadog-agent/conf.d/kubelet.d/conf.yaml.default
Total Runs: 3,275
Metric Samples: Last Run: 0, Total: 0
Events: Last Run: 0, Total: 0
Service Checks: Last Run: 1, Total: 3,275
Average Execution Time : 743ms
Last Execution Date : 2022-02-10 18:23:09 UTC (1644517389000)
Last Successful Execution Date : Never
Error: HTTPSConnectionPool(host='172.16.128.1', port=10250): Max retries exceeded with url: /spec?verbose=True (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden')))
Traceback (most recent call last):
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 700, in urlopen
self._prepare_proxy(conn)
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py", line 994, in _prepare_proxy
conn.connect()
File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connection.py", line 371, in connect
self._tunnel()
File "/opt/datadog-agent/embedded/lib/python3.8/http/client.py", line 905, in _tunnel
raise OSError("Tunnel connection failed: %d %s" % (code,
OSError: Tunnel connection failed: 403 Forbidden
Describe what happened:
I've installed Datadog on a single-node k3s cluster via Helm, but the agent gets a 403 error whenever it tries to connect to the kubelet. The three warnings/errors below repeat in the agent container's logs. This looks similar to https://github.com/DataDog/datadog-agent/issues/6621.
2022-02-10 19:25:08 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:124 in LogMessage) | - | (kubelet.py:425) | kubelet check https://172.16.128.1:10250/healthz failed: HTTPSConnectionPool(host='172.16.128.1', port=10250): Max retries exceeded with url: /healthz?verbose=True (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden')))
2022-02-10 19:25:08 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:124 in LogMessage) | - | (base.py:59) | failed to retrieve pod list from the kubelet at https://172.16.128.1:10250/pods : HTTPSConnectionPool(host='172.16.128.1', port=10250): Max retries exceeded with url: /pods?verbose=True (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden')))
2022-02-10 19:25:08 UTC | CORE | ERROR | (pkg/collector/worker/check_logger.go:68 in Error) | check:kubelet | Error running check: [{"message": "HTTPSConnectionPool(host='172.16.128.1', port=10250): Max retries exceeded with url: /spec?verbose=True (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden')))", "traceback": "Traceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py\", line 700, in urlopen\n self._prepare_proxy(conn)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py\", line 994, in _prepare_proxy\n conn.connect()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connection.py\", line 371, in connect\n self._tunnel()\n File \"/opt/datadog-agent/embedded/lib/python3.8/http/client.py\", line 905, in _tunnel\n raise OSError(\"Tunnel connection failed: %d %s\" % (code,\nOSError: Tunnel connection failed: 403 Forbidden\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py\", line 439, in send\n resp = conn.urlopen(\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/connectionpool.py\", line 785, in urlopen\n retries = retries.increment(\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/urllib3/util/retry.py\", line 592, in increment\n raise MaxRetryError(_pool, url, error or ResponseError(cause))\nurllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='172.16.128.1', port=10250): Max retries exceeded with url: /spec?verbose=True (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden')))\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File 
\"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 1017, in run\n self.check(instance)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/kubelet/kubelet.py\", line 336, in check\n self._report_node_metrics(self.instance_tags)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/kubelet/kubelet.py\", line 382, in _report_node_metrics\n node_resp = self._retrieve_node_spec()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/kubelet/kubelet.py\", line 365, in _retrieve_node_spec\n node_resp = self.perform_kubelet_query(self.node_spec_url)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/kubelet_base/base.py\", line 31, in perform_kubelet_query\n return self.http.get(\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py\", line 341, in get\n return self._request('get', url, options)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py\", line 405, in _request\n response = self.make_request_aia_chasing(request_method, method, url, new_options, persist)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/http.py\", line 411, in make_request_aia_chasing\n response = request_method(url, **new_options)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py\", line 76, in get\n return request('get', url, params=params, **kwargs)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/api.py\", line 61, in request\n return session.request(method=method, url=url, **kwargs)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py\", line 542, in request\n resp = self.send(prep, **send_kwargs)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/sessions.py\", line 655, in send\n r 
= adapter.send(request, **kwargs)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/requests/adapters.py\", line 510, in send\n raise ProxyError(e, request=request)\nrequests.exceptions.ProxyError: HTTPSConnectionPool(host='172.16.128.1', port=10250): Max retries exceeded with url: /spec?verbose=True (Caused by ProxyError('Cannot connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden')))\n"}]
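One detail in the messages above may be worth separating out: the 403 is wrapped in a ProxyError with "Tunnel connection failed", which is the text Python's http.client raises when a proxy rejects the CONNECT request, before anything reaches the target host. The snippet below is only a minimal sketch for reading such messages; the helper name classify_error_source is hypothetical and not part of the agent.

```python
import re

# "Tunnel connection failed: <code> <reason>" is what http.client raises when
# a proxy answers the CONNECT request with a non-200 status.
TUNNEL_FAILURE = re.compile(r"Tunnel connection failed: (\d{3}) ([^'\")]+)")


def classify_error_source(error_text):
    """Return ("proxy", status) if the status came from a proxy tunnel failure.

    Hypothetical helper: it only inspects the message text and returns
    ("unknown", None) for anything it does not recognise.
    """
    match = TUNNEL_FAILURE.search(error_text)
    if "ProxyError" in error_text and match:
        return ("proxy", int(match.group(1)))
    return ("unknown", None)


# Error text copied from the kubelet check output in this report.
error = (
    "HTTPSConnectionPool(host='172.16.128.1', port=10250): Max retries "
    "exceeded with url: /spec?verbose=True (Caused by ProxyError('Cannot "
    "connect to proxy.', OSError('Tunnel connection failed: 403 Forbidden')))"
)

print(classify_error_source(error))
```

If that reading is right, the 403 in these logs would be the proxy at config_proxy_https answering, not the kubelet's own authorization layer, though I have not confirmed this on the cluster.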
Exec'ing into the agent container and running
TOKEN=$(</var/run/secrets/kubernetes.io/serviceaccount/token) && curl https://$DD_KUBERNETES_KUBELET_HOST:10250/pods -v -k -H "Authorization: Bearer $TOKEN"
and
TOKEN=$(</var/run/secrets/kubernetes.io/serviceaccount/token) && curl https://$DD_KUBERNETES_KUBELET_HOST:10250/pods -v --cacert /var/run/secrets/kubernetes.io/serviceaccount/ca.crt -H "Authorization: Bearer $TOKEN"
return no errors, so I think this is an RBAC issue.
Describe what you expected:
I expected the pod to run without errors and be able to reach the kubelet.
Steps to reproduce the issue:
Attached are the ClusterRole and ClusterRoleBinding.
Additional environment details (Operating System, Cloud provider, etc): k3s version v1.21.6+k3s1 (df033fa2)