Closed LQss11 closed 1 month ago
Not only on k8s, also on a RHEL 9 system here....
Hi @LQss11 and @rodehoed,
We prepared a fix and we are going to release it very soon in 7.57.1.
To expand on the issue a bit: the crash seems to be related to a lack of permissions on the trace-agent process (likely due to the explicit securityContext restrictions from the configuration). This is combined with a new feature in 7.57 that defines a default UDS listener on /var/run/datadog/apm.socket: seeing that the directory exists, the trace-agent startup process attempts to create the listener, which fails. This failure is what led to the crash.
The fix we have put together in https://github.com/DataDog/datadog-agent/pull/29218 will make sure we don't crash in these circumstances and instead just log the error while continuing the agent startup.
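The "log the error and keep starting" behaviour described above can be illustrated with a small sketch. This is not the agent's code (the trace-agent is written in Go, and the actual change is in the linked pull request); the function name `try_listen_uds` and the logger name are hypothetical, chosen only for this example.

```python
import logging
import os
import socket
import tempfile
from typing import Optional

log = logging.getLogger("uds-sketch")  # hypothetical logger name


def try_listen_uds(path: str) -> Optional[socket.socket]:
    """Try to open a Unix domain socket listener at `path`.

    Returns the listening socket, or None if it could not be created
    (for example because of missing permissions on the directory).
    The error is logged instead of being raised, so the caller can
    continue its startup without the UDS listener.
    """
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        if os.path.exists(path):
            os.unlink(path)  # remove a stale socket file left by a previous run
        sock.bind(path)
        sock.listen(1)
        return sock
    except OSError as exc:  # PermissionError and FileNotFoundError are OSErrors
        log.error("could not create UDS listener on %s: %s", path, exc)
        sock.close()
        return None


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    listener = try_listen_uds(os.path.join(workdir, "apm.socket"))
    print("listener created:", listener is not None)
    if listener is not None:
        listener.close()
```

A caller would treat a `None` result as "UDS receiver disabled" and carry on with its other listeners rather than exiting.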
Thanks @ichinaski / @FlorentClarret for the responses! I resolved the issue by updating the Helm chart with:
agents:
  env:
    # This works
    - name: DD_APM_RECEIVER_SOCKET
      value: "unix:///var/run/datadog/apm.socket"
    # This does not work
    # - name: DD_APM_RECEIVER_SOCKET
    #   value: "/var/run/datadog/apm.socket"
Even though /var/run/datadog/apm.socket is the default, specifying it without the unix:// prefix caused issues.
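As a purely illustrative aid, a values file could be checked before deployment with a helper like the one below, which adds the unix:// prefix to a bare absolute path. The helper name `normalize_socket_value` is hypothetical, and this is an assumption about how one might validate the setting, not a description of how the agent parses DD_APM_RECEIVER_SOCKET.

```python
from urllib.parse import urlparse


def normalize_socket_value(value: str) -> str:
    """Return `value` in the unix:///absolute/path form.

    A bare absolute path gets the unix:// prefix added; a value that
    already uses the unix scheme is returned unchanged; anything else
    is rejected.
    """
    parsed = urlparse(value)
    if parsed.scheme == "unix" and parsed.path.startswith("/"):
        return value
    if parsed.scheme == "" and value.startswith("/"):
        return "unix://" + value
    raise ValueError(f"unsupported socket value: {value!r}")


if __name__ == "__main__":
    print(normalize_socket_value("/var/run/datadog/apm.socket"))
```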
I also faced issues related to the new log launcher feature, which uses a JSON file under /opt/... To fix it, I used different versions:
DD_ADMISSION_CONTROLLER_AUTO_INSTRUMENTATION_INIT_SECURITY_CONTEXT
For future CI stability, I recommend testing with unprivileged user IDs. I've raised issue #29286.
Hello @LQss11 and @rodehoed,
We just released Agent 7.57.1 with a fix for this issue.
Closing this issue given the fix is now released.
Description
After upgrading to Datadog Agent 7.57.0 from 7.56.2, the trace-agent fails to start due to a permission error with the UDS listener, despite having datadog.apm.socketEnabled set to false.
Configuration
Here is the relevant portion of my values.yaml configuration:
Error Message
The trace-agent logs show the following error:
Steps to Reproduce
trace-agent fails to start with a permission denied error.
Additional Information
DD_ADMISSION_CONTROLLER_AUTO_INSTRUMENTATION_INIT_SECURITY_CONTEXT
Request
Could you please help in diagnosing this issue or provide guidance on how to resolve the permission issue with the UDS listener for the trace-agent?