DataDog / datadog-agent

Main repository for Datadog Agent
https://docs.datadoghq.com/
Apache License 2.0

How to run DataDog as non-root in plain Kubernetes #6442

Open yashbhutwala opened 4 years ago

yashbhutwala commented 4 years ago

What should the securityContext be set to in order to run the DataDog agent as non-root in plain Kubernetes?

I tried the following variations:

securityContext:
    runAsNonRoot: true
    runAsUser: 101
    runAsGroup: 0

securityContext:
    runAsNonRoot: true
    runAsUser: 101
    fsGroup: 0

securityContext:
    runAsNonRoot: true

All of them resulted in the following errors:

2020-09-23 20:54:44 UTC | CORE | WARN | (pkg/collector/corechecks/checkbase.go:165 in Warnf) | Error initialising check: temporary failure in dockerutil, will retry later: try delay not elapsed yet
2020-09-23 20:54:44 UTC | CORE | ERROR | (pkg/collector/runner/runner.go:292 in work) | Error running check docker: temporary failure in dockerutil, will retry later: try delay not elapsed yet
2020-09-23 20:54:45 UTC | CORE | ERROR | (pkg/autodiscovery/config_poller.go:123 in collect) | Unable to collect configurations from provider docker: temporary failure in dockerutil, will retry later: try delay not elapsed yet
2020-09-23 18:05:29 UTC | CORE | WARN | (pkg/logs/input/file/scanner.go:217 in startNewTailer) | open /var/log/pods/tdsscaoff4_vault-68954dbcc9-7mzzl_fff8cd7a-b904-46e4-b332-f05d9f745db7/vault/0.log: permission denied
2020-09-23 18:05:29 UTC | CORE | WARN | (pkg/logs/input/file/scanner.go:217 in startNewTailer) | open /var/log/pods/autocc3452_eventstore-0_1e5d20a5-0b13-49fb-8855-ee7bd68a1250/eventstore/0.log: permission denied
2020-09-23 18:05:29 UTC | CORE | WARN | (pkg/logs/input/file/scanner.go:217 in startNewTailer) | open /var/log/pods/cisca_triage-query-75464d9bc6-ncg9t_16957d80-e287-4d78-a2a5-95a8d9378e7a/triage-query/437.log: permission denied

I do see the dd-agent user configured in the Dockerfile, which is why I chose UID 101, but I do not see a USER directive in that Dockerfile.

khewonc commented 4 years ago

Hi @yashbhutwala, I see that you've opened a ticket with our support team. Our team is investigating and will follow up on the ticket as soon as we have more information. We'll also update this GitHub issue with the findings from support's investigation for other users who run into the same issue.

yashbhutwala commented 4 years ago

Awesome, thanks @khewonc in advance!

yashbhutwala commented 4 years ago

Here is the latest info from DataDog's containers team for others finding their way here.

"We have confirmed that it is not possible at the moment to use Kubernetes log collection when using a non-root user. Mainly, this is due to the strict permissions on the docker folders. Moving forward, we have two options to fix this issue:

1) Change to a root user.
2) Switch the log collection method if you wish to run the agent as non-root. You can either use Docker log collection or get the logs through journald."
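
For readers on plain Kubernetes, the second option might look roughly like the sketch below. It is not taken from the support response. It assumes the nodes use the Docker runtime and that the non-root agent user has been given read access to the Docker socket, and neither assumption is confirmed in this thread. The three variables are the Agent's log-collection settings; setting `DD_LOGS_CONFIG_K8S_CONTAINER_USE_FILE` to `"false"` asks the Agent to read container logs through the runtime socket instead of tailing files under /var/log/pods.

```yaml
# Fragment of the agent container spec in a DaemonSet (a sketch, not a full manifest).
# Assumption: the node runs Docker and UID 101 can read /var/run/docker.sock.
securityContext:
  runAsNonRoot: true
  runAsUser: 101   # dd-agent UID, as chosen earlier in this thread
env:
  - name: DD_LOGS_ENABLED
    value: "true"
  - name: DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL
    value: "true"
  # Read logs through the Docker socket rather than files under /var/log/pods.
  - name: DD_LOGS_CONFIG_K8S_CONTAINER_USE_FILE
    value: "false"
```

Whether this removes the "permission denied" warnings depends on the socket permissions on the node, so treat it as a starting point to test rather than a confirmed fix.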

SIVA451 commented 3 years ago

2021-01-19 09:27:20 UTC | CORE | ERROR | (pkg/autodiscovery/config_poller.go:123 in collect) | Unable to collect configurations from provider docker: temporary failure in dockerutil, will retry later: try delay not elapsed yet
2021-01-19 09:27:21 UTC | CORE | ERROR | (pkg/autodiscovery/config_poller.go:123 in collect) | Unable to collect configurations from provider docker: temporary failure in dockerutil, will retry later: try delay not elapsed yet
2021-01-19 09:27:22 UTC | CORE | ERROR | (pkg/autodiscovery/config_poller.go:123 in collect) | Unable to collect configurations from provider docker: temporary failure in dockerutil, will retry later: try delay not elapsed yet

I have the above logs from my container, and the integration is failing. I am using AWS PrivateLink for traffic routing. Looking for help from @yashbhutwala and the DD team.

joshk132 commented 3 years ago

I am getting this error message as well, even when using the settings below to force the agent to run as root. Any ideas why the error message would still appear when using:

securityContext:
    runAsUser: 0

kevin-lindsay-1 commented 3 years ago

@yashbhutwala I would like documentation for your second option, because running as root is something I would like to avoid unless it's 100% necessary.

Additionally, the Helm chart deploys a number of containers. Are all containers in the datadog/datadog chart required to run as root?

fngyjx commented 1 year ago

Any update on this?

namevic commented 1 year ago

I found a way that worked for me:

datadog:
  securityContext:
    runAsUser: 101

If you use APM, add:

  apm:
    enabled: true
    socketEnabled: false
    portEnabled: true

mohamedoabbiit commented 9 months ago

Is there any workaround for this when the Datadog agent is running as a secondary container on AWS ECS?

ipleten commented 7 months ago

@namevic, regarding the configuration you posted above (runAsUser: 101 plus the apm settings): Hello, does logging work as well?