DataDog / helm-charts

Helm charts for Datadog products
Apache License 2.0

Cannot mount additional volumes to agent when deploying to GKE Autopilot #1286

Open alexstojda opened 6 months ago

alexstojda commented 6 months ago

Describe what happened:

When additional volumes are mounted to the agent container, the DaemonSet cannot be deployed to the cluster. The error is shown below.

We need these mounted files to open an mTLS connection to our CloudSQL instance and collect metrics.

Error:

Error: UPGRADE FAILED: cannot patch "datadog" with kind DaemonSet: admission webhook "warden-validating.common-webhooks.networking.gke.io" denied the request: GKE Warden rejected the request because it violates one or more constraints.
Violations details: {"[denied by autogke-no-host-port]":["container trace-agent specifies host ports [8126], which are disallowed in Autopilot."],"[denied by autogke-no-write-mode-hostpath]":["hostPath volume pointerdir in container agent is accessed in write mode; disallowed in Autopilot.","hostPath volume runtimesocketdir used in container agent uses path /var/run/containerd which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/].","hostPath volume procdir used in container agent uses path /proc which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/].","hostPath volume cgroups used in container agent uses path /sys/fs/cgroup which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/].","hostPath volume logdockercontainerpath used in container agent uses path /var/lib/docker/containers which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/].","hostPath volume runtimesocketdir used in container trace-agent uses path /var/run/containerd which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/].","hostPath volume runtimesocketdir used in container process-agent uses path /var/run/containerd which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/].","hostPath volume cgroups used in container process-agent uses path /sys/fs/cgroup which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/].","hostPath volume passwd used in container process-agent uses path /etc/passwd which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/].","hostPath volume procdir used in container process-agent uses path /proc which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/].","hostPath volume procdir used in container init-config uses path /proc which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/].","hostPath volume runtimesocketdir used in container init-config uses path /var/run/containerd which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/]."]}
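
The "Violations details" payload is a JSON object keyed by constraint name, so it can be summarized with a short script to see which containers and volumes Warden flagged. The following is a minimal sketch, not an official tool: the `summarize` helper and the regular expression are illustrative assumptions based only on the message wording shown in this error, and the sample string is an abbreviated excerpt of it.

```python
import json
import re
from collections import defaultdict

# Abbreviated excerpt of the "Violations details" JSON from the error above.
VIOLATIONS = '''{"[denied by autogke-no-host-port]":["container trace-agent specifies host ports [8126], which are disallowed in Autopilot."],"[denied by autogke-no-write-mode-hostpath]":["hostPath volume pointerdir in container agent is accessed in write mode; disallowed in Autopilot.","hostPath volume procdir used in container agent uses path /proc which is not allowed in Autopilot. Allowed path prefixes for hostPath volumes are: [/var/log/]."]}'''

# Assumed message shape: "hostPath volume <volume> [used ]in container <container> ...".
HOSTPATH_RE = re.compile(r"hostPath volume (\S+) (?:used )?in container (\S+)")


def summarize(violations_json: str) -> dict:
    """Group flagged hostPath volume names by container.

    Messages that do not match the hostPath pattern are kept verbatim
    under the hypothetical key "other".
    """
    by_container = defaultdict(set)
    for messages in json.loads(violations_json).values():
        for message in messages:
            match = HOSTPATH_RE.search(message)
            if match:
                volume, container = match.groups()
                by_container[container].add(volume)
            else:
                by_container["other"].add(message)
    return {name: sorted(items) for name, items in by_container.items()}


if __name__ == "__main__":
    for container, items in summarize(VIOLATIONS).items():
        print(container, items)
```

Pasting the full payload in place of the excerpt gives a per-container list of flagged volumes, which makes it easier to compare against the volumes actually added in the Helm values.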

Describe what you expected:

The DaemonSet deploys without issue, with the secret mounted to the agent container.

Steps to reproduce the issue:

Helm values extract with the volume mounts:

agent:
  volumeMounts:
  - mountPath: /var/cloudsql-ssl-certs
    name: cloudsql-ssl-certs
    readOnly: true
  volumes:
  - name: cloudsql-ssl-certs
    secret:
      secretName: cloudsql-ssl-certificates

Additional environment details (Operating System, Cloud provider, etc):

GKE Autopilot: 1.26.8-gke.200
Datadog Helm: 3.50.2

fanny-jiang commented 6 months ago

Hi @alexstojda, thanks for opening this issue. Unfortunately, mounting additional volumes to the agent daemonset is restricted when deploying on GKE Autopilot. The Datadog AllowlistedV2Workload does not permit additional volume mountPaths that are not listed in the allowlist.

In the meantime, I'll reach out to Google about allowlisting a volumeMount for ssl certificates.

bingli22 commented 2 months ago

Hi, we came across a similar issue. We are trying to mount prometheus.d/conf.yaml to the agent daemonset but got the following error:

Failed sync attempt to ff798a9f2c244143e71791c658bae1cba4cbcb99: one or more objects failed to apply, reason: error when patching "/dev/shm/2324329685": admission webhook "warden-validating.common-webhooks.networking.gke.io" denied the request: GKE Warden rejected the request because it violates one or more constraints. Violations details: {"[denied by autogke-no-host-port]":["container trace-agent specifies host ports [8126], which are disallowed in Autopilot."],"[denied by autogke-no-write-mode-hostpath]":["hostPath volume pointerdir in container agent is accessed in write mode; disallowed in Autopilot.","hostPath volume runtimesocketdir used in container agent uses path /var/run/containerd which is not allowed in Autopilot.

Helm values.yaml:

agents:
  revisionHistoryLimit: 0
  priorityClassName: gmp-critical
  volumes:

Basically, we want to increase max_returned_metrics from 2,000 to 10,000 because we are hitting the limit. Is there a workaround to achieve this? Thank you.