DataDog / helm-charts

Helm charts for Datadog products
Apache License 2.0

kubeStateMetricsCore does not report anything #973

Open PSanetra opened 1 year ago

PSanetra commented 1 year ago

Describe what happened: We enabled datadog.kubeStateMetricsCore.enabled and disabled datadog.kubeStateMetricsEnabled. After this change, no kubernetes_state.* metrics were reported anymore, even though kubernetes_state_core.yaml.default was indeed mounted in /conf.d.

Describe what you expected: I expected that enabling datadog.kubeStateMetricsCore.enabled would report kubernetes_state.* metrics from Kubernetes to Datadog.

Steps to reproduce the issue: See above

Additional environment details (Operating System, Cloud provider, etc):
OS: Linux
Cloud provider: AKS
Helm Chart Version: 3.23.0
Agent Version: 7.43.1
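For reference, the values change described above would look roughly like this (a sketch of the relevant settings only; the actual values file contains many more options):

```yaml
# Sketch of the values.yaml change described in this report (other settings omitted).
datadog:
  # Legacy kube-state-metrics deployment, disabled here:
  kubeStateMetricsEnabled: false
  # KSM Core check, run by the Cluster Agent instead:
  kubeStateMetricsCore:
    enabled: true
```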

CharlyF commented 1 year ago

Hi @PSanetra - Thanks for opening this issue. Could you share the render of the chart? One thing I am interested in is confirming that KSM Core is running in the Cluster Agent (and possibly as a cluster-level check in the Cluster Check Workers).
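One way to check this is to inspect the agent status inside the relevant pods. A sketch, assuming the default deployment names from this release (datadog-cluster-agent and datadog-clusterchecks in the datadog namespace; adjust to your release):

```shell
# Confirm the kubernetes_state_core check appears in the Cluster Agent's status output:
kubectl -n datadog exec -it deploy/datadog-cluster-agent -- datadog-cluster-agent status

# If cluster checks are dispatched to Cluster Check Workers, inspect those instead
# and look for a kubernetes_state_core section in the running checks:
kubectl -n datadog exec -it deploy/datadog-clusterchecks -- agent status
```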

PSanetra commented 1 year ago

Hi @CharlyF, sure, here is the rendered chart with secrets redacted:

---
# Source: datadog/templates/agent-clusterchecks-rbac.yaml
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: true
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
    app: "datadog"
    chart: "datadog-3.23.0"
    heritage: "Helm"
    release: "datadog"
  name: datadog-cluster-checks
  namespace: datadog
---
# Source: datadog/templates/cluster-agent-rbac.yaml
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: true
metadata:
  labels:
    app: "datadog"
    chart: "datadog-3.23.0"
    heritage: "Helm"
    release: "datadog"
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: datadog-cluster-agent
  namespace: datadog
---
# Source: datadog/templates/rbac.yaml
apiVersion: v1
kind: ServiceAccount
automountServiceAccountToken: true
metadata:
  name: datadog
  namespace: datadog
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7" # end range $role := .Values.datadog.secretBackend.roles
---
# Source: datadog/templates/secret-api-key.yaml
apiVersion: v1
kind: Secret
metadata:
  name: datadog
  namespace: datadog
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
type: Opaque
data:
  api-key: "__REDACTED__"
---
# Source: datadog/templates/secret-application-key.yaml
apiVersion: v1
kind: Secret
metadata:
  name: "datadog-appkey"
  namespace: datadog
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
type: Opaque
data:
  app-key: "__REDACTED__"
---
# Source: datadog/templates/secret-cluster-agent-token.yaml
apiVersion: v1
kind: Secret
metadata:
  name: datadog-cluster-agent
  namespace: datadog
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
type: Opaque
data:
  token: "__REDACTED__"
---
# Source: datadog/templates/cluster-agent-confd-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: datadog-cluster-agent-confd
  namespace: datadog
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  annotations:
    checksum/confd-config: __REDACTED__
data:
  kubernetes_state_core.yaml.default: |-
    init_config:
    instances:
      - collectors:
        - secrets
        - nodes
        - pods
        - services
        - resourcequotas
        - replicationcontrollers
        - limitranges
        - persistentvolumeclaims
        - persistentvolumes
        - namespaces
        - endpoints
        - daemonsets
        - deployments
        - replicasets
        - statefulsets
        - cronjobs
        - jobs
        - horizontalpodautoscalers
        - poddisruptionbudgets
        - storageclasses
        - volumeattachments
        - ingresses
        labels_as_tags:
          {}
        annotations_as_tags:
          {}
---
# Source: datadog/templates/install_info-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: datadog-installinfo
  namespace: datadog
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  annotations:
    checksum/install_info: __REDACTED__
data:
  install_info: |
    ---
    install_method:
      tool: helm
      tool_version: Helm
      installer_version: datadog-3.23.0
---
# Source: datadog/charts/datadog-crds/templates/datadoghq.com_datadogmetrics_v1.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    controller-gen.kubebuilder.io/version: v0.4.1
  creationTimestamp: null
  name: datadogmetrics.datadoghq.com
  labels:
    helm.sh/chart: 'datadog-crds-0.4.7'
    app.kubernetes.io/managed-by: 'Helm'
    app.kubernetes.io/name: 'datadog-crds'
    app.kubernetes.io/instance: 'datadog'
spec:
  group: datadoghq.com
  names:
    kind: DatadogMetric
    listKind: DatadogMetricList
    plural: datadogmetrics
    singular: datadogmetric
  scope: Namespaced
  versions:
    - additionalPrinterColumns:
        - jsonPath: .status.conditions[?(@.type=='Active')].status
          name: active
          type: string
        - jsonPath: .status.conditions[?(@.type=='Valid')].status
          name: valid
          type: string
        - jsonPath: .status.currentValue
          name: value
          type: string
        - jsonPath: .status.autoscalerReferences
          name: references
          type: string
        - jsonPath: .status.conditions[?(@.type=='Updated')].lastUpdateTime
          name: update time
          type: date
      name: v1alpha1
      schema:
        openAPIV3Schema:
          description: DatadogMetric allows autoscaling on arbitrary Datadog query
          properties:
            apiVersion:
              description: 'APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
              type: string
            kind:
              description: 'Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
              type: string
            metadata:
              type: object
            spec:
              description: DatadogMetricSpec defines the desired state of DatadogMetric
              properties:
                externalMetricName:
                  description: ExternalMetricName is reserved for internal use
                  type: string
                maxAge:
                  description: MaxAge provides the max age for the metric query (overrides the default setting `external_metrics_provider.max_age`)
                  type: string
                query:
                  description: Query is the raw datadog query
                  type: string
              type: object
            status:
              description: DatadogMetricStatus defines the observed state of DatadogMetric
              properties:
                autoscalerReferences:
                  description: List of autoscalers currently using this DatadogMetric
                  type: string
                conditions:
                  description: Conditions Represents the latest available observations of a DatadogMetric's current state.
                  items:
                    description: DatadogMetricCondition describes the state of a DatadogMetric at a certain point.
                    properties:
                      lastTransitionTime:
                        description: Last time the condition transitioned from one status to another.
                        format: date-time
                        type: string
                      lastUpdateTime:
                        description: Last time the condition was updated.
                        format: date-time
                        type: string
                      message:
                        description: A human readable message indicating details about the transition.
                        type: string
                      reason:
                        description: The reason for the condition's last transition.
                        type: string
                      status:
                        description: Status of the condition, one of True, False, Unknown.
                        type: string
                      type:
                        description: Type of DatadogMetric condition.
                        type: string
                    required:
                      - status
                      - type
                    type: object
                  type: array
                  x-kubernetes-list-map-keys:
                    - type
                  x-kubernetes-list-type: map
                currentValue:
                  description: Value is the latest value of the metric
                  type: string
              required:
                - currentValue
              type: object
          type: object
      served: true
      storage: true
      subresources:
        status: {}
status:
  acceptedNames:
    kind: ""
    plural: ""
  conditions: []
  storedVersions: []
---
# Source: datadog/templates/cluster-agent-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: ClusterRole
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: datadog-cluster-agent
rules:
- apiGroups:
  - ""
  resources:
  - services
  - endpoints
  - pods
  - nodes
  - namespaces
  - componentstatuses
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - events
  verbs:
  - get
  - list
  - watch
  - create
- apiGroups: ["quota.openshift.io"]
  resources:
  - clusterresourcequotas
  verbs:
  - get
  - list
- apiGroups:
  - "autoscaling"
  resources:
  - horizontalpodautoscalers
  verbs:
  - list
  - watch
- apiGroups:
  - ""
  resources:
  - configmaps
  resourceNames:
  - datadogtoken  # Kubernetes event collection state
  - datadogtoken  # Kept for backward compatibility with agent <7.37.0
  verbs:
  - get
  - update
- apiGroups:
  - ""
  resources:
  - configmaps
  resourceNames:
  - datadog-leader-election  # Leader election token
  - datadog-leader-election  # Kept for backward compatibility with agent <7.37.0
  - datadog-custom-metrics
  verbs:
  - get
  - update
- apiGroups:
  - ""
  resources:
  - configmaps
  resourceNames:
  - extension-apiserver-authentication
  verbs:
  - get
  - list
  - watch
- apiGroups:  # To create the leader election token and hpa events
  - ""
  resources:
  - configmaps
  - events
  verbs:
  - create
- nonResourceURLs:
  - "/version"
  - "/healthz"
  verbs:
  - get
- apiGroups:  # to get the kube-system namespace UID and generate a cluster ID
  - ""
  resources:
  - namespaces
  resourceNames:
  - "kube-system"
  verbs:
  - get
- apiGroups:  # To create the cluster-id configmap
  - ""
  resources:
  - configmaps
  resourceNames:
  - "datadog-cluster-id"
  verbs:
  - create
  - get
  - update
- apiGroups:
  - ""
  resources:
  - persistentvolumes
  - persistentvolumeclaims
  - serviceaccounts
  verbs:
  - list
  - get
  - watch
- apiGroups:
  - "apps"
  resources:
  - deployments
  - replicasets
  - daemonsets
  - statefulsets
  verbs:
  - list
  - get
  - watch
- apiGroups:
  - "batch"
  resources:
  - cronjobs
  - jobs
  verbs:
  - list
  - get
  - watch
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs:
  - list
  - get
  - watch
- apiGroups:
  - "rbac.authorization.k8s.io"
  resources:
  - roles
  - rolebindings
  - clusterroles
  - clusterrolebindings
  verbs:
  - list
  - get
  - watch
- apiGroups:
  - autoscaling.k8s.io
  resources:
  - verticalpodautoscalers
  verbs:
  - list
  - get
  - watch
- apiGroups:
    - "apiextensions.k8s.io"
  resources:
    - customresourcedefinitions
  verbs:
    - list
    - get
    - watch
- apiGroups:
  - "datadoghq.com"
  resources:
  - "datadogmetrics"
  verbs:
  - "list"
  - "create"
  - "delete"
  - "watch"
- apiGroups:
  - "datadoghq.com"
  resources:
  - "datadogmetrics/status"
  verbs:
  - "update"
- apiGroups:
  - admissionregistration.k8s.io
  resources:
  - mutatingwebhookconfigurations
  verbs: ["get", "list", "watch", "update", "create"]
- apiGroups: ["batch"]
  resources: ["jobs", "cronjobs"]
  verbs: ["get"]
- apiGroups: ["apps"]
  resources: ["statefulsets", "replicasets", "deployments", "daemonsets"]
  verbs: ["get"]
- apiGroups:
  - policy
  resources:
  - podsecuritypolicies
  verbs:
  - use
  resourceNames:
  - datadog-cluster-agent
- apiGroups:
  - "security.openshift.io"
  resources:
  - securitycontextconstraints
  verbs:
  - use
  resourceNames:
  - datadog-cluster-agent
  - hostnetwork
---
# Source: datadog/templates/hpa-external-metrics-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: ClusterRole
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: datadog-cluster-agent-external-metrics-reader
rules:
- apiGroups:
  - "external.metrics.k8s.io"
  resources:
  - "*"
  verbs:
  - list
  - get
  - watch
---
# Source: datadog/templates/kube-state-metrics-core-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: ClusterRole
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: datadog-ksm-core
rules:
- apiGroups:
  - ""
  resources:
  - secrets
  - nodes
  - pods
  - services
  - resourcequotas
  - replicationcontrollers
  - limitranges
  - persistentvolumeclaims
  - persistentvolumes
  - namespaces
  - endpoints
  - events
  verbs:
  - list
  - watch
- apiGroups:
  - extensions
  resources:
  - daemonsets
  - deployments
  - replicasets
  verbs:
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - statefulsets
  - daemonsets
  - deployments
  - replicasets
  verbs:
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - cronjobs
  - jobs
  verbs:
  - list
  - watch
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - list
  - watch
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - list
  - watch
- apiGroups:
  - storage.k8s.io
  resources:
  - storageclasses
  - volumeattachments
  verbs:
  - list
  - watch    
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs:
  - list
  - watch
---
# Source: datadog/templates/rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: ClusterRole
metadata:
  name: datadog
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
rules:
- nonResourceURLs:
  - "/metrics"
  verbs:
  - get
- apiGroups:  # Kubelet connectivity
  - ""
  resources:
  - nodes/metrics
  - nodes/spec
  - nodes/proxy
  - nodes/stats
  verbs:
  - get
- apiGroups:  # leader election check
  - ""
  resources:
  - endpoints
  verbs:
  - get
- apiGroups:
  - policy
  resources:
  - podsecuritypolicies
  verbs:
  - use
  resourceNames:
  - datadog
- apiGroups:
  - "security.openshift.io"
  resources:
  - securitycontextconstraints
  verbs:
  - use
  resourceNames:
  - datadog
  - hostaccess
  - privileged
- apiGroups:  # leader election check
  - "coordination.k8s.io"
  resources:
  - leases
  verbs:
  - get
---
# Source: datadog/templates/agent-clusterchecks-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: ClusterRoleBinding
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: datadog-cluster-checks
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: datadog
subjects:
  - kind: ServiceAccount
    name: datadog-cluster-checks
    namespace: datadog
---
# Source: datadog/templates/cluster-agent-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: ClusterRoleBinding
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: datadog-cluster-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: datadog-cluster-agent
subjects:
  - kind: ServiceAccount
    name: datadog-cluster-agent
    namespace: datadog
---
# Source: datadog/templates/cluster-agent-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: ClusterRoleBinding
metadata:
  labels:
    app: "datadog"
    chart: "datadog-3.23.0"
    release: "datadog"
    heritage: "Helm"
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: datadog-cluster-agent-system-auth-delegator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
  - kind: ServiceAccount
    name: datadog-cluster-agent
    namespace: datadog
---
# Source: datadog/templates/hpa-external-metrics-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: ClusterRoleBinding
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: datadog-cluster-agent-external-metrics-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: datadog-cluster-agent-external-metrics-reader
subjects:
- kind: ServiceAccount
  name: horizontal-pod-autoscaler
  namespace: kube-system
---
# Source: datadog/templates/kube-state-metrics-core-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: ClusterRoleBinding
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: datadog-ksm-core
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: datadog-ksm-core
subjects:
  - kind: ServiceAccount
    name: datadog-cluster-agent
    namespace: datadog
---
# Source: datadog/templates/rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: ClusterRoleBinding
metadata:
  name: datadog
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: datadog
subjects:
  - kind: ServiceAccount
    name: datadog
    namespace: datadog
---
# Source: datadog/templates/cluster-agent-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: Role
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: datadog-cluster-agent-main
  namespace: datadog
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list", "watch", "update", "create"]
---
# Source: datadog/templates/dca-helm-values-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: Role
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: datadog-dca-flare
  namespace: datadog
rules:
- apiGroups:
  - ""
  resources:
  - secrets
  - configmaps
  verbs:
  - get
  - list
---
# Source: datadog/templates/cluster-agent-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: RoleBinding
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: "datadog-cluster-agent-main"
  namespace: datadog
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: datadog-cluster-agent-main
subjects:
  - kind: ServiceAccount
    name: datadog-cluster-agent
    namespace: datadog
---
# Source: datadog/templates/cluster-agent-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: RoleBinding
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: "datadog-cluster-agent-apiserver"
  namespace: datadog
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
  - kind: ServiceAccount
    name: datadog-cluster-agent
    namespace: datadog
---
# Source: datadog/templates/dca-helm-values-rbac.yaml
apiVersion: "rbac.authorization.k8s.io/v1"
kind: RoleBinding
metadata:
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
  name: datadog-dca-flare
  namespace: datadog
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: datadog-dca-flare
subjects:
  - kind: ServiceAccount
    name: datadog-cluster-agent
    namespace: datadog
---
# Source: datadog/templates/agent-services.yaml
apiVersion: v1
kind: Service
metadata:
  name: datadog-cluster-agent
  namespace: datadog
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
spec:
  type: ClusterIP
  selector:
    app: datadog-cluster-agent
  ports:
  - port: 5005
    name: agentport
    protocol: TCP
---
# Source: datadog/templates/agent-services.yaml
apiVersion: v1
kind: Service
metadata:
  name: datadog-cluster-agent-metrics-api
  namespace: datadog
  labels:
    app: "datadog"
    chart: "datadog-3.23.0"
    release: "datadog"
    heritage: "Helm"
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
spec:
  type: ClusterIP
  selector:
    app: datadog-cluster-agent
  ports:
  - port: 8443
    name: metricsapi
    protocol: TCP
---
# Source: datadog/templates/agent-services.yaml
apiVersion: v1
kind: Service
metadata:
  name: datadog-cluster-agent-admission-controller
  namespace: datadog
  labels:
    app: "datadog"
    chart: "datadog-3.23.0"
    release: "datadog"
    heritage: "Helm"
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
spec:
  selector:
    app: datadog-cluster-agent
  ports:
  - port: 443
    targetPort: 8000
---
# Source: datadog/templates/agent-services.yaml
apiVersion: v1
kind: Service

metadata:
  name: datadog
  namespace: datadog
  labels:
    app: "datadog"
    chart: "datadog-3.23.0"
    release: "datadog"
    heritage: "Helm"
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
spec:
  selector:
    app: datadog
  ports:
    - protocol: UDP
      port: 8125
      targetPort: 8125
      name: dogstatsdport
  internalTrafficPolicy: Local
---
# Source: datadog/templates/daemonset.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: datadog
  namespace: datadog
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
    app.kubernetes.io/component: agent

spec:
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: datadog
  template:
    metadata:
      labels:
        app.kubernetes.io/name: "datadog"
        app.kubernetes.io/instance: "datadog"
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/component: agent
        app: datadog

      name: datadog
      annotations:
        checksum/clusteragent_token: __REDACTED__
        checksum/api_key: __REDACTED__
        checksum/install_info: __REDACTED__
        checksum/autoconf-config: __REDACTED__
        checksum/confd-config: __REDACTED__
        checksum/checksd-config: __REDACTED__
        ad.datadoghq.com/agent.logs: |
          [{"source": "agent","service": "agent","log_processing_rules": [{"type": "multi_line", "name": "log_start_with_special_character","pattern": "^[0-9\\-\\s:]+ [A-Z]{3} \\|"}]}]
    spec:

      securityContext:
        runAsUser: 0
      hostPID: true
      containers:
      - name: agent
        image: "gcr.io/datadoghq/agent:7.43.1-jmx"
        imagePullPolicy: IfNotPresent
        command: ["agent", "run"]

        resources:
          {}
        ports:
        - containerPort: 8125
          hostPort: 8125
          name: dogstatsdport
          protocol: UDP
        env:
          # Needs to be removed when Agent N-2 is built with Golang 1.17
          - name: GODEBUG
            value: x509ignoreCN=0
          - name: DD_API_KEY
            valueFrom:
              secretKeyRef:
                name: "datadog"
                key: api-key
          - name: DD_AUTH_TOKEN_FILE_PATH
            value: /etc/datadog-agent/auth/token

          - name: DD_CLUSTER_NAME
            value: "dev"
          - name: KUBERNETES
            value: "yes"
          - name: DD_SITE
            value: "datadoghq.eu"
          - name: DD_DD_URL
            value: "https://app.datadoghq.eu"
          - name: DD_KUBERNETES_KUBELET_HOST
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: DD_KUBELET_CLIENT_CA
            value: /var/run/kubelet-ca/kubeletserver.crt
          - name: DD_ENV
            value: "dev"
          - name: DD_CLOUD_PROVIDER_METADATA
            value: "azure"
          - name: DD_CRI_SOCKET_PATH
            value: /host/var/run/containerd/containerd.sock

          - name: DD_LOG_LEVEL
            value: "INFO"
          - name: DD_DOGSTATSD_PORT
            value: "8125"
          - name: DD_DOGSTATSD_NON_LOCAL_TRAFFIC
            value: "true"
          - name: DD_CLUSTER_AGENT_ENABLED
            value: "true"
          - name: DD_CLUSTER_AGENT_KUBERNETES_SERVICE_NAME
            value: datadog-cluster-agent
          - name: DD_CLUSTER_AGENT_AUTH_TOKEN
            valueFrom:
              secretKeyRef:
                  name: datadog-cluster-agent
                  key: token

          - name: DD_APM_ENABLED
            value: "false"
          - name: DD_LOGS_ENABLED
            value: "true"
          - name: DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL
            value: "true"
          - name: DD_LOGS_CONFIG_K8S_CONTAINER_USE_FILE
            value: "true"
          - name: DD_LOGS_CONFIG_AUTO_MULTI_LINE_DETECTION
            value: "false"
          - name: DD_HEALTH_PORT
            value: "5555"
          - name: DD_DOGSTATSD_SOCKET
            value: "/var/run/datadog/dsd.socket"
          - name: DD_EXTRA_CONFIG_PROVIDERS
            value: "endpointschecks"

          - name: DD_IGNORE_AUTOCONF
            value: "kubernetes_state"
          - name: DD_CHECKS_TAG_CARDINALITY
            value: "orchestrator"
          - name: DD_EXPVAR_PORT
            value: "6000"        
        volumeMounts:
          - name: installinfo
            subPath: install_info
            mountPath: /etc/datadog-agent/install_info
            readOnly: true
          - name: logdatadog
            mountPath: /var/log/datadog
            readOnly: false # Need RW to write logs
          - name: tmpdir
            mountPath: /tmp
            readOnly: false # Need RW to write to /tmp directory

          - name: os-release-file
            mountPath: /host/etc/os-release
            mountPropagation: None
            readOnly: true
          - name: config
            mountPath: /etc/datadog-agent
            readOnly: false # Need RW to mount to config path
          - name: auth-token
            mountPath: /etc/datadog-agent/auth
            readOnly: false # Need RW to write auth token

          - name: runtimesocketdir
            mountPath: /host/var/run/containerd
            mountPropagation: None
            readOnly: true

          - name: dsdsocket
            mountPath: /var/run/datadog
            readOnly: false
          - name: procdir
            mountPath: /host/proc
            mountPropagation: None
            readOnly: true
          - name: cgroups
            mountPath: /host/sys/fs/cgroup
            mountPropagation: None
            readOnly: true
          - name: pointerdir
            mountPath: /opt/datadog-agent/run
            mountPropagation: None
            readOnly: false # Need RW for logs pointer
          - name: logpodpath
            mountPath: /var/log/pods
            mountPropagation: None
            readOnly: true
          - name: logscontainerspath
            mountPath: /var/log/containers
            mountPropagation: None
            readOnly: true
          - name: kubelet-ca
            mountPath: /var/run/kubelet-ca/kubeletserver.crt
            readOnly: true
          - mountPath: /etc/jboss
            name: jboss-client
            readOnly: false
        livenessProbe:
          failureThreshold: 6
          httpGet:
            path: /live
            port: 5555
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 15
          successThreshold: 1
          timeoutSeconds: 5
        readinessProbe:
          failureThreshold: 6
          httpGet:
            path: /ready
            port: 5555
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 15
          successThreshold: 1
          timeoutSeconds: 5
      - name: trace-agent
        image: "gcr.io/datadoghq/agent:7.43.1-jmx"
        imagePullPolicy: IfNotPresent
        command: ["trace-agent", "-config=/etc/datadog-agent/datadog.yaml"]  
        resources:
          {}
        ports:
        - containerPort: 8126
          name: traceport
          protocol: TCP
        env:
          # Needs to be removed when Agent N-2 is built with Golang 1.17
          - name: GODEBUG
            value: x509ignoreCN=0
          - name: DD_API_KEY
            valueFrom:
              secretKeyRef:
                name: "datadog"
                key: api-key
          - name: DD_AUTH_TOKEN_FILE_PATH
            value: /etc/datadog-agent/auth/token

          - name: DD_CLUSTER_NAME
            value: "dev"
          - name: KUBERNETES
            value: "yes"
          - name: DD_SITE
            value: "datadoghq.eu"
          - name: DD_DD_URL
            value: "https://app.datadoghq.eu"
          - name: DD_KUBERNETES_KUBELET_HOST
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: DD_KUBELET_CLIENT_CA
            value: /var/run/kubelet-ca/kubeletserver.crt
          - name: DD_ENV
            value: "dev"
          - name: DD_CLOUD_PROVIDER_METADATA
            value: "azure"
          - name: DD_CRI_SOCKET_PATH
            value: /host/var/run/containerd/containerd.sock

          - name: DD_CLUSTER_AGENT_ENABLED
            value: "true"
          - name: DD_CLUSTER_AGENT_KUBERNETES_SERVICE_NAME
            value: datadog-cluster-agent
          - name: DD_CLUSTER_AGENT_AUTH_TOKEN
            valueFrom:
              secretKeyRef:
                  name: datadog-cluster-agent
                  key: token

          - name: DD_LOG_LEVEL
            value: "INFO"
          - name: DD_APM_ENABLED
            value: "true"
          - name: DD_APM_NON_LOCAL_TRAFFIC
            value: "true"
          - name: DD_APM_RECEIVER_PORT
            value: "8126"
          - name: DD_APM_RECEIVER_SOCKET
            value: "/var/run/datadog/apm.socket"
          - name: DD_DOGSTATSD_SOCKET
            value: "/var/run/datadog/dsd.socket"        
        volumeMounts:
          - name: config
            mountPath: /etc/datadog-agent
            readOnly: true
          - name: auth-token
            mountPath: /etc/datadog-agent/auth
            readOnly: true
          - name: procdir
            mountPath: /host/proc
            mountPropagation: None
            readOnly: true
          - name: cgroups
            mountPath: /host/sys/fs/cgroup
            mountPropagation: None
            readOnly: true
          - name: logdatadog
            mountPath: /var/log/datadog
            readOnly: false # Need RW to write logs
          - name: tmpdir
            mountPath: /tmp
            readOnly: false # Need RW for tmp directory
          - name: dsdsocket
            mountPath: /var/run/datadog
            readOnly: false # Need RW for UDS DSD socket

          - name: runtimesocketdir
            mountPath: /host/var/run/containerd
            mountPropagation: None
            readOnly: true

          - name: kubelet-ca
            mountPath: /var/run/kubelet-ca/kubeletserver.crt
            readOnly: true
          - mountPath: /etc/jboss
            name: jboss-client
            readOnly: false
        livenessProbe:
          initialDelaySeconds: 15
          periodSeconds: 15
          tcpSocket:
            port: 8126
          timeoutSeconds: 5
      - name: process-agent
        image: "gcr.io/datadoghq/agent:7.43.1-jmx"
        imagePullPolicy: IfNotPresent
        command: ["process-agent", "--cfgpath=/etc/datadog-agent/datadog.yaml"]  
        resources:
          {}
        env:
          # Needs to be removed when Agent N-2 is built with Golang 1.17
          - name: GODEBUG
            value: x509ignoreCN=0
          - name: DD_API_KEY
            valueFrom:
              secretKeyRef:
                name: "datadog"
                key: api-key
          - name: DD_AUTH_TOKEN_FILE_PATH
            value: /etc/datadog-agent/auth/token

          - name: DD_CLUSTER_NAME
            value: "dev"
          - name: KUBERNETES
            value: "yes"
          - name: DD_SITE
            value: "datadoghq.eu"
          - name: DD_DD_URL
            value: "https://app.datadoghq.eu"
          - name: DD_KUBERNETES_KUBELET_HOST
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: DD_KUBELET_CLIENT_CA
            value: /var/run/kubelet-ca/kubeletserver.crt
          - name: DD_ENV
            value: "dev"
          - name: DD_CLOUD_PROVIDER_METADATA
            value: "azure"
          - name: DD_CRI_SOCKET_PATH
            value: /host/var/run/containerd/containerd.sock

          - name: DD_CLUSTER_AGENT_ENABLED
            value: "true"
          - name: DD_CLUSTER_AGENT_KUBERNETES_SERVICE_NAME
            value: datadog-cluster-agent
          - name: DD_CLUSTER_AGENT_AUTH_TOKEN
            valueFrom:
              secretKeyRef:
                  name: datadog-cluster-agent
                  key: token

          - name: DD_PROCESS_AGENT_DISCOVERY_ENABLED
            value: "true"
          - name: DD_LOG_LEVEL
            value: "INFO"
          - name: DD_SYSTEM_PROBE_ENABLED
            value: "false"
          - name: DD_DOGSTATSD_SOCKET
            value: "/var/run/datadog/dsd.socket"
          - name: DD_ORCHESTRATOR_EXPLORER_ENABLED
            value: "true"        
        volumeMounts:
          - name: config
            mountPath: /etc/datadog-agent
            readOnly: true
          - name: auth-token
            mountPath: /etc/datadog-agent/auth
            readOnly: true
          - name: dsdsocket
            mountPath: /var/run/datadog
            readOnly: false # Need RW for UDS DSD socket
          - name: logdatadog
            mountPath: /var/log/datadog
            readOnly: false # Need RW to write logs
          - name: tmpdir
            mountPath: /tmp
            readOnly: false # Need RW to write to tmp directory

          - name: os-release-file
            mountPath: /host/etc/os-release
            mountPropagation: None
            readOnly: true

          - name: runtimesocketdir
            mountPath: /host/var/run/containerd
            mountPropagation: None
            readOnly: true

          - name: cgroups
            mountPath: /host/sys/fs/cgroup
            mountPropagation: None
            readOnly: true
          - name: passwd
            mountPath: /etc/passwd
            readOnly: true
          - name: procdir
            mountPath: /host/proc
            mountPropagation: None
            readOnly: true
          - name: kubelet-ca
            mountPath: /var/run/kubelet-ca/kubeletserver.crt
            readOnly: true
          - mountPath: /etc/jboss
            name: jboss-client
            readOnly: false
      initContainers:

      - name: init-volume
        image: "gcr.io/datadoghq/agent:7.43.1-jmx"
        imagePullPolicy: IfNotPresent
        command: ["bash", "-c"]
        args:
          - cp -r /etc/datadog-agent /opt
        volumeMounts:
          - name: config
            mountPath: /opt/datadog-agent
            readOnly: false # Need RW for config path
        resources:
          {}
      - name: init-config
        image: "gcr.io/datadoghq/agent:7.43.1-jmx"
        imagePullPolicy: IfNotPresent
        command:
          - bash
          - -c
        args:
          - for script in $(find /etc/cont-init.d/ -type f -name '*.sh' | sort) ; do bash $script ; done
        volumeMounts:
          - name: logdatadog
            mountPath: /var/log/datadog
            readOnly: false # Need RW to write logs
          - name: config
            mountPath: /etc/datadog-agent
            readOnly: false # Need RW for config path
          - name: procdir
            mountPath: /host/proc
            mountPropagation: None
            readOnly: true

          - name: runtimesocketdir
            mountPath: /host/var/run/containerd
            mountPropagation: None
            readOnly: true
        env:
          # Needs to be removed when Agent N-2 is built with Golang 1.17
          - name: GODEBUG
            value: x509ignoreCN=0
          - name: DD_API_KEY
            valueFrom:
              secretKeyRef:
                name: "datadog"
                key: api-key
          - name: DD_AUTH_TOKEN_FILE_PATH
            value: /etc/datadog-agent/auth/token

          - name: DD_CLUSTER_NAME
            value: "dev"
          - name: KUBERNETES
            value: "yes"
          - name: DD_SITE
            value: "datadoghq.eu"
          - name: DD_DD_URL
            value: "https://app.datadoghq.eu"
          - name: DD_KUBERNETES_KUBELET_HOST
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: DD_KUBELET_CLIENT_CA
            value: /var/run/kubelet-ca/kubeletserver.crt
          - name: DD_ENV
            value: "dev"
          - name: DD_CLOUD_PROVIDER_METADATA
            value: "azure"
          - name: DD_CRI_SOCKET_PATH
            value: /host/var/run/containerd/containerd.sock

        resources:
          {}
      volumes:
      - name: auth-token
        emptyDir: {}
      - name: installinfo
        configMap:
          name: datadog-installinfo
      - name: config
        emptyDir: {}

      - name: logdatadog
        emptyDir: {}
      - name: tmpdir
        emptyDir: {}
      - hostPath:
          path: /proc
        name: procdir
      - hostPath:
          path: /sys/fs/cgroup
        name: cgroups
      - hostPath:
          path: /etc/os-release
        name: os-release-file
      - hostPath:
          path: /var/run/datadog/
          type: DirectoryOrCreate
        name: dsdsocket
      - hostPath:
          path: /etc/kubernetes/certs/kubeletserver.crt
          type: File
        name: kubelet-ca
      - hostPath:
          path: /var/run/datadog/
          type: DirectoryOrCreate
        name: apmsocket
      - name: s6-run
        emptyDir: {}
      - hostPath:
          path: /etc/passwd
        name: passwd
      - hostPath:
          path: /var/lib/datadog-agent/logs
        name: pointerdir
      - hostPath:
          path: /var/log/pods
        name: logpodpath
      - hostPath:
          path: /var/log/containers
        name: logscontainerspath
      - hostPath:
          path: /var/run/containerd
        name: runtimesocketdir
      - azureFile:
          readOnly: false
          secretName: datadog-storage-secret
          shareName: jboss
        name: jboss-client
      tolerations:
      affinity:
        {}
      serviceAccountName: "datadog"
      automountServiceAccountToken: true
      nodeSelector:
        kubernetes.io/os: linux
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 10%
    type: RollingUpdate
---
# Source: datadog/templates/agent-clusterchecks-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: datadog-clusterchecks
  namespace: datadog
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
    app.kubernetes.io/component: clusterchecks-agent

spec:
  replicas: 1
  revisionHistoryLimit: 10
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  selector:
    matchLabels:
      app: datadog-clusterchecks
  template:
    metadata:
      labels:
        app.kubernetes.io/name: "datadog"
        app.kubernetes.io/instance: "datadog"
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/component: clusterchecks-agent
        app: datadog-clusterchecks

      name: datadog-clusterchecks
      annotations:
        checksum/clusteragent_token: __REDACTED__
        checksum/api_key: __REDACTED__
        checksum/install_info: __REDACTED__
    spec:
      serviceAccountName: datadog-cluster-checks
      automountServiceAccountToken: true
      imagePullSecrets:
        []
      initContainers:
      - name: init-volume
        image: "gcr.io/datadoghq/agent:7.43.1-jmx"
        imagePullPolicy: IfNotPresent
        command: ["bash", "-c"]
        args:
          - cp -r /etc/datadog-agent /opt
        volumeMounts:
          - name: config
            mountPath: /opt/datadog-agent
            readOnly: false # Need RW for writing agent config files
        resources:
          {}
      - name: init-config
        image: "gcr.io/datadoghq/agent:7.43.1-jmx"
        imagePullPolicy: IfNotPresent
        command: ["bash", "-c"]
        args:
          - for script in $(find /etc/cont-init.d/ -type f -name '*.sh' | sort) ; do bash $script ; done
        volumeMounts:
          - name: config
            mountPath: /etc/datadog-agent
            readOnly: false # Need RW for writing datadog.yaml config file
        resources:
          {}
      containers:
      - name: agent
        image: "gcr.io/datadoghq/agent:7.43.1-jmx"
        command: ["bash", "-c"]
        args:
          - rm -rf /etc/datadog-agent/conf.d && touch /etc/datadog-agent/datadog.yaml && exec agent run
        imagePullPolicy: IfNotPresent
        env:
          - name: DD_API_KEY
            valueFrom:
              secretKeyRef:
                name: "datadog"
                key: api-key
          - name: KUBERNETES
            value: "yes"
          - name: DD_SITE
            value: "datadoghq.eu"
          - name: DD_DD_URL
            value: "https://app.datadoghq.eu"
          - name: DD_LOG_LEVEL
            value: "INFO"
          - name: DD_EXTRA_CONFIG_PROVIDERS
            value: "clusterchecks"
          - name: DD_HEALTH_PORT
            value: "5557"
          # Cluster checks (cluster-agent communication)
          - name: DD_CLUSTER_AGENT_ENABLED
            value: "true"
          - name: DD_CLUSTER_AGENT_KUBERNETES_SERVICE_NAME
            value: datadog-cluster-agent
          - name: DD_CLUSTER_AGENT_AUTH_TOKEN
            valueFrom:
              secretKeyRef:
                  name: datadog-cluster-agent
                  key: token

          # Safely run alongside the daemonset
          - name: DD_ENABLE_METADATA_COLLECTION
            value: "false"
          # Expose CLC stats
          - name: DD_CLC_RUNNER_ENABLED
            value: "true"
          - name: DD_CLC_RUNNER_HOST
            valueFrom:
              fieldRef:
                fieldPath: status.podIP
          - name: DD_CLC_RUNNER_ID
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          # Remove unused features
          - name: DD_USE_DOGSTATSD
            value: "false"
          - name: DD_PROCESS_AGENT_ENABLED
            value: "false"
          - name: DD_LOGS_ENABLED
            value: "false"
          - name: DD_APM_ENABLED
            value: "false"
          - name: DD_HOSTNAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          - name: DD_CLUSTER_NAME
            value: "dev"                    
        resources:
          {}
        volumeMounts:
          - name: installinfo
            subPath: install_info
            mountPath: /etc/datadog-agent/install_info
            readOnly: true
          - name: config
            mountPath: /etc/datadog-agent
            readOnly: false # Need RW for config path
        livenessProbe:
          failureThreshold: 6
          httpGet:
            path: /live
            port: 5557
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 15
          successThreshold: 1
          timeoutSeconds: 5
        readinessProbe:
          failureThreshold: 6
          httpGet:
            path: /ready
            port: 5557
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 15
          successThreshold: 1
          timeoutSeconds: 5
      volumes:
        - name: installinfo
          configMap:
            name: datadog-installinfo
        - name: config
          emptyDir: {}
      affinity:
        # Prefer scheduling the runners on different nodes if possible
        # for better checks stability in case of node failure.
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 50
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: datadog-clusterchecks
              topologyKey: kubernetes.io/hostname
      nodeSelector:
        kubernetes.io/os: linux
---
# Source: datadog/templates/cluster-agent-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: datadog-cluster-agent
  namespace: datadog
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
    app.kubernetes.io/component: cluster-agent

spec:
  replicas: 1
  revisionHistoryLimit: 10
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  selector:
    matchLabels:
      app: datadog-cluster-agent
  template:
    metadata:
      labels:
        app.kubernetes.io/name: "datadog"
        app.kubernetes.io/instance: "datadog"
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/component: cluster-agent
        app: datadog-cluster-agent

      name: datadog-cluster-agent
      annotations:
        checksum/clusteragent_token: __REDACTED__
        checksum/clusteragent-configmap: __REDACTED__
        checksum/api_key: __REDACTED__
        checksum/application_key: __REDACTED__
        checksum/install_info: __REDACTED__
        ad.datadoghq.com/cluster-agent.logs: |
          [{"source": "datadog-cluster-agent","service": "datadog-cluser-agent","log_processing_rules": [{"type": "multi_line", "name": "log_start_with_special_character","pattern": "^[0-9\\-\\s:]+ [A-Z]{3} \\|"}]}]

    spec:
      serviceAccountName: datadog-cluster-agent
      automountServiceAccountToken: true
      initContainers:
      - name: init-volume
        image: "gcr.io/datadoghq/cluster-agent:7.43.1"
        imagePullPolicy: IfNotPresent
        command:
          - cp
        args:
          - -r
          - /etc/datadog-agent
          - /opt
        volumeMounts:
          - name: config
            mountPath: /opt/datadog-agent
      containers:
      - name: cluster-agent
        image: "gcr.io/datadoghq/cluster-agent:7.43.1"
        imagePullPolicy: IfNotPresent
        resources:
          {}
        ports:
        - containerPort: 5005
          name: agentport
          protocol: TCP
        - containerPort: 5000
          name: agentmetrics
          protocol: TCP
        - containerPort: 8443
          name: metricsapi
          protocol: TCP
        env:
          - name: DD_HEALTH_PORT
            value: "5556"
          - name: DD_API_KEY
            valueFrom:
              secretKeyRef:
                name: "datadog"
                key: api-key
                optional: true

          - name: DD_CLUSTER_NAME
            value: "dev"
          - name: KUBERNETES
            value: "yes"
          - name: DD_SITE
            value: "datadoghq.eu"
          - name: DD_DD_URL
            value: "https://app.datadoghq.eu"
          - name: DD_APP_KEY
            valueFrom:
              secretKeyRef:
                name: "datadog-appkey"
                key: app-key
          - name: DD_EXTERNAL_METRICS_PROVIDER_ENABLED
            value: "true"
          - name: DD_EXTERNAL_METRICS_PROVIDER_PORT
            value: "8443"
          - name: DD_EXTERNAL_METRICS_PROVIDER_WPA_CONTROLLER
            value: "false"
          - name: DD_EXTERNAL_METRICS_PROVIDER_USE_DATADOGMETRIC_CRD
            value: "true"
          - name: DD_EXTERNAL_METRICS_AGGREGATOR
            value: "avg"
          - name: DD_ADMISSION_CONTROLLER_ENABLED
            value: "true"
          - name: DD_ADMISSION_CONTROLLER_MUTATE_UNLABELLED
            value: "false"
          - name: DD_ADMISSION_CONTROLLER_SERVICE_NAME
            value: datadog-cluster-agent-admission-controller
          - name: DD_ADMISSION_CONTROLLER_INJECT_CONFIG_MODE
            value: socket
          - name: DD_ADMISSION_CONTROLLER_INJECT_CONFIG_LOCAL_SERVICE_NAME
            value: datadog
          - name: DD_ADMISSION_CONTROLLER_ADD_AKS_SELECTORS
            value: "true"
          - name: DD_ADMISSION_CONTROLLER_FAILURE_POLICY
            value: "Ignore"
          - name: DD_CLUSTER_CHECKS_ENABLED
            value: "true"
          - name: DD_EXTRA_CONFIG_PROVIDERS
            value: "kube_endpoints kube_services"
          - name: DD_EXTRA_LISTENERS
            value: "kube_endpoints kube_services"
          - name: DD_LOG_LEVEL
            value: "INFO"
          - name: DD_LEADER_ELECTION
            value: "true"
          - name: DD_LEADER_LEASE_DURATION
            value: "15"
          - name: DD_LEADER_LEASE_NAME
            value: datadog-leader-election
          - name: DD_CLUSTER_AGENT_TOKEN_NAME
            value: datadogtoken
          - name: DD_COLLECT_KUBERNETES_EVENTS
            value: "true"
          - name: DD_CLUSTER_AGENT_KUBERNETES_SERVICE_NAME
            value: datadog-cluster-agent
          - name: DD_CLUSTER_AGENT_AUTH_TOKEN
            valueFrom:
              secretKeyRef:
                name: datadog-cluster-agent
                key: token
          - name: DD_CLUSTER_AGENT_COLLECT_KUBERNETES_TAGS
            value: "false"
          - name: DD_KUBE_RESOURCES_NAMESPACE
            value: datadog
          - name: CHART_RELEASE_NAME
            value: "datadog"
          - name: AGENT_DAEMONSET
            value: datadog
          - name: CLUSTER_AGENT_DEPLOYMENT
            value: datadog-cluster-agent
          - name: DD_ORCHESTRATOR_EXPLORER_ENABLED
            value: "true"
          - name: DD_ORCHESTRATOR_EXPLORER_CONTAINER_SCRUBBING_ENABLED
            value: "true"          
          - name: DATADOG_HOST
            value: "https://api.datadoghq.eu/"
          - name: DD_EXTERNAL_METRICS_PROVIDER_MAX_AGE
            value: "900"
          - name: DD_EXTERNAL_METRICS_PROVIDER_BUCKET_SIZE
            value: "900"          
        livenessProbe:
          failureThreshold: 6
          httpGet:
            path: /live
            port: 5556
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 15
          successThreshold: 1
          timeoutSeconds: 5
        readinessProbe:
          failureThreshold: 6
          httpGet:
            path: /ready
            port: 5556
            scheme: HTTP
          initialDelaySeconds: 15
          periodSeconds: 15
          successThreshold: 1
          timeoutSeconds: 5
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            add:
            - NET_BIND_SERVICE
          readOnlyRootFilesystem: true
          runAsNonRoot: true
          runAsUser: 101
        volumeMounts:
          - name: datadogrun
            mountPath: /opt/datadog-agent/run
            readOnly: false
          - name: varlog
            mountPath: /var/log/datadog
            readOnly: false
          - name: tmpdir
            mountPath: /tmp
            readOnly: false
          - name: installinfo
            subPath: install_info
            mountPath: /etc/datadog-agent/install_info
            readOnly: true
          - name: confd
            mountPath: /conf.d
            readOnly: true
          - name: config
            mountPath: /etc/datadog-agent
      volumes:
        - name: datadogrun
          emptyDir: {}
        - name: varlog
          emptyDir: {}
        - name: tmpdir
          emptyDir: {}
        - name: installinfo
          configMap:
            name: datadog-installinfo
        - name: confd
          configMap:
            name: datadog-cluster-agent-confd
            items:
            - key: kubernetes_state_core.yaml.default
              path: kubernetes_state_core.yaml.default
        - name: config
          emptyDir: {}
      affinity:
        # Prefer scheduling the cluster agents on different nodes
        # to guarantee that the standby instance can immediately take the lead from a leader running of a faulty node.
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 50
            podAffinityTerm:
              labelSelector:
                matchLabels:
                  app: datadog-cluster-agent
              topologyKey: kubernetes.io/hostname
      nodeSelector:
        kubernetes.io/os: linux
---
# Source: datadog/templates/agent-apiservice.yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.external.metrics.k8s.io
  labels:
    helm.sh/chart: 'datadog-3.23.0'
    app.kubernetes.io/name: "datadog"
    app.kubernetes.io/instance: "datadog"
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/version: "7"
spec:
  service:
    name: datadog-cluster-agent-metrics-api
    namespace: datadog
    port: 8443
  version: v1beta1
  insecureSkipTLSVerify: true
  group: external.metrics.k8s.io
  groupPriorityMinimum: 100
  versionPriority: 100
CharlyF commented 1 year ago

Thank you, this is very helpful. It does indeed look like the check is mounted in the Cluster Agent, so it should work (it also seems you are using Cluster Check Runners; more on that later). With the current setup, could you run agent status in the leader Datadog Cluster Agent? (If you run agent status in a follower Cluster Agent pod, it prints the pod_name of the leader at the bottom of the output.) I would expect to see something like this:

    kubernetes_state_core
    ---------------------
      Instance ID: kubernetes_state_core:9ecd386c7c831a27 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/kubernetes_state_core.d/kubernetes_state_core.yaml.default
      Total Runs: 5,393
      Metric Samples: Last Run: 1,356, Total: 7,298,705
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 12, Total: 64,656
      Average Execution Time : 19ms
      Last Execution Date : 2023-03-30 12:57:59 UTC (1680181079000)
      Last Successful Execution Date : 2023-03-30 12:57:59 UTC (1680181079000)

The pod's logs could also be helpful. I want to solve this first, but eventually I would also recommend enabling datadog.kubeStateMetricsCore.useClusterCheckRunners, since distributing the check onto the cluster check runners is the best practice.
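For reference, a minimal values fragment that enables KSM Core and distributes it onto cluster check runners might look like the following. This is a sketch based on the option named above; the clusterChecksRunner.enabled key is an assumption, so verify the exact names against the chart's values.yaml:

```yaml
datadog:
  kubeStateMetricsEnabled: false      # disable the legacy kubernetes_state check
  kubeStateMetricsCore:
    enabled: true
    useClusterCheckRunners: true      # dispatch KSM Core as a cluster check
clusterChecksRunner:
  enabled: true                       # assumption: deploys the cluster check runner pods
```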

PSanetra commented 1 year ago

@CharlyF I have now re-enabled the legacy kube state metrics, but back when I still had kubeStateMetricsCore enabled, I did run that command, and the kubernetes_state_core check was missing from the output. The file /etc/datadog-agent/conf.d/kubernetes_state_core.d/kubernetes_state_core.yaml.default was missing too, but it was present at /conf.d/kubernetes_state_core.yaml.default. Does this make sense?

I can check it again when I have time.

CharlyF commented 1 year ago

Understood. Unfortunately I am not able to reproduce this, so we'd be happy to dig into it further.

In the meantime, we released the Datadog Operator 1.0. If you want to give it a try, I believe the DatadogAgent manifest you would need for the same feature set and config is:

kind: DatadogAgent
apiVersion: datadoghq.com/v2alpha1
metadata:
  name: datadog
  namespace: datadog
spec:
  features:
    orchestratorExplorer:
      enabled: true
      scrubContainers: true
    clusterChecks:
      enabled: true
      useClusterChecksRunners: true
    externalMetricsServer:
      enabled: true
      useDatadogMetrics: false
    admissionController:
      enabled: true
    logCollection:
      enabled: true
      containerCollectAll: true
    kubeStateMetricsCore:
      enabled: true
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
      appKey: <DATADOG_APP_KEY>
  override:
    nodeAgent:
      image:
        jmxEnabled: true
NasAmin commented 1 year ago

Any update on this? I am using a near-latest chart (v3.25.5) and am not seeing any kubernetes_state.* metrics. One additional difference in my setup: I forward metrics to Vector, which then pushes them to Datadog.

willianccs commented 1 year ago

I'm facing a similar problem after updating the Helm chart from datadog-3.25.0 to datadog-3.29.3. The metric kubernetes_state.hpa.desired_replicas has disappeared.

I had to roll back the cluster-agent image to version 7.43.1 to solve the problem.

hsalluri259 commented 1 year ago

@PSanetra

I was facing the same issue for the last two days. I was missing the flag below in my Helm values, which is why KSM Core was missing from my clusterchecks pod output. Once I set the flag to enabled: true, I could see KSM Core in the leader clusterchecks pod.
https://github.com/DataDog/helm-charts/blob/main/charts/datadog/values.yaml#L217
https://github.com/DataDog/helm-charts/blob/e3133172449038caaca4c18342fecd2976be377a/charts/datadog/values.yaml#LL1672C5-L1672C5 
✗ k exec -it datadog-clusterchecks-6c5cdf9998-ns8pr agent status 
=========
Collector
=========

  Running Checks
  ==============

    kubernetes_state_core
    ---------------------
      Instance ID: kubernetes_state_core:32616faf0d26bb34 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/kubernetes_state_core.yaml.default
      Total Runs: 108
      Metric Samples: Last Run: 4,177, Total: 451,160
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 18, Total: 1,944
      Average Execution Time : 29ms
      Last Execution Date : 2023-06-16 22:11:36 UTC (1686953496000)
      Last Successful Execution Date : 2023-06-16 22:11:36 UTC (1686953496000)
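Based on my reading of the linked values.yaml lines, the missing toggle was the cluster checks feature itself. A hedged values sketch (the exact key is an assumption; confirm it against the lines linked above):

```yaml
datadog:
  clusterChecks:
    enabled: true    # assumption: required so KSM Core can be dispatched to clusterchecks pods
```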
willianccs commented 1 year ago

Related issue: https://github.com/DataDog/datadog-agent/issues/17843

superkartoffel commented 5 months ago

I noticed this log from the cluster-agent after startup:

cp: cannot create regular file '/etc/datadog-agent/conf.d/kubernetes_state_core.yaml.default': Permission denied
'/conf.d/kubernetes_state_core.yaml.default' -> '/etc/datadog-agent/conf.d/kubernetes_state_core.yaml.default'
cp: cannot create regular file '/etc/datadog-agent/conf.d/kubernetes_state_core.yaml.default': Permission denied
'/conf.d/kubernetes_state_core.yaml.default' -> '/etc/datadog-agent/conf.d/kubernetes_state_core.yaml.default'

The file permissions on this folder are set so that only root has write access:

# ls -al /etc/datadog-agent/       
total 92
drwxrwxrwx  5 root     root  4096 Apr 11 08:37 .
drwxr-sr-x 35 root     root  4096 Mar 20 19:08 ..
-rw-------  1 dd-agent root    64 Apr 11 08:37 auth_token
drwxr-xr-x  2 root     root  4096 Apr 11 08:37 certificates
drwxr-xr-x  2 root     root 12288 Apr 11 08:37 compliance.d
drwxr-xr-x  4 root     root  4096 Apr 11 08:37 conf.d
-rw-r--r--  1 root     root 53989 Apr 11 08:37 datadog-cluster.yaml
-rw-r--r--  1 root     root    90 Apr 11 08:37 install_info

So a workaround for this issue is to run the agent as root. With this Helm chart, that means not setting runAsUser: 101 in the security context (https://docs.datadoghq.com/containers/kubernetes/installation/?tab=helm#unprivileged).

superkartoffel commented 5 months ago

Sorry, I should clarify: as far as I understand, datadog.securityContext only affects the DaemonSet (the node agents). We therefore also set clusterAgent.containers.clusterAgent.securityContext, which does not work.

You can, however, set clusterAgent.securityContext, which applies the securityContext to both the cluster agent container and its init container.

Jrc356 commented 2 days ago

@superkartoffel did setting the clusterAgent.securityContext solve the copy error?

I am also not getting any core metrics, and it looks like the agent isn't running the check because of the failed cp.

tomas-checkatrade commented 1 day ago

I'm having exactly the same issue: the init container fails to copy the configuration. The latest Helm chart still has this problem.

cp: cannot create regular file '/etc/datadog-agent/conf.d/kubernetes_state_core.yaml.default': Permission denied

For anyone looking for a solution, this is the fix:

clusterAgent.securityContext.runAsUser: 100
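Expressed as a Helm values fragment, that fix would look like this (a sketch; whether 100 is the correct unprivileged user for your cluster-agent image is an assumption worth verifying):

```yaml
clusterAgent:
  securityContext:
    runAsUser: 100   # assumed user with write access to /etc/datadog-agent in the cluster-agent image
```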