Open karthikeayan opened 2 years ago
I'm having a similar problem here, although slightly different scenario. Wanting to only capture Postgres metrics, but am finding that the agent is capturing system metrics despite removing everything else in conf.d.
We are following the same idea: an agent running in EKS that only does the RDS checks.
Agent status looks good so far:
kubectl exec -it
===============
Agent (v7.37.1)
===============
=========
Collector
=========
Running Checks
==============
postgres (12.4.0)
-----------------
Instance ID: postgres:6cb55c36780909a7 [OK]
Configuration Source: file:/etc/datadog-agent/conf.d/postgres.d/conf.yaml
Total Runs: 10
Metric Samples: Last Run: 305, Total: 2,881
Events: Last Run: 0, Total: 0
Database Monitoring Activity Samples: Last Run: 1, Total: 13
Database Monitoring Query Metrics: Last Run: 2, Total: 14
Database Monitoring Query Samples: Last Run: 3, Total: 237
Service Checks: Last Run: 1, Total: 10
Average Execution Time : 35ms
Last Execution Date : 2022-07-07 09:19:08 UTC (1657185548000)
Last Successful Execution Date : 2022-07-07 09:19:08 UTC (1657185548000)
metadata:
version.major: 12
version.minor: 8
version.patch: 0
version.raw: 12.8
version.scheme: semver
We also remove the standard checks with a little bit of force, as there is no variable to toggle this:
lifecycle:
  postStart:
    exec:
      command: ["/bin/sh", "-c", "find /etc/datadog-agent/conf.d/ -iname '*.yaml.default' -delete"]
So far so good.
I am now trying to fix these issues from the agent's logs:
2022-07-07 09:16:49 UTC | CORE | WARN | (pkg/util/log/log.go:591 in func1) | Agent configuration relax permissions constraint on the secret backend cmd, Group can read and exec
WARNING: `--config` argument is deprecated and will be removed in a future version. Please use `--cfgpath` instead.
2022-07-07 09:16:49 UTC | PROCESS | WARN | (pkg/util/log/log.go:591 in func1) | Agent configuration relax permissions constraint on the secret backend cmd, Group can read and exec
2022-07-07 09:16:49 UTC | PROCESS | WARN | (pkg/util/log/log.go:591 in func1) | Agent configuration relax permissions constraint on the secret backend cmd, Group can read and exec
2022-07-07 09:16:49 UTC | SYS-PROBE | WARN | (pkg/util/log/log.go:591 in func1) | Error loading config: open /etc/datadog-agent/system-probe.yaml: no such file or directory
2022-07-07 09:16:49 UTC | SYS-PROBE | WARN | (pkg/util/log/log.go:591 in func1) | Agent configuration relax permissions constraint on the secret backend cmd, Group can read and exec
2022-07-07 09:16:49 UTC | SECURITY | WARN | (pkg/util/log/log.go:591 in func1) | Agent configuration relax permissions constraint on the secret backend cmd, Group can read and exec
2022-07-07 09:16:51 UTC | CORE | WARN | (pkg/serializer/serializer.go:144 in NewSerializer) | event payloads are disabled: all events will be dropped
2022-07-07 09:16:51 UTC | CORE | WARN | (pkg/serializer/serializer.go:147 in NewSerializer) | series payloads are disabled: all series will be dropped
2022-07-07 09:16:51 UTC | CORE | WARN | (pkg/serializer/serializer.go:150 in NewSerializer) | service_checks payloads are disabled: all service_checks will be dropped
2022-07-07 09:16:51 UTC | CORE | WARN | (pkg/serializer/serializer.go:153 in NewSerializer) | sketches payloads are disabled: all sketches will be dropped
2022-07-07 09:16:51 UTC | CORE | WARN | (pkg/secrets/secrets.go:50 in Init) | Agent configuration relax permissions constraint on the secret backend cmd, Group can read and exec
2022-07-07 09:16:52 UTC | CORE | WARN | (pkg/autodiscovery/providers/config_reader.go:156 in read) | Skipping, open /opt/datadog-agent/bin/agent/dist/conf.d: no such file or directory
2022-07-07 09:16:52 UTC | CORE | WARN | (pkg/autodiscovery/providers/config_reader.go:156 in read) | Skipping, open : no such file or directory
2022-07-07 09:16:52 UTC | CORE | ERROR | (pkg/collector/scheduler.go:76 in Schedule) | Unable to run Check postgres: a check with ID postgres:6cb55c36780909a7 is already running
2022-07-07 09:16:53 UTC | CORE | WARN | (pkg/util/cloudproviders/gce/gce_tags.go:50 in getCachedTags) | unable to get tags from gce and cache is empty: GCE metadata API error: status code 401 trying to GET http://169.254.169.254/computeMetadata/v1/?recursive=true
2022-07-07 09:16:53 UTC | TRACE | WARN | (pkg/util/log/log.go:591 in func1) | Agent configuration relax permissions constraint on the secret backend cmd, Group can read and exec
system-probe exited with code 0, disabling
trace-agent exited with code 0, disabling
2022-07-07 09:17:21 UTC | CORE | ERROR | (pkg/metrics/iterable_series.go:55 in Append) | Cannot append a serie in a closed buffered channel
2022-07-07 09:19:13 UTC | PROCESS | WARN | (pkg/util/cloudproviders/gce/gce_tags.go:50 in getCachedTags) | unable to get tags from gce and cache is empty: GCE metadata API error: status code 401 trying to GET http://169.254.169.254/computeMetadata/v1/?recursive=true
2022-07-07 09:19:36 UTC | CORE | ERROR | (pkg/metrics/iterable_series.go:55 in Append) | Cannot append a serie in a closed buffered channel
UPDATE: after enabling DD_ENABLE_PAYLOADS_SERIES, these errors went away:
2022-07-07 09:19:36 UTC | CORE | ERROR | (pkg/metrics/iterable_series.go:55 in Append) | Cannot append a serie in a closed buffered channel
setting
2022-07-07 09:19:13 UTC | PROCESS | WARN | (pkg/util/cloudproviders/gce/gce_tags.go:50 in getCachedTags) | unable to get tags from gce and cache is empty: GCE metadata API error: status code 401 trying to GET
liveness Probe got rid of this
2022-07-07 10:35:04 UTC | CORE | ERROR | (pkg/collector/scheduler.go:76 in Schedule) | Unable to run Check postgres: a check with ID postgres:6cb55c36780909a7 is already running
Update: setting
2022-07-07 09:16:49 UTC | SYS-PROBE | WARN | (pkg/util/log/log.go:591 in func1) | Agent configuration relax permissions constraint on the secret backend cmd, Group can read and exec
Might be more of an implementation detail, but because the datadog-agent container uses s6, there are some hooks where a user can dynamically mount shell scripts into /etc/cont-init.d, which would have more of a guaranteed order of execution than what is provided by postStart:
There is no guarantee, however, that the postStart handler is called before the Container's entrypoint is called
So, the solution we took was to define a 99-delete-default-checks.sh with the same contents and mount it there.
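For anyone wondering what such a script could look like, here is a minimal sketch. It assumes the script simply wraps the find command from the postStart hook quoted earlier in the thread; the function name delete_default_checks and the optional directory argument are illustrative additions, not something confirmed by the original author.

```shell
#!/bin/sh
# Sketch of a 99-delete-default-checks.sh init script. The contents are
# assumed from the postStart command earlier in this thread.

# delete_default_checks DIR: remove every *.yaml.default file under DIR.
# (Function name is hypothetical, introduced only for this example.)
delete_default_checks() {
    dir="$1"
    # Nothing to do if the directory is missing.
    [ -d "$dir" ] || return 0
    # The pattern is quoted so the shell passes it to find unexpanded.
    find "$dir" -iname '*.yaml.default' -delete
}

# Default to the conf.d path used elsewhere in this thread.
delete_default_checks "${1:-/etc/datadog-agent/conf.d}"
```

Mounted into /etc/cont-init.d, a script like this would presumably run before the agent starts, which is the ordering guarantee that postStart lacks.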
Would it be a useful feature to consider adding this as an init script and then exposing it via a DD_DISABLE_DEFAULT_CHECKS (or something like it) environment variable?
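To make the proposal concrete, the gating could be sketched roughly as follows. DD_DISABLE_DEFAULT_CHECKS is only the name suggested above and is not an existing Agent option as far as this thread shows; the accepted values and the helper name default_checks_disabled are likewise assumptions.

```shell
#!/bin/sh
# Hypothetical init script: DD_DISABLE_DEFAULT_CHECKS is the variable name
# proposed in this thread, not a documented Agent setting.

# Succeeds (returns 0) only when the variable is set to a "true" value.
default_checks_disabled() {
    case "${DD_DISABLE_DEFAULT_CHECKS:-false}" in
        true|TRUE|True|1) return 0 ;;
        *) return 1 ;;
    esac
}

# Delete the bundled *.yaml.default configs only when explicitly requested.
if default_checks_disabled; then
    find /etc/datadog-agent/conf.d -iname '*.yaml.default' -delete
fi
```

Leaving the variable unset keeps the current behaviour, so existing deployments would be unaffected.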
@clatour could you please explain in detail what the content of the 99-delete-default-checks.sh script is? In my case, I'm using a Helm chart to install Datadog on k8s just to scrape MySQL metrics, and I'm getting unwanted k8s metrics that I need to turn off. I tried this, but it didn't work:
--set 'datadog.kubeStateMetricsCore.enabled=false' \
--set 'kube-state-metrics.serviceAccount.create=false' \
Describe what happened: Unable to deploy Datadog Container Agent as pod with only custom checks.
I have deployed the Datadog Kubernetes Helm chart in the Kubernetes cluster. Datadog created a DaemonSet that deploys a pod on each node and pulls metrics from each node. I also want to deploy another Datadog agent as a pod that runs only custom checks such as mysql and postgres. It should not collect metrics from the host it is running on, as host metrics will already be collected by the DaemonSet.
Host metrics are tagged to the new host with Kubernetes pod name.
When I follow this, https://docs.datadoghq.com/logs/guide/how-to-set-up-only-logs/?tab=kubernetes, no metrics are sent to Datadog at all: neither host metrics nor the custom check metrics.
Describe what you expected: Host should not appear in infrastructure list.
Steps to reproduce the issue: Deploy the Datadog Helm chart, then create a deployment with the values below.