DataDog / helm-charts

Helm charts for Datadog products
Apache License 2.0

[redisdb] checks autodiscovery fail #520

Open millad90s opened 2 years ago

millad90s commented 2 years ago

Following this page, I enabled Autodiscovery on the Datadog Agent, then installed Redis via its Helm chart on Kubernetes and added podLabels and podAnnotations to the Helm values. The Datadog Agent picks up the labels correctly, and also the annotations, except for ad.datadoghq.com/redis.instances: '[{"ad_identifiers": "redis","host": "%%host%%","port":"6378","password":"%%env_REDIS_PASSWORD%%"}]'. I also tried ignoreAutoConfig: [ "redisdb" ], but then the Datadog Agent ignored the redisdb check entirely.

Describe what you expected: The Datadog Agent should read the redisdb check config from the annotations and use it to connect to Redis.

But the Datadog Agent still reads its own default config for redisdb (conf.d/redisdb.d/auto_conf.yaml) and tries to connect to port 6379, while my Redis is running on port 6378.

Steps to reproduce the issue:

ad_identifiers:
  - redis
init_config:
instances:
  - host: '%%host%%'
    port: 6379

I also ran this command to check services: agent status

redis-test/redis-master-0/redis

- Type: file
  Identifier: cab8caa461e93b7e7081c9011ad62bded73aafb29b7940a930db8696e415d12b
  Path: /var/log/pods/redis-test_redis-master-0_fcbb1a32-313d-48a7-a6e4-340127719b90/redis/*.log
  Status: OK
    1 files tailed out of 1 files matching
  Inputs:
    /var/log/pods/redis-test_redis-master-0_fcbb1a32-313d-48a7-a6e4-340127719b90/redis/0.log
  BytesRead: 0
  Average Latency (ms): 0
  24h Average Latency (ms): 0
  Peak Latency (ms): 0
  24h Peak Latency (ms): 0

redisdb (4.2.0)

  Instance ID: redisdb:89976314c9e14a65 [ERROR]
  Configuration Source: file:/etc/datadog-agent/conf.d/redisdb.d/auto_conf.yaml
  Total Runs: 2
  Metric Samples: Last Run: 0, Total: 0
  Events: Last Run: 0, Total: 0
  Service Checks: Last Run: 1, Total: 2
  Average Execution Time : 24ms
  Last Execution Date : 2022-01-20 08:40:20 UTC (1642668020000)
  Last Successful Execution Date : Never
  Error: Error 111 connecting to 10.110.34.149:6379. Connection refused.
  Traceback (most recent call last):
    File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py", line 559, in connect
      sock = self._connect()
    File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py", line 615, in _connect
      raise err
    File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py", line 603, in _connect
      sock.connect(socket_address)
  ConnectionRefusedError: [Errno 111] Connection refused

**Additional environment details (Operating System, Cloud provider, etc):**
I tested this issue on both EKS and GKE. 
helm version:
version.BuildInfo{Version:"v3.7.2", GitCommit:"663a896f4a815053445eec4153677ddc24a0a361", GitTreeState:"clean", GoVersion:"go1.16.10"}
kubectl version:
Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.2", GitCommit:"092fbfbf53427de67cac1e9fa54aaa09a28371d7", GitTreeState:"clean", BuildDate:"2021-06-16T12:59:11Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"21+", GitVersion:"v1.21.2-eks-06eac09", GitCommit:"5f6d83fe4cb7febb5f4f4e39b3b2b64ebbbe3e97", GitTreeState:"clean", BuildDate:"2021-09-13T14:20:15Z", GoVersion:"go1.16.5", Compiler:"gc", Platform:"linux/amd64"}
millad90s commented 2 years ago

I fixed this by changing the annotation from ad.datadoghq.com/redis.check_name to ad.datadoghq.com/redis.check_names.

But there is another issue now: the Datadog Agent is not able to resolve the variable env_REDIS_PASSWORD!
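For reference, a minimal sketch of pod annotations using the plural check_names key might look like the following. The container name redis and port 6378 come from the report above. The top-level podAnnotations key is an assumption and depends on the Redis chart in use.

```yaml
# Sketch of Redis chart values. The key that holds pod annotations
# (here `podAnnotations`) is an assumption; check your chart's values.
podAnnotations:
  # `redis` in each key must match the container name inside the pod.
  ad.datadoghq.com/redis.check_names: '["redisdb"]'
  ad.datadoghq.com/redis.init_configs: '[{}]'
  # ad_identifiers is left out: with annotations, the container name
  # in the key already selects the target container.
  ad.datadoghq.com/redis.instances: |
    [{"host": "%%host%%", "port": "6378", "password": "%%env_REDIS_PASSWORD%%"}]
```

The three annotations are JSON lists matched by position, so each entry in check_names needs a corresponding entry in init_configs and instances.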

In the Datadog Agent, I ran agent configcheck -v. Output:

=== Resolve warnings ===

redisdb

* error resolving template redisdb for service docker://6a05409fb4cfb8376b9ee3ae36576c7114f1cc66076faea7d899fefe97a67118: ignoring config from file:/etc/datadog-agent/conf.d/redisdb.d/auto_conf.yaml: another config is defined for the check redisdb
* Can't resolve the template for redisdb at this moment.
* error resolving template redisdb for service docker://6a05409fb4cfb8376b9ee3ae36576c7114f1cc66076faea7d899fefe97a67118: failed to retrieve envvar REDIS_PASSWORD, skipping service docker://6a05409fb4cfb8376b9ee3ae36576c7114f1cc66076faea7d899fefe97a67118
* Can't resolve the template for redisdb at this moment.
clamoriniere commented 2 years ago

Hi @millad90s

Is the env var present in the agent container of the Datadog pod, or in the container of your Redis pod?

The env var should be in the agent container.
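A minimal sketch of how the variable could be exposed to the Agent through the Datadog chart's values follows. It assumes the password is stored in a Kubernetes secret named redis under the key redis-password (both names hypothetical) in the same namespace as the Agent, and that the chart's datadog.env list is passed through to the agent container.

```yaml
# Datadog chart values (sketch, under the assumptions stated above).
datadog:
  env:
    - name: REDIS_PASSWORD        # resolved by %%env_REDIS_PASSWORD%%
      valueFrom:
        secretKeyRef:
          name: redis             # hypothetical secret name
          key: redis-password     # hypothetical key inside the secret
```

Because secretKeyRef only reads secrets from the pod's own namespace, a secret created by the Redis release in another namespace would have to be copied or recreated next to the Agent.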

If, for security reasons, you don't want to expose it as an env var in the agent, the Agent and the Helm chart provide an option to store the password in a Kubernetes secret: https://docs.datadoghq.com/agent/guide/secrets-management/

Please let us know if this solves the issue. Regards
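As a rough sketch of that approach, the annotation could reference the secret with an ENC[] handle instead of an env var. This assumes an Agent image that ships the /readsecret_multiple_providers.sh helper described in the linked guide, and a chart version that exposes datadog.secretBackend.command. The namespace redis-test is taken from the status output above; the secret name and key are hypothetical.

```yaml
# Datadog chart values (assumed key names; verify against your chart version).
datadog:
  secretBackend:
    command: "/readsecret_multiple_providers.sh"
---
# Redis chart values (the `podAnnotations` key is an assumption).
podAnnotations:
  ad.datadoghq.com/redis.instances: |
    [{"host": "%%host%%", "port": "6378", "password": "ENC[k8s_secret@redis-test/redis/redis-password]"}]
```

With this layout the Agent's service account also needs permission to read that secret, so the guide should be checked for the RBAC setup that matches your Agent version.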

nshakhat commented 2 years ago

I have the same issue with an ECS deployment. The error is the following:

2022-02-21 20:03:48 UTC | CORE | WARN | (pkg/autodiscovery/autoconfig.go:544 in resolveTemplateForService) | error resolving template redisdb for service docker://***a1e81-1539673652: ignoring config from file:/etc/datadog-agent/conf.d/redisdb.d/auto_conf.yaml: another config is defined for the check redisdb

Labels from the app container are:

DockerLabels:
  com.datadoghq.ad.check_names: '["redisdb"]'
  com.datadoghq.ad.init_configs: '[{}]'
  com.datadoghq.ad.instances: "[{\"host\":\"%%host%%\",\"port\":\"6379\"}]"

clamoriniere commented 2 years ago

Hi @nshakhat

It doesn't seem to be the same issue, because in the initial issue the problem was about resolving an env var in the check configuration.

Could you share the output of agent status and agent configcheck? Also, maybe it is just a typo in your com.datadoghq.ad.instances label, but it should be com.datadoghq.ad.instances: '[{"host":"%%host%%","port":"6379"}]' instead of com.datadoghq.ad.instances: "[{"host":"%%host%%","port":"6379"}]"

Regards