DataDog / integrations-extras

Community developed integrations and plugins for the Datadog Agent.
BSD 3-Clause "New" or "Revised" License

Exim integration gives no metric data #2209

Open cw-Widad opened 12 months ago

cw-Widad commented 12 months ago

Output of the info page

    ------------
      Instance ID: exim:c933403ba680af29 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/exim.d/conf.yaml
      Total Runs: 2,942
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 2,942
      Average Execution Time : 1ms
      Last Execution Date : 2023-11-30 13:13:41 CET / 2023-11-30 12:13:41 UTC (1701346421000)
      Last Successful Execution Date : 2023-11-30 13:13:41 CET / 2023-11-30 12:13:41 UTC (1701346421000)

Additional environment details (Operating System, Cloud provider, etc): Ubuntu 22.04.1 on Google Cloud Platform

Describe the results you received: No metrics are received even when the queue has messages. The output of running exim -bp | exiqsumm is:

Count  Volume  Oldest  Newest  Domain
-----  ------  ------  ------  ------

  680  1312KB     69h     19h  foobar.xyz
---------------------------------------------------------------
  680  1312KB     69h     19h  TOTAL

The log file /var/log/datadog/agent.log, with the log level set to debug, shows these results:

2023-11-30 12:38:41 CET | CORE | DEBUG | (pkg/collector/python/check.go:88 in runCheck) | Running python check exim (version: '1.0.0', id: 'exim:c933403ba680af29')
2023-11-30 12:38:41 CET | CORE | DEBUG | (pkg/collector/python/datadog_agent.go:135 in LogMessage) | exim:c933403ba680af29 | (subprocess_output.py:55) | Running get_subprocess_output with cmd: ['exim -bp', '|', 'exiqsumm']
2023-11-30 12:38:41 CET | CORE | DEBUG | (pkg/collector/python/datadog_agent.go:135 in LogMessage) | exim:c933403ba680af29 | (subprocess_output.py:59) | get_subprocess_output returned (len(out): 0 ; len(err): 0 ; returncode: 0)

Describe the results you expected: Expected results with a Count of 680 as shown in command line execution above and a value of 1312000 for volume.
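To make the expected values concrete, here is a minimal sketch of how the TOTAL row of the exiqsumm output above could be turned into a count and a volume in bytes. It is not the integration's actual parsing code. The function name `parse_exiqsumm_total` and the decimal unit multipliers are assumptions, chosen so that 1312KB maps to the 1312000 expected here.

```python
import re

# Assumed unit multipliers: decimal, so that 1312KB maps to the 1312000
# expected in this report. The real integration may convert differently.
UNIT_MULTIPLIERS = {"": 1, "KB": 1_000, "MB": 1_000_000, "GB": 1_000_000_000}

# Matches a summary row such as "  680  1312KB     69h     19h  TOTAL".
ROW = re.compile(r"^\s*(\d+)\s+(\d+)([KMG]B)?\s+\S+\s+\S+\s+(\S+)\s*$")


def parse_exiqsumm_total(output: str):
    """Return (count, volume_in_bytes) from the TOTAL row, or None if absent."""
    for line in output.splitlines():
        match = ROW.match(line)
        if match and match.group(4) == "TOTAL":
            count = int(match.group(1))
            volume = int(match.group(2)) * UNIT_MULTIPLIERS[match.group(3) or ""]
            return count, volume
    return None


SAMPLE = """\
Count  Volume  Oldest  Newest  Domain
-----  ------  ------  ------  ------

  680  1312KB     69h     19h  foobar.xyz
---------------------------------------------------------------
  680  1312KB     69h     19h  TOTAL
"""

print(parse_exiqsumm_total(SAMPLE))  # (680, 1312000)
print(parse_exiqsumm_total(""))      # None
```

The empty-string case mirrors what the debug log shows (len(out): 0): with no output there is no TOTAL row, so nothing can be reported.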

Additional information you deem important (e.g. issue happens only occasionally):

Some debugging was done on the code and, as far as I can see, it is a bug in the code. The metric is created by a command called in exim/datadog_checks/exim/check.py.
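One plausible reading of the debug log is that the pipe never happens: the command is passed as a list (['exim -bp', '|', 'exiqsumm']), and when a command runs from an argument list without a shell, "|" is just another argument. The standalone sketch below illustrates that general behaviour with Python's own subprocess module, outside the Agent. It does not reproduce the Agent's get_subprocess_output, and the child program here is a stand-in, not exim.

```python
import subprocess
import sys

# A tiny child program that echoes back the arguments it was given.
ECHO_ARGS = "import sys; print(sys.argv[1:])"

# With an argument list and no shell, "|" is handed to the child as an
# ordinary string; no pipeline is created and exiqsumm is never started.
result = subprocess.run(
    [sys.executable, "-c", ECHO_ARGS, "|", "exiqsumm"],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.strip()
print(received)  # ['|', 'exiqsumm']
```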

cw-Widad commented 11 months ago

@JeanFred

JeanFred commented 11 months ago

Hi @cw-Widad, thanks for the detailed bug report. I'm afraid that since #1326, we have stopped using exim at $dayJob, so I won't be able to work on this or test any fix :-/

Regarding your notes:

cw-Widad commented 11 months ago

Thanks for your reply @JeanFred

I am not sure that piped commands are the way to go here, since the Datadog documentation advises against using the Python subprocess module directly. From the documentation: "Since the Python interpreter that runs the checks is embedded in the multi-threaded Go runtime, using the subprocess or multithreading modules from the Python standard library is not supported in Agent version 6 and later." Unfortunately, this means that exiqsumm won't be usable at all, because exiqsumm only works on piped input.

I have tested using the exim -bpc command to get just the total number of messages in the queue. I will try rewriting the code for exim.queue.count using this and create a PR.
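A rough sketch of the idea, not the eventual PR: exim -bpc prints a single integer, so no pipe is needed. The names `exim_queue_count`, `Runner`, and `fake_runner` are hypothetical. In a real check, the runner would presumably wrap the Agent's get_subprocess_output, which is assumed here to return (out, err, returncode), as the debug log above suggests.

```python
from typing import Callable, List, Tuple

# Hypothetical runner type: takes an argument list, returns (out, err, returncode).
Runner = Callable[[List[str]], Tuple[str, str, int]]


def exim_queue_count(run: Runner) -> int:
    """Return the number of queued messages reported by `exim -bpc`."""
    out, err, returncode = run(["exim", "-bpc"])
    if returncode != 0:
        raise RuntimeError(f"exim -bpc failed ({returncode}): {err.strip()}")
    # `exim -bpc` prints a single integer followed by a newline.
    return int(out.strip())


def fake_runner(cmd: List[str]) -> Tuple[str, str, int]:
    # Stand-in for a real command runner so the sketch runs without exim.
    assert cmd == ["exim", "-bpc"]
    return "680\n", "", 0


print(exim_queue_count(fake_runner))  # 680
```

Passing the runner in as a parameter keeps the parsing testable without exim installed. It only covers exim.queue.count; the per-domain volume figures from exiqsumm are not available this way.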