DataDog / datadog-agent

Main repository for Datadog Agent
https://docs.datadoghq.com/
Apache License 2.0

TCP Check doesn't gracefully fail #6459

Open Mnkras opened 4 years ago

Mnkras commented 4 years ago

Output of the info page (if this is a bug)

===============
Agent (v7.22.0)
===============

  Status date: 2020-09-26 19:42:08.763137 UTC
  Agent start: 2020-09-22 21:01:40.644554 UTC
  Pid: 2276
  Go Version: go1.13.11
  Python Version: 3.8.5
  Build arch: amd64
  Agent flavor: agent
  Check Runners: 4
  Log Level: info
<<SNIP>>
  Check Initialization Errors
  ===========================

      tcp_check (2.5.0)
      -----------------

      instance 0:

        could not invoke 'tcp_check' python check constructor. New constructor API returned:
Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 73, in __init__
    self.resolve_ip()
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 79, in resolve_ip
    self.addr = socket.gethostbyname(self.url)
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 76, in __init__
    raise ConfigurationError(msg)
datadog_checks.base.errors.ConfigurationError: URL: <<censored>> is not a correct IPv4, IPv6 or hostname
Deprecated constructor API returned:
__init__() got an unexpected keyword argument 'agentConfig'

      instance 1:

        could not invoke 'tcp_check' python check constructor. New constructor API returned:
Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 73, in __init__
    self.resolve_ip()
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 79, in resolve_ip
    self.addr = socket.gethostbyname(self.url)
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 76, in __init__
    raise ConfigurationError(msg)
datadog_checks.base.errors.ConfigurationError: URL: <<censored>> is not a correct IPv4, IPv6 or hostname
Deprecated constructor API returned:
__init__() got an unexpected keyword argument 'agentConfig'

      instance 2:

        could not invoke 'tcp_check' python check constructor. New constructor API returned:
Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 73, in __init__
    self.resolve_ip()
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 79, in resolve_ip
    self.addr = socket.gethostbyname(self.url)
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 76, in __init__
    raise ConfigurationError(msg)
datadog_checks.base.errors.ConfigurationError: URL: <<censored>> is not a correct IPv4, IPv6 or hostname
Deprecated constructor API returned:
__init__() got an unexpected keyword argument 'agentConfig'

      instance 3:

        could not invoke 'tcp_check' python check constructor. New constructor API returned:
Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 73, in __init__
    self.resolve_ip()
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 79, in resolve_ip
    self.addr = socket.gethostbyname(self.url)
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 76, in __init__
    raise ConfigurationError(msg)
datadog_checks.base.errors.ConfigurationError: URL: <<censored>> is not a correct IPv4, IPv6 or hostname
Deprecated constructor API returned:
__init__() got an unexpected keyword argument 'agentConfig'

      instance 4:

        could not invoke 'tcp_check' python check constructor. New constructor API returned:
Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 73, in __init__
    self.resolve_ip()
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 79, in resolve_ip
    self.addr = socket.gethostbyname(self.url)
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 76, in __init__
    raise ConfigurationError(msg)
datadog_checks.base.errors.ConfigurationError: URL: <<censored>>.com is not a correct IPv4, IPv6 or hostname
Deprecated constructor API returned:
__init__() got an unexpected keyword argument 'agentConfig'
  Loading Errors
  ==============
    tcp_check
    ---------
      Core Check Loader:
        Check tcp_check not found in Catalog

      JMX Check Loader:
        check is not a jmx check, or unable to determine if it's so

      Python Check Loader:
        could not configure check instance for python check tcp_check: could not invoke 'tcp_check' python check constructor. New constructor API returned:
Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 73, in __init__
    self.resolve_ip()
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 79, in resolve_ip
    self.addr = socket.gethostbyname(self.url)
socket.gaierror: [Errno -3] Temporary failure in name resolution

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/tcp_check/tcp_check.py", line 76, in __init__
    raise ConfigurationError(msg)
datadog_checks.base.errors.ConfigurationError: URL: <<censored>>.com is not a correct IPv4, IPv6 or hostname
Deprecated constructor API returned:
__init__() got an unexpected keyword argument 'agentConfig'

Check Config:

init_config:

instances:

  - name: <<censored>>
    host: <<censored>>.com
    port: 443
    collect_response_time: true
    # 5 min
    min_collection_interval: 300

  - name: <<censored>>
    host: <<censored>>.com
    port: 443
    collect_response_time: true
    # 5 min
    min_collection_interval: 300

  - name: <<censored>>
    host: <<censored>>.com
    port: 80
    collect_response_time: true
    # 5 min
    min_collection_interval: 300

  - name: <<censored>>
    host: <<censored>>.com
    port: 80
    collect_response_time: true
    # 5 min
    min_collection_interval: 300

  - name: <<censored>>
    host: <<censored>>.com
    port: 443
    collect_response_time: true
    # 5 min
    min_collection_interval: 300

Describe what happened: The TCP check fails during initialization and never runs (running datadog-agent check tcp_check manually works).

Describe what you expected: I would expect the failure to be reported (it isn't; no checks are being published).

Steps to reproduce the issue: I'm not sure how the host got into this state.

Additional environment details (Operating System, Cloud provider, etc): Debian 10, bare-metal IoT device
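
To illustrate the behavior I'd expect, here is a minimal sketch of deferring name resolution from the constructor to check time, so that a temporary DNS failure becomes a reported status instead of an initialization error. This is not the actual tcp_check code: the class name LazyTcpTarget, its resolver parameter, and the (status, message) return value are all hypothetical, and the sketch only covers name resolution, not the TCP connection itself.

```python
import socket


class LazyTcpTarget:
    """Hypothetical helper: resolves the host at check time, not in __init__."""

    def __init__(self, host, port, resolver=socket.gethostbyname):
        self.host = host
        self.port = port
        # The resolver is injectable (an assumption of this sketch) so the
        # behavior can be exercised without a working network.
        self._resolver = resolver
        self.addr = None  # filled in lazily by resolve()

    def resolve(self):
        """Return True if the host resolved, False on a DNS failure."""
        try:
            self.addr = self._resolver(self.host)
            return True
        except socket.gaierror:
            # e.g. "[Errno -3] Temporary failure in name resolution"
            self.addr = None
            return False

    def check(self):
        """Return a (status, message) pair instead of raising."""
        if self.addr is None and not self.resolve():
            return ("CRITICAL", "DNS resolution failed for %s" % self.host)
        return ("OK", "resolved %s to %s" % (self.host, self.addr))
```

With this shape, constructing the object never touches DNS, and each run retries resolution until it succeeds, so a host that boots without a working resolver would recover on a later run rather than staying in the failed state shown above.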

smarek commented 1 year ago

This just hit me now on bare-metal Debian with Minikube on Docker. Do you know of a solution to this?