DD_AGENT_VERSION="7.21.1" Has slow memory leak

Output of the info page (if this is a bug)

See below

Describe what happened:

The agents are slowly using up memory until they key killed by kubernetes for exceeding its resources:

    State:          Running
      Started:      Thu, 20 Aug 2020 18:13:50 +0100
    Last State:     Terminated
      Reason:       OOMKilled <<<<<<<
      Exit Code:    0

Screenshot 2020-08-24 at 13 24 51

Describe what you expected:

Memory usage should return to normal after bursty events.

Steps to reproduce the issue:

Not 100% sure other than just running it.

Additional environment details (Operating System, Cloud provider, etc):

Running as a kubernetes daemonset on GCP. We are also using the jmx enabled image.

root@datadog-agent-tnz8z:/# agent status
Getting the status from the agent.

===============
Agent (v7.21.1)
===============

  Status date: 2020-08-24 12:33:10.118895 UTC
  Agent start: 2020-08-12 17:28:27.899667 UTC
  Pid: 378
  Go Version: go1.13.11
  Python Version: 3.8.1
  Build arch: amd64
  Agent flavor: agent
  Check Runners: 6
  Log Level: info

  Paths
  =====
    Config File: /etc/datadog-agent/datadog.yaml
    conf.d: /etc/datadog-agent/conf.d
    checks.d: /etc/datadog-agent/checks.d

  Clocks
  ======
    NTP offset: -717µs
    System UTC time: 2020-08-24 12:33:10.118895 UTC

  Host Info
  =========
    bootTime: 2020-08-12 17:27:22.000000 UTC
    kernelArch: x86_64
    kernelVersion: 4.19.112+
    os: linux
    platform: debian
    platformFamily: debian
    platformVersion: bullseye/sid
    procs: 216
    uptime: 1m9s
    virtualizationRole: guest

  Hostnames
  =========
    host_aliases: [gke-prestaging-orch-000-orc-pre-m-cuv-f1706f3a-k82s.ne-prestaging-w80j gke-prestaging-orch-000-orc-pre-m-cuv-f1706f3a-k82s-prestaging-orch-0001]
    hostname: gke-prestaging-orch-000-orc-pre-m-cuv-f1706f3a-k82s.c.ne-prestaging-w80j.internal
    socket-fqdn: datadog-agent-tnz8z
    socket-hostname: datadog-agent-tnz8z
    host tags:
      environment:prestaging
      cluster-name:prestaging-orch-0001
      orchestra:prestaging-orch-0001
      kube_cluster_name:prestaging-orch-0001
      cluster_name:prestaging-orch-0001
      zone:europe-west1-b
      internal-hostname:gke-prestaging-orch-000-orc-pre-m-cuv-f1706f3a-k82s.c.ne-prestaging-w80j.internal
      instance-id:5432819570720741505
      project:ne-prestaging-w80j
      numeric_project_id:358219109911
      cluster-location:europe-west1
      cluster-name:prestaging-orch-0001
      cluster-uid:b2e753a5ae64fa17b6cd4f70ed9ac8ecdde08527bc7f5792142c408edcbf93d3
    hostname provider: gce
    unused hostname providers:
      configuration/environment: hostname is empty

  Metadata
  ========
    cloud_provider: GCP
    hostname_source: gce

=========
Collector
=========

  Running Checks
  ==============

    cpu
    ---
      Instance ID: cpu [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/cpu.d/conf.yaml.default
      Total Runs: 67,939
      Metric Samples: Last Run: 6, Total: 407,628
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2020-08-24 12:33:07.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:33:07.000000 UTC

    disk (2.10.1)
    -------------
      Instance ID: disk:e5dffb8bef24336f [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/disk.d/conf.yaml.default
      Total Runs: 67,938
      Metric Samples: Last Run: 218, Total: 14,818,116
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 33ms
      Last Execution Date : 2020-08-24 12:32:59.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:32:59.000000 UTC

    docker
    ------
      Instance ID: docker [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/docker.d/conf.yaml.default
      Total Runs: 67,938
      Metric Samples: Last Run: 940, Total: 61,552,946
      Events: Last Run: 0, Total: 5,650
      Service Checks: Last Run: 1, Total: 67,938
      Average Execution Time : 144ms
      Last Execution Date : 2020-08-24 12:33:06.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:33:06.000000 UTC

    file_handle
    -----------
      Instance ID: file_handle [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/file_handle.d/conf.yaml.default
      Total Runs: 67,938
      Metric Samples: Last Run: 5, Total: 339,690
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2020-08-24 12:32:58.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:32:58.000000 UTC

    haproxy (2.10.0)
    ----------------
      Instance ID: haproxy:dcec29f86281aa1c [OK]
      Configuration Source: kubelet:docker://927ed1a5c0419ad12dc0b354854b17cddc1ceced21e49832b91e59acdb2a86fb
      Total Runs: 22,430
      Metric Samples: Last Run: 362, Total: 8,110,754
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 94ms
      Last Execution Date : 2020-08-24 12:32:57.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:32:57.000000 UTC

    io
    --
      Instance ID: io [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/io.d/conf.yaml.default
      Total Runs: 67,938
      Metric Samples: Last Run: 208, Total: 14,176,696
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2020-08-24 12:33:05.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:33:05.000000 UTC

    kube_dns (2.4.1)
    ----------------
      Instance ID: kube_dns:9e2acb32d30599df [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/kube_dns.d/auto_conf.yaml
      Total Runs: 57,618
      Metric Samples: Last Run: 84, Total: 4,816,192
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 13ms
      Last Execution Date : 2020-08-24 12:33:06.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:33:06.000000 UTC

    kubelet (4.1.1)
    ---------------
      Instance ID: kubelet:d884b5186b651429 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/kubelet.d/conf.yaml.default
      Total Runs: 67,938
      Metric Samples: Last Run: 1,117, Total: 73,145,128
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 4, Total: 271,752
      Average Execution Time : 545ms
      Last Execution Date : 2020-08-24 12:32:58.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:32:58.000000 UTC

    kubernetes_apiserver
    --------------------
      Instance ID: kubernetes_apiserver [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/kubernetes_apiserver.d/conf.yaml.default
      Total Runs: 67,938
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2020-08-24 12:33:04.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:33:04.000000 UTC

    load
    ----
      Instance ID: load [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/load.d/conf.yaml.default
      Total Runs: 67,938
      Metric Samples: Last Run: 6, Total: 407,628
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2020-08-24 12:32:56.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:32:56.000000 UTC

    memory
    ------
      Instance ID: memory [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/memory.d/conf.yaml.default
      Total Runs: 67,938
      Metric Samples: Last Run: 17, Total: 1,154,946
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2020-08-24 12:33:03.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:33:03.000000 UTC

    neo4j (0.0.1)
    -------------
      Instance ID: neo4j:68addb1f3df3b3d5 [OK]
      Configuration Source: kubelet:docker://59afd97febdb96ff8e94afdedbb23a408a5b845dbd6bb2bb6dc6cf4dbdf32aa9
      Total Runs: 666
      Metric Samples: Last Run: 387, Total: 257,742
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 85ms
      Last Execution Date : 2020-08-24 12:33:03.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:33:03.000000 UTC

      Instance ID: neo4j:9ee0fcd48bf6796a [OK]
      Configuration Source: kubelet:docker://45ce9e2b10266d031f1a5a612547531b69b1a18782a10da883ca0bc60da6f613
      Total Runs: 14
      Metric Samples: Last Run: 387, Total: 5,418
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 96ms
      Last Execution Date : 2020-08-24 12:33:08.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:33:08.000000 UTC

    network (1.17.0)
    ----------------
      Instance ID: network:5c571333f400457d [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/network.d/conf.yaml.default
      Total Runs: 67,938
      Metric Samples: Last Run: 31, Total: 2,106,078
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 2ms
      Last Execution Date : 2020-08-24 12:32:55.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:32:55.000000 UTC

    ntp
    ---
      Instance ID: ntp:d884b5186b651429 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/ntp.d/conf.yaml.default
      Total Runs: 1,133
      Metric Samples: Last Run: 1, Total: 1,133
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 1,133
      Average Execution Time : 491ms
      Last Execution Date : 2020-08-24 12:28:37.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:28:37.000000 UTC

    prometheus (3.3.0)
    ------------------
      Instance ID: prometheus:neo4joperator:82eb03b232f67a9 [OK]
      Configuration Source: kubelet:docker://0baab6bf231c9e6e9aa557a018bde4a3b5e64ee2a2c515fd4e6efd62896dfd7e
      Total Runs: 253
      Metric Samples: Last Run: 126, Total: 31,878
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 253
      Average Execution Time : 19ms
      Last Execution Date : 2020-08-24 12:33:09.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:33:09.000000 UTC

    uptime
    ------
      Instance ID: uptime [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/uptime.d/conf.yaml.default
      Total Runs: 67,938
      Metric Samples: Last Run: 1, Total: 67,938
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2020-08-24 12:33:02.000000 UTC
      Last Successful Execution Date : 2020-08-24 12:33:02.000000 UTC

  Loading Errors
  ==============
    neo4j_enterprise
    ----------------
      Core Check Loader:
        Check neo4j_enterprise not found in Catalog

      JMX Check Loader:
        check is not a jmx check, or unable to determine if it's so

      Python Check Loader:
        unable to import module 'neo4j_enterprise': No module named 'neo4j_enterprise'

========
JMXFetch
========

  Initialized checks
  ==================
    jmx
      instance_name : jmx-10.8.2.133-3637
      message : <no value>
      metric_count : 27
      service_check_count : 0
      status : OK
      instance_name : jmx-10.8.2.183-3637
      message : <no value>
      metric_count : 27
      service_check_count : 0
      status : OK
  Failed checks
  =============
    no checks

=========
Forwarder
=========

  Transactions
  ============
    CheckRunsV1: 67,938
    Connections: 0
    Containers: 0
    Dropped: 0
    DroppedOnInput: 0
    Events: 0
    HostMetadata: 0
    IntakeV1: 8,591
    Metadata: 0
    Pods: 0
    Processes: 0
    RTContainers: 0
    RTProcesses: 0
    Requeued: 3
    Retried: 3
    RetryQueueSize: 0
    Series: 0
    ServiceChecks: 0
    SketchSeries: 0
    Success: 144,467
    TimeseriesV1: 67,938

  Transaction Errors
  ==================
    Total number: 3
    Errors By Type:

  HTTP Errors
  ==================
    Total number: 3
    HTTP Errors By Code:
      500: 3

  API Keys status
  ===============
    API key ending with 6fd1d: API Key valid

==========
Endpoints
==========
  https://app.datadoghq.com - API Key ending with:
      - 6fd1d

==========
Logs Agent
==========

  Logs Agent is not running

=========
APM Agent
=========
  Status: Running
  Pid: 382
  Uptime: 1.019082e+06 seconds
  Mem alloc: 16,785,536 bytes
  Hostname: gke-prestaging-orch-000-orc-pre-m-cuv-f1706f3a-k82s.c.ne-prestaging-w80j.internal
  Receiver: 0.0.0.0:8126
  Endpoints:
    https://trace.agent.datadoghq.com

  Receiver (previous minute)
  ==========================
    No traces received in the previous minute.
    Default priority sampling rate: 100.0%

  Writer (previous minute)
  ========================
    Traces: 0 payloads, 0 traces, 0 events, 0 bytes
    Stats: 0 payloads, 0 stats buckets, 0 bytes

=========
Aggregator
=========
  Checks Metric Sample: 241,872,305
  Dogstatsd Metric Sample: 14,466,832
  Event: 5,651
  Events Flushed: 5,651
  Number Of Flushes: 67,938
  Series Flushed: 231,732,595
  Service Check: 1,564,093
  Service Checks Flushed: 1,632,015

=========
DogStatsD
=========
  Event Packets: 0
  Event Parse Errors: 0
  Metric Packets: 14,466,831
  Metric Parse Errors: 0
  Service Check Packets: 132,952
  Service Check Parse Errors: 0
  Udp Bytes: 5,120,405,215
  Udp Packet Reading Errors: 0
  Udp Packets: 4,600,796
  Uds Bytes: 0
  Uds Origin Detection Errors: 0
  Uds Packet Reading Errors: 0
  Uds Packets: 0

root@datadog-agent-tnz8z:/#

DataDog / datadog-agent

DD_AGENT_VERSION="7.21.1" Has slow memory leak #6270