DataDog / datadog-agent

Main repository for Datadog Agent
https://docs.datadoghq.com/
Apache License 2.0

error while loading shared libraries: libbcc.so.0 #7268

Open strowk opened 3 years ago

strowk commented 3 years ago

Output of the info page (if this is a bug)

Getting the status from the agent.

===============
Agent (v7.25.0)
===============

  Status date: 2021-01-25 17:49:32.105537 UTC
  Agent start: 2021-01-25 17:32:41.040343 UTC
  Pid: 391
  Go Version: go1.14.12
  Python Version: 3.8.5
  Build arch: amd64
  Agent flavor: agent
  Check Runners: 5
  Log Level: info

  Paths
  =====
    Config File: /etc/datadog-agent/datadog.yaml
    conf.d: /etc/datadog-agent/conf.d
    checks.d: /etc/datadog-agent/checks.d

  Clocks
  ======
    System UTC time: 2021-01-25 17:49:32.105537 UTC

  Host Info
  =========
    bootTime: 2021-01-25 09:46:49.000000 UTC
    kernelArch: x86_64
    kernelVersion: 3.10.0-1127.19.1.el7.x86_64
    os: linux
    platform: debian
    platformFamily: debian
    platformVersion: bullseye/sid
    procs: 14
    uptime: 7h46m3s
    virtualizationRole: guest
    virtualizationSystem: docker

  Hostnames
  =========
    ec2-hostname: lp00osp07c002.bmwgroup.net
    hostname: lp00osp07c002
    instance-id: i-00000ac2
    socket-fqdn: tsr-icc-70-hjxw6
    socket-hostname: tsr-icc-70-hjxw6
    hostname provider: configuration

  Metadata
  ========
    hostname_source: configuration

=========
Collector
=========

  Running Checks
  ==============

    cpu
    ---
      Instance ID: cpu [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/cpu.d/conf.yaml.default
      Total Runs: 67
      Metric Samples: Last Run: 8, Total: 530
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2021-01-25 17:49:21.000000 UTC
      Last Successful Execution Date : 2021-01-25 17:49:21.000000 UTC

    disk (4.0.0)
    ------------
      Instance ID: disk:e5dffb8bef24336f [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/disk.d/conf.yaml.default
      Total Runs: 67
      Metric Samples: Last Run: 592, Total: 39,462
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 59ms
      Last Execution Date : 2021-01-25 17:49:28.000000 UTC
      Last Successful Execution Date : 2021-01-25 17:49:28.000000 UTC

    file_handle
    -----------
      Instance ID: file_handle [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/file_handle.d/conf.yaml.default
      Total Runs: 66
      Metric Samples: Last Run: 5, Total: 330
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2021-01-25 17:49:20.000000 UTC
      Last Successful Execution Date : 2021-01-25 17:49:20.000000 UTC

    io
    --
      Instance ID: io [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/io.d/conf.yaml.default
      Total Runs: 67
      Metric Samples: Last Run: 2,392, Total: 157,255
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 14ms
      Last Execution Date : 2021-01-25 17:49:27.000000 UTC
      Last Successful Execution Date : 2021-01-25 17:49:27.000000 UTC

    load
    ----
      Instance ID: load [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/load.d/conf.yaml.default
      Total Runs: 66
      Metric Samples: Last Run: 6, Total: 396
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2021-01-25 17:49:19.000000 UTC
      Last Successful Execution Date : 2021-01-25 17:49:19.000000 UTC

    memory
    ------
      Instance ID: memory [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/memory.d/conf.yaml.default
      Total Runs: 67
      Metric Samples: Last Run: 18, Total: 1,206
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2021-01-25 17:49:26.000000 UTC
      Last Successful Execution Date : 2021-01-25 17:49:26.000000 UTC

    ntp
    ---
      Instance ID: ntp:d884b5186b651429 [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/ntp.d/conf.yaml.default
      Total Runs: 2
      Metric Samples: Last Run: 0, Total: 0
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 1, Total: 2
      Average Execution Time : 5.005s
      Last Execution Date : 2021-01-25 17:47:56.000000 UTC
      Last Successful Execution Date : 2021-01-25 17:47:56.000000 UTC

    uptime
    ------
      Instance ID: uptime [OK]
      Configuration Source: file:/etc/datadog-agent/conf.d/uptime.d/conf.yaml.default
      Total Runs: 66
      Metric Samples: Last Run: 1, Total: 66
      Events: Last Run: 0, Total: 0
      Service Checks: Last Run: 0, Total: 0
      Average Execution Time : 0s
      Last Execution Date : 2021-01-25 17:49:18.000000 UTC
      Last Successful Execution Date : 2021-01-25 17:49:18.000000 UTC

========
JMXFetch
========

  Information
  ==================
    runtime_version : 11.0.10-ea
    version : 0.41.0
  Initialized checks
  ==================
    jmx
      instance_name : jmx-127.0.0.1-9010
      message : <no value>
      metric_count : 191
      service_check_count : 0
      status : OK
  Failed checks
  =============
    no checks

=========
Forwarder
=========

  Transactions
  ============
    Deployments: 0
    Dropped: 0
    DroppedOnInput: 0
    Nodes: 0
    Pods: 0
    ReplicaSets: 0
    Requeued: 0
    Retried: 0
    RetryQueueSize: 0
    Services: 0

  Transaction Successes
  =====================
    Total number: 142
    Successes By Endpoint:
      check_run_v1: 67
      intake: 8
      series_v1: 67

  API Keys status
  ===============
    API key ending with 965bd: API Key valid

==========
Endpoints
==========
  https://app.datadoghq.eu - API Key ending with:
      - 965bd

==========
Logs Agent
==========

  Logs Agent is not running

=========
APM Agent
=========
  Status: Running
  Pid: 395
  Uptime: 1012 seconds
  Mem alloc: 39,399,720 bytes
  Hostname: lp00osp07c002
  Receiver: localhost:8126
  Endpoints:
    https://trace.agent.datadoghq.eu

  Receiver (previous minute)
  ==========================
    No traces received in the previous minute.
    Default priority sampling rate: 100.0%

  Writer (previous minute)
  ========================
    Traces: 0 payloads, 0 traces, 0 events, 0 bytes
    Stats: 0 payloads, 0 stats buckets, 0 bytes

=========
Aggregator
=========
  Checks Metric Sample: 200,181
  Dogstatsd Metric Sample: 121,674
  Event: 1
  Events Flushed: 1
  Number Of Flushes: 67
  Series Flushed: 239,824
  Service Check: 537
  Service Checks Flushed: 600

=========
DogStatsD
=========
  Event Packets: 0
  Event Parse Errors: 0
  Metric Packets: 121,673
  Metric Parse Errors: 99
  Service Check Packets: 67
  Service Check Parse Errors: 0
  Udp Bytes: 24,010,636
  Udp Packet Reading Errors: 0
  Udp Packets: 31,687
  Uds Bytes: 0
  Uds Origin Detection Errors: 0
  Uds Packet Reading Errors: 0
  Uds Packets: 0

Describe what happened:

We are using the agent Docker image datadog/agent:7-jmx as a sidecar in an OpenShift installation.

It constantly logs the following messages:

starting system-probe
system-probe: error while loading shared libraries: libbcc.so.0: cannot open shared object file: No such file or directory
system-probe exited with code 127, signal 0, restarting in 2 seconds
starting system-probe

Describe what you expected: No error logs

Steps to reproduce the issue: Run the agent from the Docker image datadog/agent:7-jmx on OpenShift

Additional environment details (Operating System, Cloud provider, etc):

L3n41c commented 3 years ago

libbcc.so.0 is supposed to be shipped in the image at /opt/datadog-agent/embedded/lib/libbcc.so.0. The path to look for libraries is hard-coded as a RUNPATH in the system-probe binary.

Could you please run the following diagnostics:

Start a container (I took the same version as you):

docker run -ti --rm gcr.io/datadoghq/agent:7.25.0-jmx /bin/bash

Install readelf inside this container:

apt update && apt install -y binutils

Check the dynamic section of the system-probe binary:

readelf -d $(which system-probe)

The output should begin with:

Dynamic section at offset 0x3c77dd0 contains 27 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libbcc.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000001d (RUNPATH)            Library runpath: [/opt/datadog-agent/embedded/lib]

Then, check that libbcc.so.0 can be found at the expected location:

ls /opt/datadog-agent/embedded/lib/libbcc.so.0
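
The readelf step above can be scripted so that the RUNPATH and NEEDED entries are pulled out without reading the dump by eye. This is only a sketch: the helper names runpath_dirs and needed_libs are hypothetical, and the sed patterns assume the readelf -d line layout shown in the sample output above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the agent image.

# Print the RUNPATH directories found in `readelf -d` output on stdin,
# one per line (RUNPATH is a colon-separated list).
runpath_dirs() {
    sed -n 's/.*(RUNPATH).*\[\(.*\)].*/\1/p' | tr ':' '\n'
}

# Print the NEEDED shared library names found in `readelf -d` output on stdin.
needed_libs() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}
```

Inside the container, `readelf -d "$(command -v system-probe)" | runpath_dirs` should then print the directory the loader searches, and piping the same output to needed_libs lists the libraries that must be found there.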

oe-hbk commented 3 years ago

I ran into the same issue. The cause was that /opt/datadog-agent/embedded/lib/libbcc.so.128-UNKNOWN (the file that the libbcc.so.0 symlink ultimately resolves to) was readable only by root, and my container was running as a non-root user.
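
A plain ls of the symlink does not reveal this case, because the link itself can be listed while its target is unreadable. The sketch below resolves the link and tests read access as the current user. The function name check_readable is hypothetical, and readlink -f is assumed to be available (it is in GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper: report whether the current user can read a library,
# following symlinks to the final target.
check_readable() {
    lib=$1
    # -e follows symlinks, so a dangling link is reported as missing.
    if [ ! -e "$lib" ]; then
        echo "missing: $lib"
        return 1
    fi
    target=$(readlink -f "$lib")
    if [ -r "$lib" ]; then
        echo "readable: $lib -> $target"
    else
        # Print the numeric UID: arbitrary OpenShift UIDs may have no user name.
        echo "not readable by uid $(id -u): $lib -> $target"
        return 1
    fi
}
```

Run as the same user the container uses, for example `check_readable /opt/datadog-agent/embedded/lib/libbcc.so.0`. Running it as root would hide the problem, since root can read the file regardless of its mode.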

dwimsey commented 2 years ago

Is there any reason that these permissions haven't been fixed? All it needs is a chmod a+r /opt/datadog-agent/embedded/lib/* - the permissions in that directory don't really make any sense.

I'm going to need to wrap your image just to fix this problem so that it works in my environment, which seems silly considering this report is a year old.
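
Before applying a blanket chmod in a wrapper image, it may help to see which files are actually affected. A minimal sketch, with a hypothetical function name, that lists regular files in a directory lacking the read bit for "other" users:

```shell
#!/bin/sh
# Hypothetical helper: list regular files directly under directory $1
# whose mode does not include read permission for "other" (octal 004).
list_unreadable_by_others() {
    find "$1" -maxdepth 1 -type f ! -perm -004
}
```

For example, `list_unreadable_by_others /opt/datadog-agent/embedded/lib` would print the files that the chmod a+r suggested above would change. The check looks only at mode bits, so it gives the same answer when run as root.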