open-telemetry / opentelemetry-network

eBPF Collector
https://opentelemetry.io
Apache License 2.0

can't find libtinfo.so.6 when install kernel-collector #265

Open evanzhang87 opened 7 months ago

evanzhang87 commented 7 months ago

What happened?

Description

Installing the RPM package fails with the following error:

libtinfo.so.6(NCURSES6_TINFO_5.0.19991023)(64bit) is needed by opentelemetry-ebpf-kernel-collector-0.10.2-1.x86_64

Steps to Reproduce

rpm -ivh opentelemetry-ebpf-kernel-collector-0.10.2-1.x86_64.rpm

ll /usr/lib64  | grep libtinfo.so
lrwxrwxrwx.  1 root root       15 Feb 10  2022 libtinfo.so.6 -> libtinfo.so.6.2
-rwxr-xr-x.  1 root root   191616 Feb 10  2022 libtinfo.so.6.2

Expected Result

How can I install the RPM package?

Actual Result

eBPF Collector version

0.10.2-1.x86_64

Environment information

Environment

Almalinux 5.14.0-284.30.1.el9_2.x86_64

eBPF Collector configuration

No response

Log output

error: Failed dependencies:
    libtinfo.so.6(NCURSES6_TINFO_5.0.19991023)(64bit) is needed by opentelemetry-ebpf-kernel-collector-0.10.2-1.x86_64

Additional context

No response

yonch commented 7 months ago

Hi @evanzhang87. This should ideally work out of the box. Can you verify that your LD_LIBRARY_PATH contains the directory that holds the shared library? Or add it with:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/lib64

(NB: the command above might contain typos; I have not tested it.)
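
A minimal sketch of the check suggested above, assuming a POSIX shell. The helper name `path_contains` is hypothetical and only illustrates testing whether a directory is one of the colon-separated entries of a path-style variable. Note that `LD_LIBRARY_PATH` is read by the dynamic loader when a program starts, so this check may not bear on a dependency error that `rpm` reports at install time.

```shell
# path_contains LIST DIR: succeed if DIR is an entry of the colon-separated LIST.
# (Hypothetical helper, not part of any package.)
path_contains() {
  case ":$1:" in
    *":$2:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if path_contains "${LD_LIBRARY_PATH:-}" /usr/lib64; then
  echo "/usr/lib64 is already in LD_LIBRARY_PATH"
else
  echo "/usr/lib64 is not in LD_LIBRARY_PATH"
fi
```

Wrapping the list in colons before matching avoids false positives on entries that merely start with the same prefix, such as `/usr/lib64extra`.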

evanzhang87 commented 7 months ago

It doesn't work. 😩

env | grep LD_LIBRARY_PATH
LD_LIBRARY_PATH=/usr/lib64
rpm -i opentelemetry-ebpf-kernel-collector-0.10.2-1.x86_64.rpm
error: Failed dependencies:
    libtinfo.so.6(NCURSES6_TINFO_5.0.19991023)(64bit) is needed by opentelemetry-ebpf-kernel-collector-0.10.2-1.x86_64
ldd -v /usr/lib64/libtinfo.so.6
    linux-vdso.so.1 (0x00007ffc655f1000)
    libc.so.6 => /usr/lib64/libc.so.6 (0x00007f036ca00000)
    /lib64/ld-linux-x86-64.so.2 (0x00007f036cc78000)

    Version information:
    /usr/lib64/libtinfo.so.6:
        libc.so.6 (GLIBC_2.3) => /usr/lib64/libc.so.6
        libc.so.6 (GLIBC_2.14) => /usr/lib64/libc.so.6
        libc.so.6 (GLIBC_2.33) => /usr/lib64/libc.so.6
        libc.so.6 (GLIBC_2.16) => /usr/lib64/libc.so.6
        libc.so.6 (GLIBC_2.4) => /usr/lib64/libc.so.6
        libc.so.6 (GLIBC_2.3.4) => /usr/lib64/libc.so.6
        libc.so.6 (GLIBC_2.2.5) => /usr/lib64/libc.so.6
    /usr/lib64/libc.so.6:
        ld-linux-x86-64.so.2 (GLIBC_2.2.5) => /lib64/ld-linux-x86-64.so.2
        ld-linux-x86-64.so.2 (GLIBC_2.3) => /lib64/ld-linux-x86-64.so.2
        ld-linux-x86-64.so.2 (GLIBC_PRIVATE) => /lib64/ld-linux-x86-64.so.2

Maybe my lib has some problems?
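
One way to narrow this down, sketched on the assumption that the failure comes from RPM's own dependency resolution: `rpm` checks a package's requirements against the capabilities recorded in its package database, so it can help to ask the database directly whether any installed package provides the missing capability. The helper name `extract_missing_caps` is hypothetical; `rpm -q --whatprovides` is a standard query option.

```shell
# extract_missing_caps: read rpm's "Failed dependencies" output on stdin and
# print one missing capability per line. (Hypothetical helper.)
extract_missing_caps() {
  sed -n 's/^[[:space:]]*\(.*\) is needed by .*$/\1/p'
}

# Error text copied from the output above.
rpm_error='error: Failed dependencies:
    libtinfo.so.6(NCURSES6_TINFO_5.0.19991023)(64bit) is needed by opentelemetry-ebpf-kernel-collector-0.10.2-1.x86_64'

printf '%s\n' "$rpm_error" | extract_missing_caps | while IFS= read -r cap; do
  echo "missing capability: $cap"
  # Ask the RPM database which installed package, if any, provides it.
  if command -v rpm >/dev/null 2>&1; then
    rpm -q --whatprovides "$cap" || echo "no installed package provides $cap"
  fi
done
```

If no installed package lists the capability, the library file can be present on disk and the install can still fail, because the requirement names a versioned capability rather than just the file name.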

yonch commented 7 months ago

I'm relying on some web searches, so this might be completely wrong, but this might be caused by your libtinfo not containing NCURSES6 version info.

From this comment:

Most Linux distributions have standardized on providing libtinfo.so.6 (either directly or as a symlink to libncursesw.so.6).

If you don't have libncurses, maybe installing it would help (sudo apt-get install libncurses6). If it's installed, symlinking libtinfo as in the comment above might help (you probably want to make the change reversible as you test).

Another option is to build the RPM for AlmaLinux specifically, happy to review a PR for that.
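
The guess about missing NCURSES6 version info can be checked directly. The sketch below assumes binutils' `readelf` is installed; the helper name `has_symbol_version` is hypothetical. It searches the ELF version sections of the library for the version tag named in the RPM error.

```shell
# has_symbol_version LIB TAG: succeed if LIB's ELF version sections mention TAG.
# Fails if readelf is unavailable or LIB cannot be read. (Hypothetical helper.)
has_symbol_version() {
  command -v readelf >/dev/null 2>&1 || return 2
  [ -r "$1" ] || return 2
  readelf --version-info "$1" 2>/dev/null | grep -q "$2"
}

lib=/usr/lib64/libtinfo.so.6   # path taken from the listing earlier in the thread
if has_symbol_version "$lib" NCURSES6_TINFO_5.0.19991023; then
  echo "$lib mentions NCURSES6_TINFO_5.0.19991023 in its version sections"
else
  echo "$lib does not appear to carry NCURSES6_TINFO_5.0.19991023 (or could not be inspected)"
fi
```

A negative result would be consistent with the library having been built without that symbol version, in which case a symlink alone would be unlikely to satisfy the RPM requirement.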

evanzhang87 commented 7 months ago

I tried running it with Docker instead, and it reports a new error.

CONTAINER ID  IMAGE                                              COMMAND               CREATED        STATUS                      PORTS       NAMES
98d2632cbe34  docker.io/otel/opentelemetry-ebpf-reducer:v0.10.2  --port 8000 --pro...  6 seconds ago  Exited (139) 5 seconds ago              reducer
[root@iZuf68ovynt5bsqnddfe9wZ ~]# docker logs 98d2632cbe34
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
+ [[ ! -e ./debug-info.conf ]]
+ install_dir=/srv
+ reducer=/srv/opentelemetry-ebpf-reducer
+ data_dir=/var/run/ebpf_net
+ dump_dir=/var/run/ebpf_net/dump
+ mkdir -p /var/run/ebpf_net /var/run/ebpf_net/dump
+ '[' -n '' ']'
+ '[' -n '' ']'
+ exec /srv/opentelemetry-ebpf-reducer --port 8000 --prom 0.0.0.0:7010
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
[root@iZuf68ovynt5bsqnddfe9wZ ~]# uname -r
5.14.0-70.17.1.el9_0.x86_64
[root@iZuf68ovynt5bsqnddfe9wZ ~]# cat /proc/version
Linux version 5.14.0-70.17.1.el9_0.x86_64 (mockbuild@6ca68ce3fdd54b118cd3e494113faf16) (gcc (GCC) 11.2.1 20220127 (Red Hat 11.2.1-9), GNU ld version 2.35.2-17.el9) #1 SMP PREEMPT Tue Jun 28 14:55:40 EDT 2022

yonch commented 6 months ago

Looks like memory allocation failed. Are you running on a system with high memory pressure?
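
A quick way to answer the memory question, sketched on the assumption that `free` comes from procps-ng and prints an "available" column (older versions may not). The helper name `available_mib` is hypothetical.

```shell
# available_mib: read `free -m` output on stdin and print the "available"
# column of the Mem: row, in MiB. (Hypothetical helper; assumes the
# procps-ng column layout: total used free shared buff/cache available.)
available_mib() {
  awk '$1 == "Mem:" { print $7 }'
}

if command -v free >/dev/null 2>&1; then
  echo "available memory (MiB): $(free -m | available_mib)"
fi
```

The "available" figure is usually a better indicator of memory pressure than "free", since it accounts for cache that the kernel can reclaim.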

yonch commented 6 months ago

Also, just a note: it seems libtinfo loaded successfully when running from the container. It would have failed earlier if there were a problem with libtinfo.

And another question: it seems like you ran the Docker image of the reducer, not the kernel collector? I'm asking because the original issue was with the kernel collector...

evanzhang87 commented 6 months ago

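Also, note that libtinfo seems to have loaded successfully when run from the container; if there were a problem with libtinfo, it would have failed earlier.

One more question: it looks like you ran the reducer's Docker image, not the kernel collector? I'm asking because the original issue was with the kernel collector...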

The kernel collector works well with Docker, and this is my memory usage:

[root@iZuf68ovynt5bsqnddfe9wZ ~]# docker ps -a
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
CONTAINER ID  IMAGE                                                       COMMAND               CREATED        STATUS                           PORTS       NAMES
98d2632cbe34  docker.io/otel/opentelemetry-ebpf-reducer:v0.10.2           --port 8000 --pro...  2 weeks ago    Exited (139) About a minute ago              reducer
c27521ffed92  docker.io/otel/opentelemetry-ebpf-kernel-collector:v0.10.2                        7 seconds ago  Up 7 seconds                                 kernel-collector
[root@iZuf68ovynt5bsqnddfe9wZ ~]# free -m
               total        used        free      shared  buff/cache   available
Mem:            1774         491         453           2         830        1110
Swap:              0           0           0
[root@iZuf68ovynt5bsqnddfe9wZ ~]# docker logs reducer
Emulate Docker CLI using podman. Create /etc/containers/nodocker to quiet msg.
+ [[ ! -e ./debug-info.conf ]]
+ install_dir=/srv
+ reducer=/srv/opentelemetry-ebpf-reducer
+ data_dir=/var/run/ebpf_net
+ dump_dir=/var/run/ebpf_net/dump
+ mkdir -p /var/run/ebpf_net /var/run/ebpf_net/dump
+ '[' -n '' ']'
+ '[' -n '' ']'
+ exec /srv/opentelemetry-ebpf-reducer --port 8000 --prom 0.0.0.0:7010
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
+ [[ ! -e ./debug-info.conf ]]
+ install_dir=/srv
+ reducer=/srv/opentelemetry-ebpf-reducer
+ data_dir=/var/run/ebpf_net
+ dump_dir=/var/run/ebpf_net/dump
+ mkdir -p /var/run/ebpf_net /var/run/ebpf_net/dump
+ '[' -n '' ']'
+ '[' -n '' ']'
+ exec /srv/opentelemetry-ebpf-reducer --port 8000 --prom 0.0.0.0:7010
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
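
The "Exited (139)" status in the container listing can be decoded with the common convention that a process killed by signal N is reported with exit status 128+N. This is a sketch under that assumption; whether the container runtime follows the convention here is not confirmed by the thread, and the helper name `signal_from_status` is hypothetical.

```shell
# signal_from_status STATUS: print the signal number if STATUS follows the
# 128+N convention; otherwise print nothing and fail. (Hypothetical helper.)
signal_from_status() {
  [ "$1" -gt 128 ] 2>/dev/null || return 1
  echo $(( $1 - 128 ))
}

status=139   # from the "Exited (139)" column in the listing above
if sig=$(signal_from_status "$status"); then
  echo "exit status $status corresponds to signal $sig"
fi
```

On Linux, signal 11 is SIGSEGV, so the status may point to a crash during or after the failed allocation rather than a clean exit.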