open-telemetry / opentelemetry-ebpf-profiler

The production-scale datacenter profiler (C/C++, Go, Rust, Python, Java, NodeJS, .NET, PHP, Ruby, Perl, ...)
Apache License 2.0

devfiler not getting data? `Request failed: rpc error: code = Unimplemented desc = unsupported build ID kind received` #163

Closed by tmm1 1 month ago

tmm1 commented 1 month ago

Following the README, I built the profiler off main and am using an SSH tunnel to devfiler 0.6.0 running on a Mac (arm64). I see:

sudo ./opentelemetry-ebpf-profiler -collection-agent=127.0.0.1:11000 -disable-tls
INFO[0000] Starting OTEL profiling agent  (revision main-42ec3c1e, build timestamp 1726909385)
INFO[0000] Interpreter tracers: perl,php,python,hotspot,ruby,v8,dotnet
INFO[0000] Found offsets: task stack 0x18, pt_regs 0x3f58, tpbase 0x22e8
INFO[0000] Supports generic eBPF map batch operations
INFO[0000] eBPF tracer loaded
INFO[0003] Attached tracer program
INFO[0003] Attached sched monitor
ERRO[0005] Request failed: rpc error: code = Unimplemented desc = unsupported build ID kind received
ERRO[0010] Request failed: rpc error: code = Unimplemented desc = unsupported build ID kind received
ERRO[0015] Request failed: rpc error: code = Unimplemented desc = unsupported build ID kind received
tmm1 commented 1 month ago

I tried again with the pf-profiler linked in the devfiler UI. It gets further but fails with a different repeating error:

$ sudo ./pf-host-agent-8.13.0-linux-x86_64/pf-host-agent -project-id=123 -secret-token=123 -collection-agent=127.0.0.1:11000 -disable-tls
time="2024-09-21T09:26:08.774494478Z" level=info msg="Starting Prodfiler Host Agent 8.13.0 (revision head-063485b5, build timestamp 1711096243)"
time="2024-09-21T09:26:08.852994033Z" level=info msg="Interpreter tracers: perl,php,python,hotspot,ruby,v8"
time="2024-09-21T09:26:08.853017128Z" level=info msg="Automatically determining environment and machine ID ..."
time="2024-09-21T09:26:08.854776037Z" level=info msg="Environment: aws, machine ID: 0xdd0f09082ac2c40d"
time="2024-09-21T09:26:08.854793501Z" level=info msg="Assigned ProjectID: 123 HostID: 9011431310073447437"
time="2024-09-21T09:26:09.177215357Z" level=info msg="Start CPU metrics"
time="2024-09-21T09:26:09.177280862Z" level=info msg="Start I/O metrics"
time="2024-09-21T09:26:09.346211150Z" level=info msg="Found offsets: task stack 0x18, pt_regs 0x3f58, tpbase 0x22e8"
time="2024-09-21T09:26:09.346438357Z" level=info msg="Supports generic eBPF map batch operations"
time="2024-09-21T09:26:09.347284734Z" level=info msg="eBPF tracer loaded"
time="2024-09-21T09:26:18.988308000Z" level=info msg="Attached tracer program"
time="2024-09-21T09:26:19.007522682Z" level=info msg="Attached sched monitor"
time="2024-09-21T09:26:19.008503849Z" level=info msg="Environment variable KUBERNETES_SERVICE_HOST not set"
time="2024-09-21T09:26:19.759179514Z" level=warning msg="Failed to determine container info for trace: failed to find matching container metadata for containerID, 11:blkio:/ecs/ced2d00b2ab8416d9992c7b1be85e43a"
time="2024-09-21T09:26:20.758675070Z" level=warning msg="Failed to determine container info for trace: failed to find matching container metadata for containerID, 11:blkio:/ecs/4e4ef8eb20cb4c568e9765d48bed31e1"
time="2024-09-21T09:26:21.759074531Z" level=warning msg="Failed to determine container info for trace: failed to find matching container metadata for containerID, 11:blkio:/ecs/ced2d00b2ab8416d9992c7b1be85e43a"
time="2024-09-21T09:26:22.758566913Z" level=warning msg="Failed to determine container info for trace: failed to find matching container metadata for containerID, 11:blkio:/ecs/4e4ef8eb20cb4c568e9765d48bed31e1"
time="2024-09-21T09:26:23.009743397Z" level=warning msg="Failed to determine container info for trace: failed to find matching container metadata for containerID, 11:blkio:/ecs/ced2d00b2ab8416d9992c7b1be85e43a"
florianl commented 1 month ago

Sorry for the inconvenience :pray: https://github.com/open-telemetry/opentelemetry-ebpf-profiler/pull/153 got merged without providing an updated version of devfiler.

If you check out and build the agent on commit https://github.com/open-telemetry/opentelemetry-ebpf-profiler/commit/7d2285e14767c7abf4cdbe0927bf7857d1037076 (before https://github.com/open-telemetry/opentelemetry-ebpf-profiler/pull/153 got merged), then devfiler and the OTel Profiling agent work nicely together. There is no change in functionality, just in the way data is reported. Hope this helps?
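
A minimal sketch of pinning a checkout to that commit, assuming a fresh clone and a reasonably recent `git`. The function name `checkout_pre_153` and the destination directory are made up for illustration; the build step itself is not shown and should follow the README at that commit.

```shell
# Commit referenced above: the last state before PR #153 was merged.
COMMIT=7d2285e14767c7abf4cdbe0927bf7857d1037076
REPO=https://github.com/open-telemetry/opentelemetry-ebpf-profiler.git

# Hypothetical helper: clone the repo into a directory and pin it to $COMMIT.
# The directory name defaults to the repository name.
checkout_pre_153() {
  dest="${1:-opentelemetry-ebpf-profiler}"
  git clone "$REPO" "$dest" &&
    git -C "$dest" checkout "$COMMIT"
}
```

Calling `checkout_pre_153` leaves the working tree in a detached HEAD state at the pinned commit; build the agent from there as the README describes, then start it with the same `-collection-agent=127.0.0.1:11000 -disable-tls` flags as before.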

I'm trying to provide a new version of devfiler next week that works with the changes made in https://github.com/open-telemetry/opentelemetry-ebpf-profiler/pull/153.

tmm1 commented 1 month ago

Thanks, that worked! This is really quite an incredible tool.

Is there any way I can facilitate getting binaries out of docker containers for debug symbol analysis, on the machine itself? Vs pulling them to my local machine and dropping into devfiler?

florianl commented 1 month ago

With https://github.com/open-telemetry/opentelemetry-ebpf-profiler/pull/165 a new version of devfiler will be provided that can handle the changes made in https://github.com/open-telemetry/opentelemetry-ebpf-profiler/pull/153.

> Is there any way I can facilitate getting binaries out of docker containers for debug symbol analysis, on the machine itself? Vs pulling them to my local machine and dropping into devfiler?

Sorry for this inconvenience. The OTel Profiling WG is discussing and working on a protocol to upload symbols so this can happen in a more automated way. Unfortunately, there is still work to do.