pixie-io / pixie

Instant Kubernetes-Native Application Observability
https://px.dev
Apache License 2.0

vizier-pem not working on Bottlerocket and Amazon Linux 2023 ARM64 AMIs (Kernel 6.1) #1863

Open raulsh opened 7 months ago

raulsh commented 7 months ago

Describe the bug

vizier-pem cannot initialize its BPF program in the kernel because of an undeclared function, arch_ftrace_get_regs. As a result, I only receive a subset of metrics (CPU, memory) and no HTTP events.

This does not happen on the AMD64 version (also with kernel 6.1).

We worked around it by using AL2 ARM64 AMIs instead of Bottlerocket, because the AL2 AMIs have kernel 5.10.

To Reproduce

Use a Bottlerocket ARM64 AMI with kernel 6.1 (in my case, the AMI for k8s 1.28).

Expected behavior

The BPF program starts in the Linux kernel.

Logs

I20240328 03:08:33.389560 1192152 env.cc:47] Started: /app/src/vizier/services/agent/pem/pem
Started external stacktrace collection signal processor thread
2024-03-28T03:08:33.389835821Z I20240328 03:08:33.389792 1192152 kernel_version.cc:199] Found Linux kernel version using .note section.
2024-03-28T03:08:33.389915187Z I20240328 03:08:33.389820 1192152 pem_main.cc:68] Pixie PEM. Version: v0.14.8+Distribution.ce1e3f3.202312122235.1.RELEASE.jenkins, id: 3b568e65-c3f0-4ade-b544-c31a8072285f, kernel version: 6.1.77
2024-03-28T03:08:33.389920783Z I20240328 03:08:33.389892 1192152 stirling.cc:958] Creating Stirling, registered sources: [process_stats, network_stats, jvm_stats, socket_tracer, perf_profiler, proc_exit_tracer, stirling_error]
2024-03-28T03:08:33.389999337Z I20240328 03:08:33.389926 1192152 system_info.cc:42] Location of proc: /proc
2024-03-28T03:08:33.390008625Z I20240328 03:08:33.389940 1192152 system_info.cc:43] Location of sysfs: /sys/fs
2024-03-28T03:08:33.390011038Z I20240328 03:08:33.389948 1192152 system_info.cc:44] Number of CPUs: 8
2024-03-28T03:08:33.390072747Z I20240328 03:08:33.390023 1192152 system_info.cc:35] /proc/version:
2024-03-28T03:08:33.390080353Z Linux version 6.1.77 (builder@buildkitsandbox) (aarch64-bottlerocket-linux-gnu-gcc (Buildroot 2022.11.1) 11.3.0, GNU ld (GNU Binutils) 2.38) #1 SMP Fri Feb 23 02:34:42 UTC 2024
2024-03-28T03:08:33.390181069Z I20240328 03:08:33.390136 1192152 system_info.cc:35] /host/etc/os-release:
NAME=Bottlerocket
2024-03-28T03:08:33.390186410Z ID=bottlerocket
2024-03-28T03:08:33.390188494Z VERSION="1.19.2 (aws-k8s-1.28)"
2024-03-28T03:08:33.390190447Z PRETTY_NAME="Bottlerocket OS 1.19.2 (aws-k8s-1.28)"
2024-03-28T03:08:33.390191899Z VARIANT_ID=aws-k8s-1.28
2024-03-28T03:08:33.390193393Z VERSION_ID=1.19.2
2024-03-28T03:08:33.390194878Z BUILD_ID=29cc92cc
2024-03-28T03:08:33.390196388Z HOME_URL="https://github.com/bottlerocket-os/bottlerocket"
2024-03-28T03:08:33.390197881Z SUPPORT_URL="https://github.com/bottlerocket-os/bottlerocket/discussions"
2024-03-28T03:08:33.390199382Z BUG_REPORT_URL="https://github.com/bottlerocket-os/bottlerocket/issues"
2024-03-28T03:08:33.390201401Z DOCUMENTATION_URL="https://bottlerocket.dev"
2024-03-28T03:08:33.390203821Z I20240328 03:08:33.390154 1192152 probe_cleaner.cc:102] Cleaning probes from /sys/kernel/debug/tracing/kprobe_events with the following marker: __pixie__
2024-03-28T03:08:33.390249170Z I20240328 03:08:33.390205 1192152 probe_cleaner.cc:117] All Stirling probes removed (count=0)
2024-03-28T03:08:33.390254708Z I20240328 03:08:33.390216 1192152 probe_cleaner.cc:102] Cleaning probes from /sys/kernel/debug/tracing/uprobe_events with the following marker: __pixie__
2024-03-28T03:08:33.390353373Z I20240328 03:08:33.390237 1192152 probe_cleaner.cc:117] All Stirling probes removed (count=0)
2024-03-28T03:08:33.390364474Z I20240328 03:08:33.390249 1192152 source_connector.cc:35] Initializing source connector: process_stats
2024-03-28T03:08:33.390366854Z I20240328 03:08:33.390255 1192152 stirling.cc:438] Adding info class: [process_stats/process_stats]
I20240328 03:08:33.390264 1192152 source_connector.cc:35] Initializing source connector: network_stats
2024-03-28T03:08:33.390373631Z I20240328 03:08:33.390270 1192152 stirling.cc:438] Adding info class: [network_stats/network_stats]
2024-03-28T03:08:33.390375461Z I20240328 03:08:33.390277 1192152 source_connector.cc:35] Initializing source connector: jvm_stats
2024-03-28T03:08:33.390377036Z I20240328 03:08:33.390283 1192152 stirling.cc:438] Adding info class: [jvm_stats/jvm_stats]
2024-03-28T03:08:33.390499224Z I20240328 03:08:33.390446 1192152 source_connector.cc:35] Initializing source connector: socket_tracer
2024-03-28T03:08:33.390524742Z I20240328 03:08:33.390475 1192152 kernel_version.cc:82] Obtained Linux version string from `uname`: 6.1.77
2024-03-28T03:08:33.390530789Z I20240328 03:08:33.390487 1192152 linux_headers.cc:385] Detected kernel release (uname -r): 6.1.77
2024-03-28T03:08:33.390722588Z I20240328 03:08:33.390666 1192152 linux_headers.cc:400] Not Found : Could not find 'source' or 'build' under /lib/modules/6.1.77.
2024-03-28T03:08:33.390728569Z I20240328 03:08:33.390683 1192152 linux_headers.cc:248] Looking for host Linux headers at /host/lib/modules/6.1.77/build.
2024-03-28T03:08:33.390782976Z I20240328 03:08:33.390735 1192152 linux_headers.cc:403] Not Found : Did not find host headers at path: /host/lib/modules/6.1.77/build.
2024-03-28T03:08:33.390790320Z I20240328 03:08:33.390748 1192152 linux_headers.cc:346] Attempting to install packaged headers.
2024-03-28T03:08:33.390863426Z W20240328 03:08:33.390803 1192152 linux_headers.cc:319] Ignoring /px/linux-headers-arm64-4.14.304.tar.gz since it does not conform to the naming convention
2024-03-28T03:08:33.390867865Z W20240328 03:08:33.390823 1192152 linux_headers.cc:319] Ignoring /px/linux-headers-arm64-4.19.271.tar.gz since it does not conform to the naming convention
2024-03-28T03:08:33.390901439Z I20240328 03:08:33.390862 1192152 linux_headers.cc:352] Using packaged header: /px/linux-headers-arm64-6.1.8.tar.gz
I20240328 03:08:33.896347 1192152 linux_headers.cc:56] Overriding linux version code to 393549
2024-03-28T03:08:33.897506829Z I20240328 03:08:33.897437 1192152 kernel_version.cc:82] Obtained Linux version string from `uname`: 6.1.77
2024-03-28T03:08:33.897515321Z I20240328 03:08:33.897487 1192152 linux_headers.cc:98] Found kernel config at: /proc/config.gz.
2024-03-28T03:08:33.903930339Z I20240328 03:08:33.903785 1192152 linux_headers.cc:377] Successfully installed packaged copy of headers at /lib/modules/6.1.77/build
2024-03-28T03:08:33.903944681Z I20240328 03:08:33.903820 1192152 bcc_wrapper.cc:94] Resolving task_struct offsets.
2024-03-28T03:08:33.936525754Z I20240328 03:08:33.936362 1192200 bcc_wrapper.cc:166] Initializing BPF program ...
2024-03-28T03:08:35.138079010Z I20240328 03:08:35.137908 1192200 scoped_timer.h:48] Timer(init_bpf_program) : 1.20 s
cannot create /var/tmp/bcc
2024-03-28T03:08:35.138326029Z WARNING: cannot get prog tag, ignore saving source with program tag
2024-03-28T03:08:35.286392704Z E20240328 03:08:35.286208 1192200 task_struct_resolver.cc:330] Internal : Failed to find the thread_struct offset within the task_struct. This is required for resolving task struct offsets on aarch64
2024-03-28T03:08:35.320800963Z I20240328 03:08:35.320627 1192259 bcc_wrapper.cc:166] Initializing BPF program ...
2024-03-28T03:08:36.526903742Z I20240328 03:08:36.526736 1192259 scoped_timer.h:48] Timer(init_bpf_program) : 1.21 s
2024-03-28T03:08:36.527156299Z cannot create /var/tmp/bcc
2024-03-28T03:08:36.527176992Z WARNING: cannot get prog tag, ignore saving source with program tag
2024-03-28T03:08:36.736305806Z E20240328 03:08:36.736131 1192259 task_struct_resolver.cc:330] Internal : Failed to find the thread_struct offset within the task_struct. This is required for resolving task struct offsets on aarch64
2024-03-28T03:08:36.770823503Z I20240328 03:08:36.770673 1192260 bcc_wrapper.cc:166] Initializing BPF program ...
2024-03-28T03:08:38.009211353Z I20240328 03:08:38.009027 1192260 scoped_timer.h:48] Timer(init_bpf_program) : 1.24 s
2024-03-28T03:08:38.009440354Z cannot create /var/tmp/bcc
2024-03-28T03:08:38.009457970Z WARNING: cannot get prog tag, ignore saving source with program tag
2024-03-28T03:08:38.136286033Z E20240328 03:08:38.136122 1192260 task_struct_resolver.cc:330] Internal : Failed to find the thread_struct offset within the task_struct. This is required for resolving task struct offsets on aarch64
2024-03-28T03:08:38.136346135Z W20240328 03:08:38.136240 1192152 bcc_wrapper.cc:149] Failed to obtain task_struct offsets, will not override the task_struct offsets, error: Internal : Resolution failed in subprocess. Check subprocess logs for the error.
2024-03-28T03:08:38.136424205Z I20240328 03:08:38.136349 1192152 bcc_wrapper.cc:166] Initializing BPF program ...
2024-03-28T03:08:40.084685379Z In file included from src/stirling/source_connectors/socket_tracer/bcc_bpf/socket_trace.c:27:
2024-03-28T03:08:40.084722704Z In file included from src/stirling/bpf_tools/bcc_bpf/system-headers/net/inet_sock.h:1:
2024-03-28T03:08:40.084726782Z In file included from include/net/inet_sock.h:19:
2024-03-28T03:08:40.084728931Z In file included from include/linux/netdevice.h:38:
2024-03-28T03:08:40.084730901Z In file included from include/net/net_namespace.h:43:
2024-03-28T03:08:40.084733469Z In file included from include/linux/skbuff.h:17:
2024-03-28T03:08:40.084735610Z In file included from include/linux/bvec.h:10:
2024-03-28T03:08:40.084737604Z In file included from include/linux/highmem.h:8:
2024-03-28T03:08:40.084739491Z In file included from include/linux/cacheflush.h:5:
2024-03-28T03:08:40.084741764Z In file included from arch/arm64/include/asm/cacheflush.h:11:
2024-03-28T03:08:40.084743856Z In file included from include/linux/kgdb.h:19:
2024-03-28T03:08:40.084746104Z In file included from include/linux/kprobes.h:28:
2024-03-28T03:08:40.084749075Z include/linux/ftrace.h:126:9: warning: call to undeclared function 'arch_ftrace_get_regs'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
        return arch_ftrace_get_regs(fregs);
2024-03-28T03:08:40.084754391Z                ^
2024-03-28T03:08:40.084768832Z include/linux/ftrace.h:126:9: error: incompatible integer to pointer conversion returning 'int' from a function with result type 'struct pt_regs *' [-Wint-conversion]
2024-03-28T03:08:40.084787859Z         return arch_ftrace_get_regs(fregs);
2024-03-28T03:08:40.084811547Z                ^~~~~~~~~~~~~~~~~~~~~~~~~~~
2024-03-28T03:08:40.731557619Z 1 warning and 1 error generated.
2024-03-28T03:08:40.735972614Z I20240328 03:08:40.735817 1192152 scoped_timer.h:48] Timer(init_bpf_program) : 2.60 s
2024-03-28T03:08:40.736077908Z W20240328 03:08:40.735991 1192152 stirling.cc:416] Source Connector (registry name=socket_tracer) not instantiated, error: Internal : Unable to initialize BCC BPF program: Unable to initialize BPF program
2024-03-28T03:08:40.736132463Z I20240328 03:08:40.736056 1192152 source_connector.cc:35] Initializing source connector: perf_profiler
2024-03-28T03:08:40.736139076Z I20240328 03:08:40.736104 1192152 kernel_version.cc:82] Obtained Linux version string from `uname`: 6.1.77
2024-03-28T03:08:40.736144647Z I20240328 03:08:40.736111 1192152 linux_headers.cc:385] Detected kernel release (uname -r): 6.1.77
2024-03-28T03:08:40.736191555Z I20240328 03:08:40.736146 1192152 linux_headers.cc:206] Using Linux headers from: /lib/modules/6.1.77/build.
2024-03-28T03:08:40.736195059Z I20240328 03:08:40.736160 1192152 bcc_wrapper.cc:94] Resolving task_struct offsets.
2024-03-28T03:08:40.772143863Z I20240328 03:08:40.771977 1192329 bcc_wrapper.cc:166] Initializing BPF program ...
2024-03-28T03:08:41.997959608Z I20240328 03:08:41.997843 1192329 scoped_timer.h:48] Timer(init_bpf_program) : 1.23 s
2024-03-28T03:08:41.998298481Z cannot create /var/tmp/bcc
2024-03-28T03:08:41.998306932Z WARNING: cannot get prog tag, ignore saving source with program tag
2024-03-28T03:08:42.146374174Z E20240328 03:08:42.146178 1192329 task_struct_resolver.cc:330] Internal : Failed to find the thread_struct offset within the task_struct. This is required for resolving task struct offsets on aarch64
I20240328 03:08:42.184947 1192615 bcc_wrapper.cc:166] Initializing BPF program ...
2024-03-28T03:08:43.417662229Z I20240328 03:08:43.417484 1192615 scoped_timer.h:48] Timer(init_bpf_program) : 1.23 s
2024-03-28T03:08:43.417875599Z cannot create /var/tmp/bcc
2024-03-28T03:08:43.417882204Z WARNING: cannot get prog tag, ignore saving source with program tag
2024-03-28T03:08:43.606388640Z E20240328 03:08:43.606211 1192615 task_struct_resolver.cc:330] Internal : Failed to find the thread_struct offset within the task_struct. This is required for resolving task struct offsets on aarch64
2024-03-28T03:08:43.646699600Z I20240328 03:08:43.646538 1192621 bcc_wrapper.cc:166] Initializing BPF program ...
2024-03-28T03:08:44.889785336Z I20240328 03:08:44.889617 1192621 scoped_timer.h:48] Timer(init_bpf_program) : 1.24 s
2024-03-28T03:08:44.890004826Z cannot create /var/tmp/bcc
2024-03-28T03:08:44.890022106Z WARNING: cannot get prog tag, ignore saving source with program tag
E20240328 03:08:45.096653 1192621 task_struct_resolver.cc:330] Internal : Failed to find the thread_struct offset within the task_struct. This is required for resolving task struct offsets on aarch64
W20240328 03:08:45.096807 1192152 bcc_wrapper.cc:149] Failed to obtain task_struct offsets, will not override the task_struct offsets, error: Internal : Resolution failed in subprocess. Check subprocess logs for the error.
2024-03-28T03:08:45.097021407Z I20240328 03:08:45.096947 1192152 bcc_wrapper.cc:166] Initializing BPF program ...
2024-03-28T03:08:46.337326214Z I20240328 03:08:46.337155 1192152 scoped_timer.h:48] Timer(init_bpf_program) : 1.24 s
2024-03-28T03:08:46.337714997Z cannot create /var/tmp/bcc
2024-03-28T03:08:46.337740769Z WARNING: cannot get prog tag, ignore saving source with program tag
2024-03-28T03:08:46.338500810Z I20240328 03:08:46.338413 1192152 perf_profile_connector.cc:153] PerfProfiler: Stack trace profiling sampling probe successfully deployed.
2024-03-28T03:08:46.338513405Z I20240328 03:08:46.338440 1192152 perf_profile_connector.cc:169] PerfProfiler: Java symbolization enabled.
2024-03-28T03:08:46.338517072Z I20240328 03:08:46.338470 1192152 java_symbolizer.cc:233] JavaSymbolizer found agent lib /px/libpx-java-agent.so.
2024-03-28T03:08:46.338528034Z I20240328 03:08:46.338482 1192152 stirling.cc:438] Adding info class: [perf_profiler/stack_traces.beta]
2024-03-28T03:08:46.338600459Z I20240328 03:08:46.338544 1192152 source_connector.cc:35] Initializing source connector: proc_exit_tracer
2024-03-28T03:08:46.338611298Z I20240328 03:08:46.338562 1192152 kernel_version.cc:82] Obtained Linux version string from `uname`: 6.1.77
2024-03-28T03:08:46.338614071Z I20240328 03:08:46.338572 1192152 linux_headers.cc:385] Detected kernel release (uname -r): 6.1.77
2024-03-28T03:08:46.338658715Z I20240328 03:08:46.338598 1192152 linux_headers.cc:206] Using Linux headers from: /lib/modules/6.1.77/build.
2024-03-28T03:08:46.338663153Z I20240328 03:08:46.338611 1192152 bcc_wrapper.cc:94] Resolving task_struct offsets.
2024-03-28T03:08:46.376285916Z I20240328 03:08:46.376114 1192688 bcc_wrapper.cc:166] Initializing BPF program ...
2024-03-28T03:08:47.605723517Z I20240328 03:08:47.605566 1192688 scoped_timer.h:48] Timer(init_bpf_program) : 1.23 s
2024-03-28T03:08:47.606021619Z cannot create /var/tmp/bcc
2024-03-28T03:08:47.606037495Z WARNING: cannot get prog tag, ignore saving source with program tag
2024-03-28T03:08:47.746417631Z E20240328 03:08:47.746261 1192688 task_struct_resolver.cc:330] Internal : Failed to find the thread_struct offset within the task_struct. This is required for resolving task struct offsets on aarch64
2024-03-28T03:08:47.784790072Z I20240328 03:08:47.784610 1192748 bcc_wrapper.cc:166] Initializing BPF program ...
2024-03-28T03:08:49.028049991Z I20240328 03:08:49.027859 1192748 scoped_timer.h:48] Timer(init_bpf_program) : 1.24 s
2024-03-28T03:08:49.028348897Z cannot create /var/tmp/bcc
2024-03-28T03:08:49.028355560Z WARNING: cannot get prog tag, ignore saving source with program tag
2024-03-28T03:08:49.166417952Z E20240328 03:08:49.166254 1192748 task_struct_resolver.cc:330] Internal : Failed to find the thread_struct offset within the task_struct. This is required for resolving task struct offsets on aarch64
2024-03-28T03:08:49.205184362Z I20240328 03:08:49.205062 1192764 bcc_wrapper.cc:166] Initializing BPF program ...
2024-03-28T03:08:50.434613068Z I20240328 03:08:50.434444 1192764 scoped_timer.h:48] Timer(init_bpf_program) : 1.23 s
2024-03-28T03:08:50.434878088Z cannot create /var/tmp/bcc
2024-03-28T03:08:50.434895598Z WARNING: cannot get prog tag, ignore saving source with program tag
E20240328 03:08:50.596215 1192764 task_struct_resolver.cc:330] Internal : Failed to find the thread_struct offset within the task_struct. This is required for resolving task struct offsets on aarch64
2024-03-28T03:08:50.596449444Z W20240328 03:08:50.596338 1192152 bcc_wrapper.cc:149] Failed to obtain task_struct offsets, will not override the task_struct offsets, error: Internal : Resolution failed in subprocess. Check subprocess logs for the error.
2024-03-28T03:08:50.596530845Z I20240328 03:08:50.596442 1192152 bcc_wrapper.cc:166] Initializing BPF program ...
2024-03-28T03:08:51.744063067Z I20240328 03:08:51.743876 1192152 scoped_timer.h:48] Timer(init_bpf_program) : 1.15 s
2024-03-28T03:08:51.744278635Z cannot create /var/tmp/bcc
2024-03-28T03:08:51.744294397Z WARNING: cannot get prog tag, ignore saving source with program tag
2024-03-28T03:08:51.833872466Z I20240328 03:08:51.833709 1192152 stirling.cc:438] Adding info class: [proc_exit_tracer/proc_exit_events]
2024-03-28T03:08:51.833915591Z I20240328 03:08:51.833772 1192152 source_connector.cc:35] Initializing source connector: stirling_error
I20240328 03:08:51.833838 1192152 stirling.cc:438] Adding info class: [stirling_error/stirling_error]
2024-03-28T03:08:51.833933182Z I20240328 03:08:51.833853 1192152 stirling.cc:438] Adding info class: [stirling_error/probe_status]
2024-03-28T03:08:51.833936210Z I20240328 03:08:51.833863 1192152 stirling.cc:419] Stirling successfully initialized.
2024-03-28T03:08:51.834700780Z E0328 03:08:51.834622874 1192152 dns_resolver_ares.cc:456]             no server name supplied in dns URI
2024-03-28T03:08:51.834719340Z E0328 03:08:51.834659583 1192152 channel.cc:120]                       channel stack builder failed: UNKNOWN: the target uri is not valid: dns:///
2024-03-28T03:08:51.845240748Z I20240328 03:08:51.845096 1192152 manager.cc:154] Hostname: ip-10-2-32-225.ec2.internal
I20240328 03:09:12.812111 1192152 cgroup_path_resolver.cc:141] Auto-discovered CGroup base path: /sys/fs/cgroup
I20240328 03:09:12.824586 1192152 cgroup_path_resolver.cc:144] Auto-discovered example path: /sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-podf015ba52_02a9_41b8_b4bc_eaa0565812b2.slice/cri-containerd-8f9de4e9950683557c7f3d444c6380b9818bfd9c721e4cea134b883571541d63.scope/cgroup.procs
I20240328 03:09:12.824684 1192152 cgroup_path_resolver.cc:148] Auto-discovered template: /sys/fs/cgroup/kubepods.slice/kubepods-$2.slice/kubepods-$2-pod$0.slice/cri-containerd-$1.scope/cgroup.procs
2024-03-28T03:09:12.824795004Z I20240328 03:09:12.824697 1192152 cgroup_metadata_reader.cc:41] Using path_resolver with configuration: template=/sys/fs/cgroup/kubepods.slice/kubepods-$2.slice/kubepods-$2-pod$0.slice/cri-containerd-$1.scope/cgroup.procs pod_id_separators=_ qos_spearator=-
I20240328 03:09:12.825067 1193919 stirling.cc:792] Stirling is running.
I20240328 03:09:12.830222 1193919 perf_profile_connector.cc:428] PerfProfileConnector statistics: kBPFMapSwitchoverEvent=1 kCumulativeSumOfAllStackTraces=19260 kLossHistoEvent=0 
I20240328 03:09:12.830266 1193919 perf_profile_connector.cc:442] PerfProfileConnector u_symbolizer num_symbols_cached=0 hits=0 accesses=0 hit_rate=0
2024-03-28T03:09:12.830401334Z I20240328 03:09:12.830276 1193919 perf_profile_connector.cc:445] PerfProfileConnector k_symbolizer num_symbols_cached=0 hits=0 accesses=0 hit_rate=0
I20240328 03:09:12.930631 1193919 run_core_stats.cc:111] |main_loop_iters,no_work_iters,useful_iters,push+transfer,transfer,push,min_push+transfer,max_push+transfer,total_0.00_ms,total_0.01_ms,total_0.02_ms,total_0.03_ms,total_0.06_ms,total_0.10_ms,total_0.18_ms,total_0.32_ms,total_0.56_ms,total_1.00_ms,total_1.78_ms,total_3.16_ms,total_5.62_ms,total_10.00_ms,total_17.78_ms,total_31.62_ms,total_56.23_ms,total_100.00_ms,total_177.83_ms,total_316.23_ms,total_562.34_ms,total_1000.00_ms,total_10000000.00_ms,no_work_0.00_ms,no_work_0.01_ms,no_work_0.02_ms,no_work_0.03_ms,no_work_0.06_ms,no_work_0.10_ms,no_work_0.18_ms,no_work_0.32_ms,no_work_0.56_ms,no_work_1.00_ms,no_work_1.78_ms,no_work_3.16_ms,no_work_5.62_ms,no_work_10.00_ms,no_work_17.78_ms,no_work_31.62_ms,no_work_56.23_ms,no_work_100.00_ms,no_work_177.83_ms,no_work_316.23_ms,no_work_562.34_ms,no_work_1000.00_ms,no_work_10000000.00_ms
I20240328 03:09:19.995174 1192152 heartbeat.cc:143] Heartbeat ACK latency moving average: 5249 ms
I20240328 03:09:22.813768 1192152 metadata_state.h:123] Service CIDR updated to 172.20.0.0/16
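
In the log above, the only connector that fails to start is socket_tracer, which matches the missing HTTP events. The following is a minimal sketch, not part of Pixie, for pulling that summary out of a saved PEM log. The function name `summarize_connectors` and the regular expressions are assumptions based solely on the message formats shown in this log; an "ok" entry only means that no "not instantiated" line was seen for that connector.

```python
import re

# Matches Stirling's warning for a source connector that failed to start,
# using the message format that appears in the log above.
FAILED_RE = re.compile(
    r"Source Connector \(registry name=(?P<name>\w+)\) not instantiated, "
    r"error: (?P<error>.*)"
)
# Matches the line that announces each connector's initialization attempt.
INIT_RE = re.compile(r"Initializing source connector: (?P<name>\w+)")


def summarize_connectors(log_text: str) -> dict:
    """Map each source connector seen in the log to 'ok' or its error text."""
    status = {}
    for line in log_text.splitlines():
        init = INIT_RE.search(line)
        if init:
            # Assume success until a failure line for this connector shows up.
            status.setdefault(init.group("name"), "ok")
        failed = FAILED_RE.search(line)
        if failed:
            status[failed.group("name")] = failed.group("error").strip()
    return status


# Three lines copied from the log above (timestamp prefixes dropped).
sample = """\
I20240328 03:08:33.390446 1192152 source_connector.cc:35] Initializing source connector: socket_tracer
W20240328 03:08:40.735991 1192152 stirling.cc:416] Source Connector (registry name=socket_tracer) not instantiated, error: Internal : Unable to initialize BCC BPF program: Unable to initialize BPF program
I20240328 03:08:40.736056 1192152 source_connector.cc:35] Initializing source connector: perf_profiler
"""
summary = summarize_connectors(sample)
for name, state in summary.items():
    print(f"{name}: {state}")
```

Running it against the full log should flag socket_tracer only, since the other connectors log "Adding info class" rather than a failure.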

Additional context

AWS AMI: Bottlerocket ARM64 Kernel 6.1.77
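
For reference, the log line "Overriding linux version code to 393549" is consistent with the running kernel, 6.1.77, rather than with the packaged 6.1.8 headers that were installed. The sketch below reproduces the arithmetic of the kernel's `KERNEL_VERSION` macro; the helper name `kernel_version_code` is hypothetical, and whether Pixie computes the value in exactly this way is an assumption.

```python
def kernel_version_code(release: str) -> int:
    """Encode 'major.minor.patch' the way the KERNEL_VERSION macro does."""
    # Keep only the leading numeric part, so a distro suffix such as
    # "-118.189.amzn2023.x86_64" is ignored.
    numeric = release.split("-", 1)[0]
    major, minor, patch = (int(part) for part in numeric.split(".")[:3])
    # Recent kernels clamp the patch level to 255 so it fits in one byte.
    return (major << 16) + (minor << 8) + min(patch, 255)


print(kernel_version_code("6.1.77"))  # 393549, the value in the PEM log
print(kernel_version_code("6.1.8"))   # 393480, the packaged headers' version
```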

ddelnano commented 1 week ago

This looks like it could be a kernel bug. This thread from the LKML mentions that certain ftrace kernel configs can lead to undefined functions. That thread links to https://lore.kernel.org/all/202211212249.livTPi3Y-lkp@intel.com/, which reports the same undeclared function 'arch_ftrace_get_regs' error.

I need to investigate further to find out when that was merged upstream and which kernel versions are affected by the bug.

davivcgarcia commented 1 week ago

This issue is also happening for AL23 AMIs for ARM64.

Linux ip-10-0-1-40.eu-central-1.compute.internal 6.1.109-118.189.amzn2023.x86_64 #1 SMP PREEMPT_DYNAMIC Tue Sep 10 08:59:12 UTC 2024 x86_64 GNU/Linux

ddelnano commented 1 week ago

I'm still in the process of investigating the Pixie fix for this, but I believe installing Bottlerocket's or AL's kernel-headers package should fix this issue in the meantime. Installing the upstream distro's kernel headers is preferred as it's guaranteed to match the kernel.
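
To confirm whether distro headers are visible after installing them, one can check the same locations the PEM log mentions: 'source' or 'build' under /lib/modules/<release>, and the host copy under /host/lib/modules/<release>/build. This is a rough sketch that only mirrors those paths; the function name `find_kernel_headers` and the exact search order are assumptions, not Pixie's actual lookup logic.

```python
from pathlib import Path
from typing import Iterable, Optional


def find_kernel_headers(
    release: str, roots: Iterable[str] = ("/", "/host")
) -> Optional[Path]:
    """Return the first existing headers directory for `release`, or None."""
    for root in roots:
        modules_dir = Path(root) / "lib" / "modules" / release
        # The PEM log looks for 'source' or 'build' under the modules dir.
        for name in ("source", "build"):
            candidate = modules_dir / name
            if candidate.is_dir():
                return candidate
    return None


if __name__ == "__main__":
    import platform

    release = platform.release()  # same value as `uname -r`
    found = find_kernel_headers(release)
    print(found if found else f"no kernel headers found for {release}")
```

If this prints a path on the node (or inside the PEM container with the host filesystem mounted at /host), the PEM should no longer need to fall back to its packaged headers.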

A note for myself for later: https://github.com/torvalds/linux/commit/26299b3f6ba26bfc234b73126d14bdf4dec5275a#diff-b621eca3cc730521fecf3f5632328c396ce6eab735845cfe6a2b88c85a1721a7R311 was backported to Amazon Linux 2023's 6.1 kernel. This was merged upstream in a Linux 6.2 rc. That change should fix the BPF program compilation, but it's unclear to me why CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y causes compilation issues when using 6.1.x upstream headers.
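
To see whether a given node has the config option mentioned above, the kernel config can be inspected directly; the PEM log shows it is available at /proc/config.gz on this Bottlerocket AMI. A minimal sketch, assuming the standard Kconfig text format; the helper names and the `CONFIG_EXAMPLE_OPTION` entry in the sample are hypothetical.

```python
import gzip
from typing import Optional


def config_option(config_text: str, option: str) -> Optional[str]:
    """Return the option's value ('y', 'm', ...), 'n' if unset, None if absent."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(f"{option}="):
            return line.split("=", 1)[1]
        # Kconfig records disabled options as a comment line.
        if line == f"# {option} is not set":
            return "n"
    return None


def read_kernel_config(path: str = "/proc/config.gz") -> str:
    """Read the gzip-compressed kernel config exposed by the running kernel."""
    with gzip.open(path, "rt") as handle:
        return handle.read()


# Illustrative config text; on a real node use read_kernel_config() instead.
sample_config = """\
CONFIG_FUNCTION_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS=y
# CONFIG_EXAMPLE_OPTION is not set
"""
print(config_option(sample_config, "CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS"))  # y
```

Comparing this value between an affected ARM64 node and a working AMD64 node may help narrow down which kernels hit the compilation error.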