weaveworks/scope

Monitoring, visualisation & management for Docker & Kubernetes
https://www.weave.works/oss/scope/
Apache License 2.0

ebpf not working in k8s #2784

Closed: rade closed this issue 7 years ago

rade commented 7 years ago

A user's scope fails to enable eBPF tracing with

<probe> WARN: 2017/08/01 16:17:02.094195 Error setting up the eBPF tracker, falling back to proc scanning: cannot open kprobe_events: open /sys/kernel/debug/tracing/kprobe_events: no such file or directory
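
As a quick sanity check, the same failure should be reproducible outside Scope by reading that file directly from the probe's environment:

$ cat /sys/kernel/debug/tracing/kprobe_events

If that fails with "No such file or directory", the path simply isn't visible where the probe runs, regardless of what the kernel supports.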

They are running 4.4.65-k8s, apparently as shipped by kops 1.5.3.

The Scope eBPF blog post says that

eBPF should work on Linux version 4.4 and later when compiled with BPF features enabled (CONFIG_BPF=y, CONFIG_BPF_SYSCALL=y, etc.).

So is the above error an indication that the kernel has not been compiled with BPF? Or is something else not right?

And what does "etc" stand for?
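
One way to check the BPF and kprobe options in one go, assuming the distro ships the running kernel's config under /boot (some only expose /proc/config.gz):

$ grep -E 'CONFIG_BPF|CONFIG_KPROBE' /boot/config-$(uname -r)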

@alban @2opremio any idea?

ajohnstone commented 7 years ago

@rade slight correction.

It's actually Kubernetes v1.5.4 with kops.

root@ip-10-116-75-183:/home/admin# cat /boot/config-* | grep BPF
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_NET_CLS_BPF=m
CONFIG_BPF_JIT=y
CONFIG_HAVE_BPF_JIT=y
# CONFIG_TEST_BPF is not set
CONFIG_BPF=y
CONFIG_BPF_SYSCALL=y
CONFIG_NETFILTER_XT_MATCH_BPF=m
CONFIG_NET_CLS_BPF=m
CONFIG_NET_ACT_BPF=m
CONFIG_BPF_JIT=y
CONFIG_HAVE_BPF_JIT=y
CONFIG_BPF_EVENTS=y
CONFIG_TEST_BPF=m
alban commented 7 years ago

I'd check the mounts. On my laptop, I have:

$ mount|grep tracefs
tracefs on /sys/kernel/debug/tracing type tracefs (rw,relatime,seclabel)
$ mount|grep debugfs
debugfs on /sys/kernel/debug type debugfs (rw,relatime,seclabel)

I don't know kops, but there was a similar issue on Amazon Linux due to a kernel bug. Related discussion: https://forums.aws.amazon.com/thread.jspa?messageID=753257&#753257

The workaround for the kernel bug was to make sure debugfs is mounted before starting the Docker daemon. In this order:

$ sudo mount -t debugfs debugfs /sys/kernel/debug
$ sudo /etc/init.d/docker start
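
To make this survive reboots rather than depending on init ordering, one option (a sketch, assuming the host honours /etc/fstab) is a static mount entry:

debugfs  /sys/kernel/debug  debugfs  defaults  0  0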

Another thing to check would be whether the kernel has kprobes. On my laptop I have:

$ cat /boot/config-`uname -r`|grep KPROBE
CONFIG_KPROBES=y
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_KPROBE_EVENTS=y
# CONFIG_KPROBES_SANITY_TEST is not set
rade commented 7 years ago

@ajohnstone could you run some of the checks alban suggested above?

ajohnstone commented 7 years ago

Hi, in the Kubernetes manifests, the mount point /sys/kernel/debug is empty.

root@ip-10-116-71-191:/home/admin# mount | grep -E 'tracefs|debugfs'
debugfs on /sys/kernel/debug type debugfs (rw,relatime)
tracefs on /sys/kernel/debug/tracing type tracefs (rw,relatime)

root@ip-10-116-71-191:/home/admin#  cat /boot/config-`uname -r`|grep KPROBE
CONFIG_KPROBES=y
CONFIG_KPROBES_ON_FTRACE=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KPROBES_ON_FTRACE=y
CONFIG_KPROBE_EVENT=y
# CONFIG_KPROBES_SANITY_TEST is not set

root@ip-10-116-71-191:/home/admin# docker run -v  /sys/kernel/debug:/sys/kernel/debug -it ubuntu ls -alh /sys/kernel/debug
total 0
drwx------ 21 root root 0 Jun 14 11:17 .
drwxr-xr-x 10 root root 0 Aug  2 13:23 ..
drwxr-xr-x  2 root root 0 Jun 14 11:17 acpi
drwxr-xr-x 11 root root 0 Jun 14 11:18 bdi
drwxr-xr-x  2 root root 0 Jun 14 11:18 btrfs
drwxr-xr-x  2 root root 0 Jun 14 11:17 clk
drwxr-xr-x  2 root root 0 Jun 14 11:17 dma_buf
drwxr-xr-x  4 root root 0 Jun 14 11:18 dri
drwxr-xr-x  2 root root 0 Jun 14 11:17 dynamic_debug
drwxr-xr-x  2 root root 0 Jun 14 11:17 extfrag
-rw-r--r--  1 root root 0 Jun 14 11:17 fault_around_bytes
drwxr-xr-x  2 root root 0 Jun 14 11:17 frontswap
-r--r--r--  1 root root 0 Jun 14 11:17 gpio
drwxr-xr-x  2 root root 0 Jul 18 21:36 ieee80211
drwxr-xr-x  2 root root 0 Jun 14 11:17 kprobes
drwxr-xr-x  2 root root 0 Jun 14 11:17 mce
drwxr-xr-x  2 root root 0 Jun 14 11:17 pinctrl
drwxr-xr-x  2 root root 0 Jun 14 11:17 pm_qos
drwxr-xr-x  2 root root 0 Jun 14 11:17 ras
drwxr-xr-x  2 root root 0 Jun 14 11:17 regmap
-rw-r--r--  1 root root 0 Jun 14 11:17 sched_features
-r--r--r--  1 root root 0 Jun 14 11:17 sleep_time
drwxr-xr-x  5 root root 0 Jun 14 11:18 sunrpc
-r--r--r--  1 root root 0 Jun 14 11:17 suspend_stats
drwx------  6 root root 0 Jun 14 11:17 tracing
-r--r--r--  1 root root 0 Jun 14 11:17 wakeup_sources
drwxr-xr-x  2 root root 0 Jun 14 11:17 x86
drwxr-xr-x  2 root root 0 Jun 14 11:17 zswap

root@ip-10-116-71-191:/home/admin# docker run  -it ubuntu ls -alh /sys/kernel/debug/
total 0
dr-xr-xr-x  2 root root 0 Aug  2 13:30 .
drwxr-xr-x 10 root root 0 Aug  2 13:23 ..
alban commented 7 years ago

@ajohnstone thanks for testing. It seems you have kprobes correctly enabled in the kernel.

Could you also check the content of the tracing directory?

On my laptop, the tracing directory is populated (with kprobe_events etc.) in all three places (host, container with the bind mount, and the Docker daemon's mount namespace). With the bug I've seen before on Amazon Linux, tracing would only be populated on the host.

Does it work if you restart the docker daemon and containerd (without restarting the machine)?
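
On a systemd host that would be something like:

$ sudo systemctl restart docker

(On 2017-era Docker packages containerd typically runs as a child of dockerd and is restarted with it; if it is packaged as a separate unit, restart that too.)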

ajohnstone commented 7 years ago

I've run all the above and added a couple more tests below. Tbh I'm not sure what I should be looking for. Should /sys/kernel/debug/ be mounted from the host?

Host

root@ip-10-116-71-191:/home/admin# ls -l /sys/kernel/debug/tracing/
total 0
-r--r--r--  1 root root 0 Jun 14 11:17 available_events
-r--r--r--  1 root root 0 Jun 14 11:17 available_filter_functions
-r--r--r--  1 root root 0 Jun 14 11:17 available_tracers
-rw-r--r--  1 root root 0 Jun 14 11:17 buffer_size_kb
-r--r--r--  1 root root 0 Jun 14 11:17 buffer_total_size_kb
-rw-r--r--  1 root root 0 Jun 14 11:17 current_tracer
-r--r--r--  1 root root 0 Jun 14 11:17 dyn_ftrace_total_info
-r--r--r--  1 root root 0 Jun 14 11:17 enabled_functions
drwxr-xr-x 61 root root 0 Jun 14 11:17 events
--w-------  1 root root 0 Jun 14 11:17 free_buffer
drwxr-xr-x  2 root root 0 Jun 14 11:17 instances
-rw-r--r--  1 root root 0 Jun 14 11:17 kprobe_events
-r--r--r--  1 root root 0 Jun 14 11:17 kprobe_profile
-rw-r--r--  1 root root 0 Jun 14 11:17 max_graph_depth
drwxr-xr-x  2 root root 0 Jun 14 11:17 options
drwxr-xr-x 17 root root 0 Jun 14 11:17 per_cpu
-r--r--r--  1 root root 0 Jun 14 11:17 printk_formats
-r--r--r--  1 root root 0 Jun 14 11:17 README
-r--r--r--  1 root root 0 Jun 14 11:17 saved_cmdlines
-rw-r--r--  1 root root 0 Jun 14 11:17 saved_cmdlines_size
-rw-r--r--  1 root root 0 Jun 14 11:17 set_event
-rw-r--r--  1 root root 0 Jun 14 11:17 set_event_pid
-rw-r--r--  1 root root 0 Jun 14 11:17 set_ftrace_filter
-rw-r--r--  1 root root 0 Jun 14 11:17 set_ftrace_notrace
-rw-r--r--  1 root root 0 Jun 14 11:17 set_ftrace_pid
-r--r--r--  1 root root 0 Jun 14 11:17 set_graph_function
-r--r--r--  1 root root 0 Jun 14 11:17 set_graph_notrace
-rw-r--r--  1 root root 0 Jun 14 11:17 snapshot
-rw-r--r--  1 root root 0 Jun 14 11:17 stack_max_size
-r--r--r--  1 root root 0 Jun 14 11:17 stack_trace
-r--r--r--  1 root root 0 Jun 14 11:17 stack_trace_filter
-rw-r--r--  1 root root 0 Jun 14 11:17 trace
-rw-r--r--  1 root root 0 Jun 14 11:17 trace_clock
--w--w----  1 root root 0 Jun 14 11:17 trace_marker
-rw-r--r--  1 root root 0 Jun 14 11:17 trace_options
-r--r--r--  1 root root 0 Jun 14 11:17 trace_pipe
-rw-r--r--  1 root root 0 Jun 14 11:17 tracing_cpumask
-rw-r--r--  1 root root 0 Jun 14 11:17 tracing_max_latency
-rw-r--r--  1 root root 0 Jun 14 11:17 tracing_on
-rw-r--r--  1 root root 0 Jun 14 11:17 tracing_thresh
-rw-r--r--  1 root root 0 Jun 14 11:17 uprobe_events
-r--r--r--  1 root root 0 Jun 14 11:17 uprobe_profile

Docker container with debugfs bind-mounted

root@ip-10-116-71-191:/home/admin# docker run --rm -it -v /sys/kernel/debug:/sys/kernel/debug ubuntu ls -alh /sys/kernel/debug/tracing
total 0
drwx------  6 root root 0 Jun 14 11:17 .
drwx------ 21 root root 0 Jun 14 11:17 ..
-r--r--r--  1 root root 0 Jun 14 11:17 README
-r--r--r--  1 root root 0 Jun 14 11:17 available_events
-r--r--r--  1 root root 0 Jun 14 11:17 available_filter_functions
-r--r--r--  1 root root 0 Jun 14 11:17 available_tracers
-rw-r--r--  1 root root 0 Jun 14 11:17 buffer_size_kb
-r--r--r--  1 root root 0 Jun 14 11:17 buffer_total_size_kb
-rw-r--r--  1 root root 0 Jun 14 11:17 current_tracer
-r--r--r--  1 root root 0 Jun 14 11:17 dyn_ftrace_total_info
-r--r--r--  1 root root 0 Jun 14 11:17 enabled_functions
drwxr-xr-x 61 root root 0 Jun 14 11:17 events
--w-------  1 root root 0 Jun 14 11:17 free_buffer
drwxr-xr-x  2 root root 0 Jun 14 11:17 instances
-rw-r--r--  1 root root 0 Jun 14 11:17 kprobe_events
-r--r--r--  1 root root 0 Jun 14 11:17 kprobe_profile
-rw-r--r--  1 root root 0 Jun 14 11:17 max_graph_depth
drwxr-xr-x  2 root root 0 Jun 14 11:17 options
drwxr-xr-x 17 root root 0 Jun 14 11:17 per_cpu
-r--r--r--  1 root root 0 Jun 14 11:17 printk_formats
-r--r--r--  1 root root 0 Jun 14 11:17 saved_cmdlines
-rw-r--r--  1 root root 0 Jun 14 11:17 saved_cmdlines_size
-rw-r--r--  1 root root 0 Jun 14 11:17 set_event
-rw-r--r--  1 root root 0 Jun 14 11:17 set_event_pid
-rw-r--r--  1 root root 0 Jun 14 11:17 set_ftrace_filter
-rw-r--r--  1 root root 0 Jun 14 11:17 set_ftrace_notrace
-rw-r--r--  1 root root 0 Jun 14 11:17 set_ftrace_pid
-r--r--r--  1 root root 0 Jun 14 11:17 set_graph_function
-r--r--r--  1 root root 0 Jun 14 11:17 set_graph_notrace
-rw-r--r--  1 root root 0 Jun 14 11:17 snapshot
-rw-r--r--  1 root root 0 Jun 14 11:17 stack_max_size
-r--r--r--  1 root root 0 Jun 14 11:17 stack_trace
-r--r--r--  1 root root 0 Jun 14 11:17 stack_trace_filter
-rw-r--r--  1 root root 0 Jun 14 11:17 trace
-rw-r--r--  1 root root 0 Jun 14 11:17 trace_clock
--w--w----  1 root root 0 Jun 14 11:17 trace_marker
-rw-r--r--  1 root root 0 Jun 14 11:17 trace_options
-r--r--r--  1 root root 0 Jun 14 11:17 trace_pipe
-rw-r--r--  1 root root 0 Jun 14 11:17 tracing_cpumask
-rw-r--r--  1 root root 0 Jun 14 11:17 tracing_max_latency
-rw-r--r--  1 root root 0 Jun 14 11:17 tracing_on
-rw-r--r--  1 root root 0 Jun 14 11:17 tracing_thresh
-rw-r--r--  1 root root 0 Jun 14 11:17 uprobe_events
-r--r--r--  1 root root 0 Jun 14 11:17 uprobe_profile

Mount namespace of the Docker daemon

root@ip-10-116-71-191:/home/admin# DOCKER_PID=$(pidof dockerd)
root@ip-10-116-71-191:/home/admin# ls -l /proc/$DOCKER_PID/root/sys/kernel/debug/tracing
total 0
-r--r--r--  1 root root 0 Jun 14 11:17 available_events
-r--r--r--  1 root root 0 Jun 14 11:17 available_filter_functions
-r--r--r--  1 root root 0 Jun 14 11:17 available_tracers
-rw-r--r--  1 root root 0 Jun 14 11:17 buffer_size_kb
-r--r--r--  1 root root 0 Jun 14 11:17 buffer_total_size_kb
-rw-r--r--  1 root root 0 Jun 14 11:17 current_tracer
-r--r--r--  1 root root 0 Jun 14 11:17 dyn_ftrace_total_info
-r--r--r--  1 root root 0 Jun 14 11:17 enabled_functions
drwxr-xr-x 61 root root 0 Jun 14 11:17 events
--w-------  1 root root 0 Jun 14 11:17 free_buffer
drwxr-xr-x  2 root root 0 Jun 14 11:17 instances
-rw-r--r--  1 root root 0 Jun 14 11:17 kprobe_events
-r--r--r--  1 root root 0 Jun 14 11:17 kprobe_profile
-rw-r--r--  1 root root 0 Jun 14 11:17 max_graph_depth
drwxr-xr-x  2 root root 0 Jun 14 11:17 options
drwxr-xr-x 17 root root 0 Jun 14 11:17 per_cpu
-r--r--r--  1 root root 0 Jun 14 11:17 printk_formats
-r--r--r--  1 root root 0 Jun 14 11:17 README
-r--r--r--  1 root root 0 Jun 14 11:17 saved_cmdlines
-rw-r--r--  1 root root 0 Jun 14 11:17 saved_cmdlines_size
-rw-r--r--  1 root root 0 Jun 14 11:17 set_event
-rw-r--r--  1 root root 0 Jun 14 11:17 set_event_pid
-rw-r--r--  1 root root 0 Jun 14 11:17 set_ftrace_filter
-rw-r--r--  1 root root 0 Jun 14 11:17 set_ftrace_notrace
-rw-r--r--  1 root root 0 Jun 14 11:17 set_ftrace_pid
-r--r--r--  1 root root 0 Jun 14 11:17 set_graph_function
-r--r--r--  1 root root 0 Jun 14 11:17 set_graph_notrace
-rw-r--r--  1 root root 0 Jun 14 11:17 snapshot
-rw-r--r--  1 root root 0 Jun 14 11:17 stack_max_size
-r--r--r--  1 root root 0 Jun 14 11:17 stack_trace
-r--r--r--  1 root root 0 Jun 14 11:17 stack_trace_filter
-rw-r--r--  1 root root 0 Jun 14 11:17 trace
-rw-r--r--  1 root root 0 Jun 14 11:17 trace_clock
--w--w----  1 root root 0 Jun 14 11:17 trace_marker
-rw-r--r--  1 root root 0 Jun 14 11:17 trace_options
-r--r--r--  1 root root 0 Jun 14 11:17 trace_pipe
-rw-r--r--  1 root root 0 Jun 14 11:17 tracing_cpumask
-rw-r--r--  1 root root 0 Jun 14 11:17 tracing_max_latency
-rw-r--r--  1 root root 0 Jun 14 11:17 tracing_on
-rw-r--r--  1 root root 0 Jun 14 11:17 tracing_thresh
-rw-r--r--  1 root root 0 Jun 14 11:17 uprobe_events
-r--r--r--  1 root root 0 Jun 14 11:17 uprobe_profile

root@ip-10-116-71-191:/home/admin# docker run -it ubuntu find /sys/kernel/debug/
/sys/kernel/debug/

root@ip-10-116-71-191:/home/admin#  /etc/init.d/docker restart
[ ok ] Restarting docker (via systemctl): docker.service.

root@ip-10-116-71-191:/home/admin#  docker run -it ubuntu find /sys/kernel/debug/
/sys/kernel/debug/

POD=$(kubectl get pods --namespace=kube-system  -l app=weave-scope -owide | grep 10-116-71-191 | awk '{print $1}')

kubectl --namespace=kube-system logs $POD
time="2017-08-03T22:25:51Z" level=info msg="publishing to: https://cloud.weave.works:443" 
<probe> INFO: 2017/08/03 22:25:51.497560 command line args: --mode=probe --no-app=true --probe.docker=true --probe.docker.bridge=docker0 --probe.kubernetes=true --service-token=<elided>
<probe> INFO: 2017/08/03 22:25:51.497602 probe starting, version 1.6.2, ID 2e69ca37862da606
<probe> WARN: 2017/08/03 22:25:51.563259 Error setting up the eBPF tracker, falling back to proc scanning: cannot open kprobe_events: open /sys/kernel/debug/tracing/kprobe_events: no such file or directory
<probe> INFO: 2017/08/03 22:25:51.563507 kubernetes: targeting api server https://100.64.0.1:443
<probe> WARN: 2017/08/03 22:25:51.712225 Error collecting weave status, backing off 10s: Get http://127.0.0.1:6784/report: dial tcp 127.0.0.1:6784: getsockopt: connection refused
<probe> WARN: 2017/08/03 22:25:51.729564 Cannot resolve 'scope.weave.local.': dial tcp 172.17.0.1:53: getsockopt: connection refused
<probe> ERRO: 2017/08/03 22:25:51.757813 conntrack stderr:NOTICE: Netlink socket buffer size has been set to 8388608 bytes.
<probe> WARN: 2017/08/03 22:25:51.775648 Error collecting weave ps, backing off 20s: exit status 1: "Link not found\n"
<probe> INFO: 2017/08/03 22:25:51.849473 Control connection to cloud.weave.works starting
<probe> INFO: 2017/08/03 22:25:51.899318 Publish loop for cloud.weave.works starting
<probe> ERRO: 2017/08/03 22:25:51.923726 conntrack stderr:NOTICE: Netlink socket buffer size has been set to 8388608 bytes.
<probe> WARN: 2017/08/03 22:26:01.712757 Error collecting weave status, backing off 20s: Get http://127.0.0.1:6784/report: dial tcp 127.0.0.1:6784: getsockopt: connection refused
<probe> WARN: 2017/08/03 22:26:11.814133 Error collecting weave ps, backing off 40s: exit status 1: "Link not found\n"
<probe> WARN: 2017/08/03 22:26:21.713240 Error collecting weave status, backing off 40s: Get http://127.0.0.1:6784/report: dial tcp 127.0.0.1:6784: getsockopt: connection refused
rade commented 7 years ago

Hah! I know what the problem is. We forgot to update the k8s yamls for scope to include the /sys/kernel/debug mount point.

ajohnstone commented 7 years ago

E.g.

$ curl -Ls "https://cloud.weave.works/k8s.yaml?t=$TOKEN&k8s-version=$(kubectl version | base64 | tr -d '\n')" | sed '/serviceAccountName:/d'  > weave.yaml

# Patch with

$ diff weave.yaml.old weave.yaml
--- weave.yaml.old  2017-08-03 23:33:53.904032111 +0100
+++ weave.yaml  2017-08-03 23:35:26.096941060 +0100
@@ -116,6 +116,8 @@
                   mountPath: /var/run/docker.sock
                 - name: scope-plugins
                   mountPath: /var/run/scope/plugins
+                - name: sys-tracing
+                  mountPath: /sys/kernel/debug
           hostNetwork: true
           hostPID: true
           volumes:
@@ -125,6 +127,9 @@
             - name: scope-plugins
               hostPath:
                 path: /var/run/scope/plugins
+            - name: sys-tracing
+              hostPath:
+                path: /sys/kernel/debug
   - apiVersion: v1
     kind: ServiceAccount
     metadata:

$ kubectl apply -n kube-system -f weave.yaml
ajohnstone commented 7 years ago

Can confirm it works with /sys/kernel/debug mounted:

deploy → kubectl --namespace=kube-system logs weave-scope-agent-6n1sl
time="2017-08-03T23:30:17Z" level=info msg="publishing to: https://cloud.weave.works:443" 
<probe> INFO: 2017/08/03 23:30:17.263396 command line args: --mode=probe --no-app=true --probe.docker=true --probe.docker.bridge=docker0 --probe.kubernetes=true --service-token=<elided>
<probe> INFO: 2017/08/03 23:30:17.263439 probe starting, version 1.6.2, ID 7cd8c5e9449294a7
<probe> INFO: 2017/08/03 23:30:17.635207 kubernetes: targeting api server https://100.64.0.1:443
<probe> INFO: 2017/08/03 23:30:17.641013 Control connection to cloud.weave.works starting
<probe> WARN: 2017/08/03 23:30:17.762766 Error collecting weave status, backing off 10s: Get http://127.0.0.1:6784/report: dial tcp 127.0.0.1:6784: getsockopt: connection refused
<probe> WARN: 2017/08/03 23:30:17.773494 Cannot resolve 'scope.weave.local.': dial tcp 172.17.0.1:53: getsockopt: connection refused
<probe> WARN: 2017/08/03 23:30:17.814222 Error collecting weave ps, backing off 20s: exit status 1: "Link not found\n"
<probe> INFO: 2017/08/03 23:30:17.817249 Publish loop for cloud.weave.works starting
<probe> ERRO: 2017/08/03 23:30:17.863976 conntrack stderr:NOTICE: Netlink socket buffer size has been set to 8388608 bytes.
<probe> WARN: 2017/08/03 23:30:27.770931 Error collecting weave status, backing off 20s: Get http://127.0.0.1:6784/report: dial tcp 127.0.0.1:6784: getsockopt: connection refused