iovisor / bcc

BCC - Tools for BPF-based Linux IO analysis, networking, monitoring, and more
Apache License 2.0
20.57k stars 3.88k forks source link

ISRA breaks kprobes #1754

Open bobrik opened 6 years ago

bobrik commented 6 years ago

GCC can do optimizations and rename functions in the kernel (see stackoverflow question).

$ cat /proc/kallsyms | fgrep do_wakeup
ffffffffa2483420 t ttwu_do_wakeup
$ cat /proc/kallsyms | fgrep do_wakeup
ffff8692fe2a6ff8 t ttwu_do_wakeup.isra.20

We have tools that directly attach to ttwu_do_wakeup kprobe, and ISRA breaks those.

What's the suggested way of working around this?

yonghong-song commented 6 years ago

I compiled the kernel for x86 with -fipa-sra and I did not get .isra.# symbols.

diff --git a/Makefile b/Makefile
index 619a85a..df90968 100644
--- a/Makefile
+++ b/Makefile
@@ -362,7 +362,7 @@ HOST_LFS_LIBS := $(shell getconf LFS_LIBS)

 HOSTCC       = gcc
 HOSTCXX      = g++
-HOSTCFLAGS   := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 \
+HOSTCFLAGS   := -Wall -Wmissing-prototypes -Wstrict-prototypes -O2 -fipa-sra \
                -fomit-frame-pointer -std=gnu89 $(HOST_LFS_CFLAGS)
 HOSTCXXFLAGS := -O2 $(HOST_LFS_CFLAGS)
 HOSTLDFLAGS  := $(HOST_LFS_LDFLAGS)

What is you compilation flags form arm64?

bobrik commented 6 years ago

I have plenty of isra symbols on both arm64 and amd64:

$ uname -m; cat /proc/kallsyms | fgrep isra | wc -l
x86_64
785
$ cat /proc/kallsyms | fgrep isra | wc -l
933

We don't set any special compilation flags during build.

This is happening on both Debian Jessie and Stretch, 4.9 and 4.14 kernels. On Docker for mac I see the same picture. I don't have anything else at hand to check.

brendangregg commented 6 years ago

I have about 950 isra symbols too (both stock Ubuntu kernel 4.4, and my own built kernels on Ubuntu).

Could attach_kprobe() just deal with this? Ie, if you try attaching to "tcp_options" and that symbol isn't there, but "tcp_options.isra.*" is there, then just use that instead.

Looks like something else needs to be fixed with bcc/eBPF as well:

# /usr/share/bcc/tools/funccount tcp_options
No functions matched by pattern ^tcp_options$
# /usr/share/bcc/tools/funccount tcp_options.isra.10
No functions matched by pattern ^tcp_options.isra.10$

Works with ftrace:

# /apps/perf-tools/bin/funccount tcp_options.isra.10
Tracing "tcp_options.isra.10"... Ctrl-C to end.
^C
FUNC                              COUNT
tcp_options.isra.10                   4

Ending tracing...
yonghong-song commented 6 years ago

Somehow, it seems working for me. Latest bcc, llvm & bpf-next:

[yhs@localhost tools]$ cat /proc/kallsyms | grep tcp_options
0000000000000000 t tcp_options.isra.11
0000000000000000 t tcp_options_write
[yhs@localhost tools]$ sudo ./funccount.py tcp_options.isra.11
Tracing 1 functions for "b'tcp_options.isra.11'"... Hit Ctrl-C to end.
^C
FUNC                                    COUNT
Detaching...
[yhs@localhost tools]$

Could you try latest bcc to see whether it still works or not?

bobrik commented 6 years ago

Can it be related to functions being in modules? I've tried a few from modules and each failed, while built-in ones worked, doesn't matter isra or not.

$ cat /proc/kallsyms | grep -E '(ttwu_do_wakeup.isra.|tcp_options.isra).*'
ffff8692fe2a6ff8 t ttwu_do_wakeup.isra.20
ffff851fa2b99b80 t tcp_options.isra.1   [nf_conntrack]
$ sudo /usr/share/bcc/tools/funccount -d 1 tcp_options.isra.1
No functions matched by pattern ^tcp_options.isra.1$
$ sudo /usr/share/bcc/tools/funccount -d 1 ttwu_do_wakeup.isra.20
Tracing 1 functions for "ttwu_do_wakeup.isra.20"... Hit Ctrl-C to end.

FUNC                                    COUNT
ttwu_do_wakeup.isra.20                   1393
Detaching...
yonghong-song commented 6 years ago

@bobrik could you try the latest bcc, it should be able to handle modules.

bobrik commented 6 years ago

@yonghong-song, we're using fe966bbd2ef7bc0325a93762a1332c31c81a2c27, which is 5 days / 4 commits old. Has something changed since then? I don't see any obviously related changes.

yonghong-song commented 6 years ago

@bobrik You should be fine though with the above repo head. Then I do not know why you cannot trace tcp_options.isra.1 even if it is in a module. bcc should support this.

bobrik commented 6 years ago

@yonghong-song I checked with the latest bcc and it works.

I agree with @brendangregg on automatically handling isra. Is this the place where it should happen?

yonghong-song commented 6 years ago

The best place to auto handling isra is probably in python API attach_kprobe and attach_kretprobe. For C++, which is typically for long running programs, we can let the application handling this.

brendangregg commented 5 years ago

FWIW, just tried again on 4.15, and these isra symbols now work with bcc.

bobrik commented 5 years ago

@brendangregg can you elaborate? I'm using bcc 0.7.0 and 4.19.10 and non-isra symbols do not work:

$ uname -r
4.19.10

$ sudo cat /proc/kallsyms | fgrep tcp_ack_update_rtt
ffffffff8f331e40 t tcp_ack_update_rtt.isra.47

$ sudo /usr/share/bcc/tools/funccount tcp_ack_update_rtt
No functions matched by pattern ^tcp_ack_update_rtt$

$ sudo /usr/share/bcc/tools/funccount tcp_ack_update_rtt.isra.47
Tracing 1 functions for "tcp_ack_update_rtt.isra.47"... Hit Ctrl-C to end.
^C
FUNC                                    COUNT
tcp_ack_update_rtt.isra.47              55136
Detaching...
brendangregg commented 5 years ago

Yes, I'm matching the full name including the ".isra.47" like you did, since that's in the symbol table.

Oh, were you expecting "tcp_ack_update_rtt" to automatically match "tcp_ack_update_rtt.isra.47"? Hmm.. maybe that's how it should work, and we should fix it to do that (plus support matching the full .isra symbol as well, for whatever reason).

Anyone know if that would be a bad idea?

ethercflow commented 5 years ago

@bobrik @yonghong-song @brendangregg I've another question about this. The -fipa-sra's explain:

-fipa-sra Perform interprocedural scalar replacement of aggregates, removal of unused parameters and replacement of parameters passed by reference by parameters passed by value. Enabled at levels -O2, -O3 and -Os.

So, I think the function signature will change (Here's an example interprocedural-optimization-in-gcc), I pick the code below:

Original

struct bovid
{
  float red;
  int green;
  void *blue;
};

static void ox(struct bovid *cow)
{
  cow->red = cow->red + cow->green;
}

int main(void)
{
  struct bovid cow;

  cow.red = 7.4;
  cow.green = 6;
  cow.blue = &cow;

  ox(&cow);

  return 0;
}

After enable -fipa-sra, the code changed to:

struct bovid
{
  float red;
  int green;
  void *blue;
};

static void ox(float *t1, int t2)
{
  *t1 = *t1 + t2;
}

int main(void)
{
  struct bovid cow;

  cow.red = 7.4;
  cow.green = 6;
  cow.blue = &cow;

  ox(&cow.red, cow.green);

  return 0;
}

We can see the ox's function signature has been changed, so how to parse correct parameters maybe a difficult thing.

What do you think about it?

eklitzke commented 4 years ago

I just ran into this, and I am a user of the C++ API. It seems like it would be pretty simple to handle this automatically when /proc/kallsyms is available (as @brendangregg already suggested). I am thinking of something like:

I estimate the sorted vector would take up ~3.5MB of memory on my computer. It's probably fine to lazy initialize it, and/or make it opt-in as an optional argument to attach_kprobe(). I will try to find some time to write code for this and submit it as a PR to see if people are interested.

As per the point made above by @ethercflow I agree the change in function signature can be a problem, but this is sort of an edge case and there's nothing that can be done about it.

masibw commented 3 years ago

Hi, I need this feature. It seems to have been implemented, but not merged. Is there any progress?

yonghong-song commented 3 years ago

I cannot have this feature on by default. The .isa version of function may change function parameters and bpf program may then make incorrect assumptions w.r.t. parameters. Unfortunately the dwarf does not encode such optimizations.

masibw commented 3 years ago

Thanks for the reply. It is possible that the function parameters are different, but in that case is it difficult to get an error that the function parameters are different? Or does the verifier check fail? If it does fail, it should work if the parameters do not change, and if the parameters do change, it should give a more appropriate error than cannot attach kprobe, probe entry may not exist.

Or how about creating another attach_kprobe method that searches for isra? (A method that allows the user to tolerate that the parameters of the function may change and not work properly.)

yonghong-song commented 3 years ago

Thanks for the reply. It is possible that the function parameters are different, but in that case is it difficult to get an error that the function parameters are different? Or does the verifier check fail? If it does fail, it should work if the parameters do not change, and if the parameters do change, it should give a more appropriate error than cannot attach kprobe, probe entry may not exist.

No. There is no debuginfo for changed .isra functions, so there will be no errors from verifier. The bpf program assumes the original function signature, which may not be correct.

Or how about creating another attach_kprobe method that searches for isra? (A method that allows the user to tolerate that the parameters of the function may change and not work properly.)

There is no need to have a different attach_kprobe. But we could provide an option to the tool, e.g., --allow-isra. If this option is specified, the isra function can be used. The assumption is people use this option MAY have checked that changed signature matches bpf program. If --allow-isra is not specified, if kprobe attachment failed, the tool can try to detect whether .isra function exists or not and issue a possible suggestion with --allow-isra. What do you think?

masibw commented 3 years ago

No. There is no debuginfo for changed .isra functions, so there will be no errors from verifier. The bpf program assumes the original function signature, which may not be correct.

I understand, thank you.

But we could provide an option to the tool, e.g., --allow-isra.

I think that's a good idea, but I don't seem to be able to programmatically check the kprobe arguments to determine whether to use isra. So it would be better to be able to choose whether to allow isra in the program.

yonghong-song commented 3 years ago

No. There is no debuginfo for changed .isra functions, so there will be no errors from verifier. The bpf program assumes the original function signature, which may not be correct.

I understand, thank you.

But we could provide an option to the tool, e.g., --allow-isra.

I think that's a good idea, but I don't seem to be able to programmatically check the kprobe arguments to determine whether to use isra. So it would be better to be able to choose whether to allow isra in the program.

What do you mean "to choose whether to allow isra in the program"? Do you mean bpf program? I am thinking to use the same bpf program and provides option --allow-isra and assumes people use this option can assert that the function signature does not change. For function signature change cases, yes, we may need a different bpf program and for complex functions there may be more than one way to do isra optimizations. So I am inclined to leave out these signature changing isra functions for now. WDYT?

masibw commented 3 years ago

What do you mean "to choose whether to allow isra in the program"?

I was assuming that the user program would check the kprobe argument, and if it allowed isra, it would attach the same bpf program or give an error. I don't know in detail how to check the signature of kprobe, but depending on your environment, you can check it in /sys/kernel/debug/tracing/events/kprobe, right? Wouldn't the user need to check the source code of the tool to make sure that the signature of kprobe is the same as the original case? I think that would be a bit tricky.

yonghong-song commented 3 years ago

What do you mean "to choose whether to allow isra in the program"?

I was assuming that the user program would check the kprobe argument, and if it allowed isra, it would attach the same bpf program or give an error. I don't know in detail how to check the signature of kprobe, but depending on your environment, you can check it in /sys/kernel/debug/tracing/events/kprobe, right?

/sys/kernel/debug/tracing/events/kprobe is not enough. it just gives you attachment (func_name + optional offset).

Wouldn't the user need to check the source code of the tool to make sure that the signature of kprobe is the same as the original case? I think that would be a bit tricky.

This is the tricky part. The isra may have a different signature than the original function. Just looking at the source code may not be enough. It many cases, you may need to look at compiler transformation details or final assembly code to see what kinds of changes in signature or there is no signature at all.

For each specific tool, we could do analysis for any specific function and use isra properly as in general there won't be a lot of wiggle room of isra optimization for a particular function.

I think llvm can help here. Currently I am working on adding btf_tag support in llvm (https://reviews.llvm.org/D106614). The full support (IR/dwarf/BPF changes) is not merged yet. But for isra function, we could add a btf_tag, e.g., for function "foo", suppose we have a foo.isra, we can have a btf_tag like btf_tag for function "foo", value "foo.isra arg1 arg2 ..." The details need to be designed but you see the idea, we try to encode foo.isra signature to vmlinux.BTF. So if users want to track all instances of "foo" or "foo.isra". They have both function signatures in vmlinux.BTF.

masibw commented 3 years ago

/sys/kernel/debug/tracing/events/kprobe is not enough. it just gives you attachment (func_name + optional offset).

/sys/kernel/debug/tracing/events/kprobe is not enough. it just gives you attachment (func_name + optional offset).

This is the tricky part. The isra may have a different signature than the original function. Just looking at the source code may not be enough. It many cases, you may need to look at compiler transformation details or final assembly code to see what kinds of changes in signature or there is no signature at all.

I'm not familiar with llvm, but it looks good. Until that feature is advanced, will it be difficult for bcc to support isra? --allow-isra is a nice addition, but I don't think it will be used much in practice.

yonghong-song commented 3 years ago

Agree. --allow-isra is hard to use as the user may have a hard time to check whether isra version has the same signature as the original one. So let us wait for llvm support then.

masibw commented 3 years ago

So let us wait for llvm support then.

Yes, thank you for your detailed explanation.