Open 201341 opened 5 years ago
One possibility is that the comm
or name
strings has garbage left after \0
terminator, but BPF hash table will still treat them as distinct keys.
The hacking below showed @palmtenor's point above is right.
diff --git a/tools/filetop.py b/tools/filetop.py
index ccc5a107..619b2897 100755
--- a/tools/filetop.py
+++ b/tools/filetop.py
@@ -107,8 +107,26 @@ static int do_entry(struct pt_regs *ctx, struct file *file,
// store counts and sizes by pid & file
struct info_t info = {.pid = pid};
bpf_get_current_comm(&info.comm, sizeof(info.comm));
+ int i;
+ int start = -1;
+ #pragma unroll
+ for (i = 0; i < TASK_COMM_LEN; i++) {
+ if (start > 0)
+ info.comm[i] = 0;
+ else if (info.comm[i] == 0)
+ start = 1;
+ }
+
info.name_len = d_name.len;
bpf_probe_read(&info.name, sizeof(info.name), d_name.name);
+ start = -1;
+ #pragma unroll
+ for (i = 0; i < DNAME_INLINE_LEN; i++) {
+ if (start > 0)
+ info.name[i] = 0;
+ else if (info.name[i] == 0)
+ start = 1;
+ }
if (S_ISREG(mode)) {
info.type = 'R';
} else if (S_ISSOCK(mode)) {
We may need a better kernel or bcc interface to deal with this nicely. Any suggestions?
I used to use bpf_probe_read
's error behavior to serve as a poorman's bzero
...
@palmtenor I forgot how did you exactly do this? if bzero
the whole thing, __builtin_memset
should work. Here, we want to partial bzero
.
Yes, it's just that __builtin_memset
consumes too many instructions for larger buffer. And I think I was actually using bpf_perf_event_read_value
and provide it with an invalid struct size, and the helper would just zero out the buffer...
Thanks for your repley. Is there any suggestion to deal with these? and I find another question, when I use command cat
to read a small file whose size is 9 bit. And the filetop show the r_kb is 128KB
as follow
TID COMM READS WRITES R_Kb W_Kb T FILE
21437 cat 2 0 128 0 R testv.txt
21438 cat 2 0 128 0 R testv.txt
[root@node ~]# stat testv.txt
File: ‘testv.txt’
Size: 9 Blocks: 8 IO Block: 4096 regular file
Device: fd00h/64768d Inode: 70603259 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:admin_home_t:s0
Access: 2019-05-20 14:48:13.231039857 +0800
Modify: 2019-05-20 14:48:02.460952507 +0800
Change: 2019-05-20 14:48:02.460952507 +0800
For the original question, using bpf_probe_read_str()
helper can fix the issue.
-bash-4.4$ git diff
diff --git a/tools/filetop.py b/tools/filetop.py
index ccc5a107..002d0747 100755
--- a/tools/filetop.py
+++ b/tools/filetop.py
@@ -108,7 +108,7 @@ static int do_entry(struct pt_regs *ctx, struct file *file,
struct info_t info = {.pid = pid};
bpf_get_current_comm(&info.comm, sizeof(info.comm));
info.name_len = d_name.len;
- bpf_probe_read(&info.name, sizeof(info.name), d_name.name);
+ bpf_probe_read_str(&info.name, sizeof(info.name), d_name.name);
if (S_ISREG(mode)) {
info.type = 'R';
} else if (S_ISSOCK(mode)) {
-bash-4.4$
But bpf_probe_read_str
is not available in earlier kernel versions. But you can check /proc/kallsyms
for bpf_probe_read_str
for its availability.
Do you mind to submit a pull request for this?
For the other issue related to r_kb is 128KB
. The issue is due to read
system call.
For read system call, ssize_t read(int fd, void *buf, size_t count)
, the count
may not
be the actual count, it is the buffer size to read, using strace cat testv.txt
you probably
will find the count
is 64KB which is actually the buffer size.
To get accurate read size, need to use kretprobe instead of kprobe for vfs_read. Maybe during kprobe, create a hash table with key = task_struct, and during kretprobe, based on the same task struct to fill in read_size? Since you are studying this, could you explore this idea further?
Thanks for your comment. It helps me a lot, I will pull request for this later.
I use the filetop.py to show reads and writes by file with process. But find something strange hanpens. the result is as follow
The line from one to five has same process id, comm, and file. why they print on different lines? Is the api
map.lookup_or_init()
wrong or some reasons?