wkz / ply

Light-weight Dynamic Tracer for Linux
https://wkz.github.io/ply
GNU General Public License v2.0
399 stars 152 forks source link

errno 22 on kprobe loading but the ply script runs! #31

Open azharivs opened 3 years ago

azharivs commented 3 years ago

This is on a Redhat Enterprise 64 bit system. It is strange that I can run my ply scripts but when specifying the -d option it fails to run the script and the syscall to bpf returns -1 with errno set to 22.

# uname -a
3.10.0-1160.6.1.el7.x86_64 #1 SMP Tue Nov 10 17:23:25 PST 2020 x86_64 x86_64 x86_64 GNU/Linux
# /usr/local/sbin/ply -T
Verifying kernel config (/boot/config-3.10.0-1160.6.1.el7.x86_64)... OK
Ensuring that debugfs is mounted... OK
Verifying kprobe... OK
Verifying tracepoint... OK

# /usr/local/sbin/ply -v
ply 2.1.1-14-ge25c913 (linux-version:199168~3.10.0)

The error message and bpf instructions:

# /usr/local/sbin/ply 'kprobe:SyS_read { @["cnt"] = count();}' -dS

-- globals
unsigned long @{struct :anon_0x17455f0}
kprobe:SyS_read
{}void
    @=void
        []unsigned long
            @unsigned long{struct :anon_0x17455f0}
            :structstruct :anon_0x17455f0
                "cnt"char[8]
        countunsigned long

-- locals
-- ir
;; >pre  {}()
;; >pre  @=()
;; >pre  []()
;; >pre  @()
;; >post @()
;; >pre  :struct()
;; >pre  "cnt"
;; >post "cnt"
  0 stw [bp - 0x8], #0x746e63
  1 stw [bp - 0x4], #0x0
;; >post :struct()
;; >post []()
  2 ldmap   r1, @
  3 ldw r0, #0x0
  4 movq    r2, bp
  5 addq    r2, #-0x8
  6 call    map_lookup_elem
  7 jeq r0, #0x0, +6
  8 movq    r1, bp
  9 addq    r1, #-0x10
 10 movq    r2, #0x8
 11 movq    r3, r0
 12 call    probe_read
 13 ja  +1
L1:
 14 stq [bp - 0x10], #0x0
L2:
;; >pre  count()
;; >post count()
 15 ldq r0, [bp - 0x10]
 16 addq    r0, #0x1
 17 stq [bp - 0x10], r0
;; >post @=()
 18 ldmap   r1, @
 19 ldw r0, #0x0
 20 movq    r2, bp
 21 addq    r2, #-0x8
 22 movq    r3, bp
 23 addq    r3, #-0x10
 24 movq    r4, #0x0
 25 call    map_update_elem
;; >post {}()
 26 exit
error: unable to load kprobe:SyS_read, errno:22
warning: was ply built against the running kernel?
ERR:-22

Despite the error, the ply script runs when -d is not provided!!

# /usr/local/sbin/ply 'kprobe:SyS_read { @["cnt"] = count();}'
ply: active
^Cply: deactivating

@:
{ cnt     }: 460

I have attached a debugger and see that in syscall.c:int bpf_prog_load() the syscall to return syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr)); returns -1 with errno = 22. According to man of bpf this is EINVAL which is due to bad bpf program:

"EINVAL For BPF_PROG_LOAD, indicates an attempt to load an invalid program. eBPF programs can be deemed invalid due to unrecognized instructions, the use of reserved fields, jumps out of range, infinite loops or calls of unknown functions." I could not find such a problem in the dumped bpf program.

Any ideas? Is there a way to get more details form the kernel? The contents of vlog where an empty string despite log_level = 1.

Also for reference here is the output of the following basic kprobe:

# /usr/local/sbin/ply 'kprobe:SyS_read { exit(0);}' -dS

-- globals
int stdbuf{u32}
kprobe:SyS_read
{}void
    bwritevoid
        ctxvoid __bpf *
        stdbufint{u32}
        :structstruct :anon_0x1fa85d0
            0u64
            :structstruct :anon_0x1fa8510
                0int

-- locals
void __bpf *ctx
-- ir
  0 movq    r2, r1
;; >pre  {}()
;; >pre  bwrite()
;; >pre  ctx()
;; >post ctx()
;; >pre  stdbuf()
;; >post stdbuf()
;; >pre  :struct()
  1 stw [bp - 0x4], #0x0
;; >pre  <0>
;; >post <0>
;; >pre  :struct()
;; >pre  <0>
;; >post <0>
;; >post :struct()
  2 stw [bp - 0x8], #0x0
;; >post :struct()
  3 stq [bp - 0x10], #0x0
;; >post bwrite()
  4 movq    r1, r2
  5 ldmap   r2, stdbuf
  6 ldw r0, #0x0
  7 movw    r3, #-0x1
  8 movq    r4, bp
  9 addq    r4, #-0x10
 10 movq    r5, #0x10
 11 call    perf_event_output
;; >post {}()
 12 exit
error: unable to load kprobe:SyS_read, errno:22
warning: was ply built against the running kernel?
ERR:-22
wkz commented 3 years ago

That is indeed very strange. The first thing that sticks out for me is of course the version on that kernel :eyes: ! So I guess Redhat has back-ported most of the known universe in order to make that work?

When you hit a bug with ply's code generator you will typically get lots of output from the kernel verifier. So my guess would be that you are hitting some guard before that is even invoked.

Could it be the case that they have not backported the log buffer stuff? What happens if you always keep attr.log_size and attr.log_level set to 0 in bpf_prog_load?

azharivs commented 3 years ago

I am assuming that the BPF kernel verifier is enabled given the same set of BPF kernel config options. Does it require any special option?