foss-for-synopsys-dwc-arc-processors / linux

Helpful resources for users & developers of Linux kernel for ARC

[GDB][native] Running GDB with a debugee crashes #48

Closed shahab-vahedi closed 3 years ago

shahab-vahedi commented 3 years ago

Running gdb alone is fine.

$ gdb -q
(gdb)

But running it with a debuggee crashes the kernel on the simulator (both QEMU and nSIM):

# gdb -q /bin/ls

Oops
Path: /usr/bin/gdb
CPU: 0 PID: 109 Comm: gdb Not tainted 5.6.0 #3
Insn could not be fetched
ECR: 0x00040000 EFA: 0x00000000 ERET: 0x00000000
STAT32: 0x80080802 [IE K     ]  BTA: 0x202ef1e8
BLK: EV_Trap+0xc4/0xc8
 SP: 0x9f1c7f80  FP: 0x00000028
LPS: 0x202bca74 LPE: 0x202bca92 LPC: 0x00000000
r00: 0xffffff9c r01: 0x5f971718 r02: 0x00000000
r03: 0x00000200 r04: 0x2034c5f4 r05: 0x00000001
r06: 0x2034c0d0 r07: 0x00000044 r08: 0x000001b7
r09: 0x00000000 r10: 0x00000000 r11: 0x202ef1e8
r12: 0x80811e64 r13: 0x80080802 r14: 0x00000000
r15: 0x5f971718 r16: 0xffffff9c r17: 0x000003ff
r18: 0x00000000 r19: 0x202bca74 r20: 0x202bca92
r21: 0x80811e64 r22: 0x202ef1e8 r23: 0x00000000
r24: 0x00000000 r25: 0x000001b7

Stack Trace:

nSIM flags

$ nsimdrv -prop=nsim_isa_family=av2hs                                 \
          -prop=nsim_isa_core=3                                       \
          -prop=chipid=0xffff                                         \
          -prop=nsim_isa_atomic_option=1                              \
          -prop=nsim_isa_ll64_option=1                                \
          -prop=nsim_mmu=4                                            \
          -prop=mmu_pagesize=8192                                     \
          -prop=mmu_super_pagesize=2097152                            \
          -prop=mmu_stlb_entries=16                                   \
          -prop=mmu_ntlb_ways=4                                       \
          -prop=mmu_ntlb_sets=128                                     \
          -prop=icache=32768,64,4,0                                   \
          -prop=dcache=16384,64,2,0                                   \
          -prop=nsim_isa_shift_option=3                               \
          -prop=nsim_isa_swap_option=1                                \
          -prop=nsim_isa_bitscan_option=1                             \
          -prop=nsim_isa_sat=1                                        \
          -prop=nsim_isa_div_rem_option=2                             \
          -prop=nsim_isa_mpy_option=9                                 \
          -prop=nsim_isa_enable_timer_0=1                             \
          -prop=nsim_isa_enable_timer_1=1                             \
          -prop=nsim_isa_number_of_interrupts=32                      \
          -prop=nsim_isa_number_of_external_interrupts=32             \
          -prop=isa_counters=1                                        \
          -prop=nsim_isa_pct_counters=8                               \
          -prop=nsim_isa_pct_size=48                                  \
          -prop=nsim_isa_pct_interrupt=1                              \
          -prop=nsim_mem-dev=uart0,kind=dwuart,base=0xf0000000,irq=24 \
          -prop=nsim_isa_aps_feature=1                                \
          -prop=nsim_isa_num_actionpoints=4                           \
          -prop=nsim_isa_rtc_option=1                                 \
          -prop=nsim_isa_ad_option=1                                  \
          -prop=nsim_isa_fpud_div_option=1                            \
          -prop=nsim_isa_fpu_mac_option=1                             \
          -prop=nsim_fast=1 vmlinux

QEMU flags

$ FTP=hostfwd=tcp::10021-:21
$ SSH=hostfwd=tcp::10022-:22
$ TLN=hostfwd=tcp::10023-:23
$ DBG=hostfwd=tcp::1337-:1337

$ qemu-system-arc -M virt                                            \
                  -nographic                                         \
                  -no-reboot                                         \
                  -cpu archs                                         \
                  -netdev user,id=net0,${FTP},${SSH},${TLN},${DBG}   \
                  -device virtio-net-device,netdev=net0              \
                  -append nokaslr                                    \
                  -kernel vmlinux

environment

toolchain: arc-2021.03 (glibc)

binutils:  arc-2021.03         0d6fe26e921
gcc:       arc-2021.03         72d713caf61
gdb:       arc-2021.03-gdb     3ffb584639c  
newlib:    arc-2021.03         9f32ccbdc
toolchain:                     4c5aa3d
glibc:     arc-glibc-master    9826b03b74
uclibc-ng: upstream            tag v1.0.37
linux: 5.1 from abrodkin's repo (64690b80b  https://github.com/abrodkin/linux.git) with a patch added as attachment at the end of this topic

$ build-all.sh --source-dir    ${SRCDIR}                            \
               --build-dir     ${BLDDIR}                            \
               --install-dir   ${INSTDIR}                           \
               --cpu           archs                                \
               --no-multilib                                        \
               --no-uclibc                                          \
               --glibc                                              \
               --target-cflags "-Og -g3 -fvar-tracking-assignments" \
               --jobs          8                                    \
               --no-pdf                                             \
               --no-auto-checkout                                   \
               --no-auto-pull                                       \
               --no-external-download                               \
               --no-strip                                           \
               --native-gdb                                         \
               --system-expat                                       \
               --no-elf32-strip-target-libs                         \
               --no-native                                          \
               --no-optsize-newlib                                  \
               --no-optsize-libstdc++

tested with a linux image built with the toolchain and glibc
linux:     5.6 release (obtained by buildroot)
busybox: 1.32

vmlinuxaa.zip vmlinuxab.zip vmlinuxac.zip

$ cat vmlinuxa* > vmlinux.tar.xz
$ tar xf vmlinux.tar.xz

abrodkin_linux_patch.zip

abrodkin commented 3 years ago

@shahab-vahedi may we move this one in https://github.com/foss-for-synopsys-dwc-arc-processors/toolchain/issues for better tracking as that's what we already have in existing tools but not in a WIP branch?

vineetgarc commented 3 years ago

So there are 2 problems here:

  1. For some reason userspace passes syscall number 0x000001b7 (439) to the kernel, while the last valid syscall is 438 (NR_syscalls is 439, so the valid syscalls are 0 to 438). The syscall handler has an off-by-one error: it checks > NR_syscalls, but it needs to check >= NR_syscalls (439):

         cmp r8, NR_syscalls
         mov.hi r0, -ENOSYS

     That would fix the kernel crash and kill the offending user process.

  2. The main problem is userspace invoking this syscall, so we need to see the binary (or have a native objdump in the rootfs) to find out why.

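The off-by-one described in the first point can be sketched in a few lines. This is only a Python model, not kernel code: `syscall_table`, `dispatch_buggy` and `dispatch_fixed` are hypothetical names, and the table holds dummy handlers. `ENOSYS` is 38, which matches the `0xffffffda` (-38) constant seen in the handler.

```python
NR_SYSCALLS = 439   # valid syscall numbers are 0 .. 438
ENOSYS = 38         # -38 is 0xffffffda as a 32-bit value

# Hypothetical stand-in for the syscall table: one dummy handler per number.
syscall_table = [lambda n=n: n for n in range(NR_SYSCALLS)]


def dispatch_buggy(nr):
    """Models the faulty check: only nr > NR_SYSCALLS is rejected."""
    if nr > NR_SYSCALLS:
        return -ENOSYS
    return syscall_table[nr]()   # nr == 439 indexes one past the end


def dispatch_fixed(nr):
    """Rejects nr >= NR_SYSCALLS, so every accepted nr is a valid index."""
    if nr >= NR_SYSCALLS:
        return -ENOSYS
    return syscall_table[nr]()


nr = 0x000001b7   # value of r08 in the oops
```

With this model, `dispatch_fixed(nr)` returns `-ENOSYS`, while `dispatch_buggy(nr)` raises `IndexError`. In the kernel there is no such safety net: the handler reads whatever lies past the end of the table and jumps to it.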
vineetgarc commented 3 years ago

@shahab-vahedi can u please upload the rootfs for this issue here.

vineetgarc commented 3 years ago

I tried NFS mounting the host from qemu, but that's being rejected due to invalid port numbers from the ARC side (not sure if this is a qemu problem or something else).

Apr 23 09:06:50 rpc.mountd[1322]: refused mount request from xxx: illegal port 33880

Long story short, I can't objdump the target libs on target or host, and hence need access to them externally (as stated above).

cupertinomiranda commented 3 years ago

> @shahab-vahedi may we move this one in https://github.com/foss-for-synopsys-dwc-arc-processors/toolchain/issues for better tracking as that's what we already have in existing tools but not in a WIP branch?

My personal opinion is that if @shahab-vahedi thinks this is a gdb issue, let it stay under his "umbrella" until that is no longer the case, otherwise everything should be in the toolchain.

cupertinomiranda commented 3 years ago

Sorry, pressed close by mistake.

shahab-vahedi commented 3 years ago

> @shahab-vahedi may we move this one in https://github.com/foss-for-synopsys-dwc-arc-processors/toolchain/issues for better tracking as that's what we already have in existing tools but not in a WIP branch?

> My personal opinion is that if @shahab-vahedi thinks this is a gdb issue, let it stay under his "umbrella" until that is no longer the case, otherwise everything should be in the toolchain.

Thanks @cupertinomiranda . That was exactly my goal. When it is solved, I will move it.

shahab-vahedi commented 3 years ago

Changing the code in linux-5.6/arch/arc/kernel/entry.S:

80811e48:   208c 9dc6               cmp r8,439
80811e4c:   20ca 0f8d ffff ffda     mov.hi  r0,0xffffffda
80811e54:   0010 000d               bhi 16  ;80811e64 <EV_Trap+0xc4>

to

80811e48:   208c 9dc6               cmp r8,439
80811e4c:   20ca 0f8a ffff ffda     mov.ge  r0,0xffffffda
80811e54:   0010 000a               bge 16  ;80811e64 <EV_Trap+0xc4>

Solves the issue. Thank you @vineetgarc for all the support.
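A side note on the choice of condition code, offered as a sketch and not as a statement about the final patch: the original `.hi` is an unsigned comparison, while `.ge` is a signed one. Both reject 439, but they may treat a register value with the top bit set differently. The helper names below are hypothetical and only model the two comparisons on a 32-bit value.

```python
NR_SYSCALLS = 439


def as_signed32(value):
    """Interpret a 32-bit register value as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def rejected_signed(r8):
    """Signed 'greater or equal' test, as a .ge condition would evaluate it."""
    return as_signed32(r8) >= NR_SYSCALLS


def rejected_unsigned(r8):
    """Unsigned 'higher or same' test, as a .hs condition would evaluate it."""
    return (r8 & 0xFFFFFFFF) >= NR_SYSCALLS
```

For `r8 = 439` both functions return `True`. For `r8 = 0xffffffff` only `rejected_unsigned` does, because the signed interpretation of that value is -1.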

vineetgarc commented 3 years ago

Just for completeness, what's going on here is:

  1. There are 2 faccessat syscalls in the Linux kernel.

    #define __NR_faccessat 48     /* legacy */

    #define __NR_faccessat2 439   /* added in v5.8 */

  2. glibc 2.33 is "5.10 kernel capable", meaning it has "fast paths" for 5.10 but falls back if the kernel lacks capabilities (this is generic glibc for all arches). So the glibc faccessat() library call first tries to invoke 439, and if the kernel returns -ENOSYS it falls back to legacy 48.

  3. The ARC Linux (v5.6) used for this exercise supports syscalls 0 to 438, so when it sees syscall 439 it needs to return -ENOSYS so that glibc falls back to the slow path, syscall 48.

  4. But due to the off-by-one kernel bug, it proceeds to handle syscall 439, which causes a null pointer dereference and hence the crash.
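The try-new-then-fall-back sequence in the steps above can be modelled roughly as follows. This is a simplified Python sketch, not glibc's implementation: `make_kernel` and `faccessat` are hypothetical stand-ins, the model kernel simply returns 0 for any syscall number it knows, and the real fallback path also has to emulate the flags in userspace.

```python
ENOSYS = 38
AT_FDCWD = -100          # 0xffffff9c, the value of r00 in the oops
NR_FACCESSAT = 48        # legacy syscall
NR_FACCESSAT2 = 439      # added in v5.8


def make_kernel(nr_syscalls):
    """Hypothetical kernel model with a correct bounds check."""
    def syscall(nr, *args):
        if nr >= nr_syscalls:
            return -ENOSYS
        return 0
    return syscall


def faccessat(syscall, dirfd, path, mode, flags=0):
    """Try faccessat2 first; fall back to legacy faccessat on -ENOSYS."""
    ret = syscall(NR_FACCESSAT2, dirfd, path, mode, flags)
    if ret != -ENOSYS:
        return ret, NR_FACCESSAT2
    return syscall(NR_FACCESSAT, dirfd, path, mode), NR_FACCESSAT


old_kernel = make_kernel(439)   # like v5.6: syscalls 0 .. 438
new_kernel = make_kernel(440)   # hypothetical kernel that knows 439
```

Calling `faccessat(old_kernel, AT_FDCWD, "/bin/ls", 0)` ends up on syscall 48, while the same call against `new_kernel` stays on 439. The fallback only works if the kernel answers -ENOSYS for 439, which is exactly what the off-by-one check prevented.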

vineetgarc commented 3 years ago

Patch posted to lkml, will be sent Linus' way soon! http://lists.infradead.org/pipermail/linux-snps-arc/2021-April/005008.html

vineetgarc commented 3 years ago

@shahab-vahedi is this still an issue?

shahab-vahedi commented 3 years ago

Not from my side. You know when the patch will land, so you are the best judge of when to close this.

vineetgarc commented 3 years ago

Fix merged in v5.13-rc2 3433adc8bd09fc9f29b8baddf33b4ecd1ecd2cdc