s117 / anycore-riscv

The AnyCore toolset targetting the RISC-V ISA
Other
0 stars 0 forks source link

RV64 program compiled by gcc-13.2.0 (with glibc2.38) crash if using standard output redirection #21

Closed s117 closed 10 months ago

s117 commented 10 months ago

Consider the sample program below:

#include <stdio.h>

int main()
{
    for (int i = 0; i < 99; i++)
        fprintf(stdout, "hello world - line %d!\n", i);
    return 0;
}

Save it to helloword.c, and compile it with the command:

riscv64-unknown-linux-gnu-gcc -static -Wl,-Ttext-segment,0x10000 -g -O0 -o ./helloworld.rv64 helloworld.c

Then run it with spike build from riscv-isa-sim @ e69ca83 and pk build from riscv-pk @ c2af001. So far so good:

$ spike -m256 pk helloworld.rv64
Requesting target memory 0x10000000
******* Resetting core ********** 
****Initializing the processor system****
******* Resetting core ********** 
******* Resetting core ********** 
****Initialization complete****
hello world - line 0!
hello world - line 1!
hello world - line 2!
hello world - line 3!
hello world - line 4!
hello world - line 5!
hello world - line 6!
hello world - line 7!
hello world - line 8!
hello world - line 9!
hello world - line 10!
hello world - line 11!
hello world - line 12!
hello world - line 13!
hello world - line 14!
hello world - line 15!
hello world - line 16!
hello world - line 17!
hello world - line 18!
hello world - line 19!
hello world - line 20!
hello world - line 21!
hello world - line 22!
hello world - line 23!
hello world - line 24!
hello world - line 25!
hello world - line 26!
hello world - line 27!
hello world - line 28!
hello world - line 29!
hello world - line 30!
hello world - line 31!
hello world - line 32!
hello world - line 33!
hello world - line 34!
hello world - line 35!
hello world - line 36!
hello world - line 37!
hello world - line 38!
hello world - line 39!
hello world - line 40!
hello world - line 41!
hello world - line 42!
hello world - line 43!
hello world - line 44!
hello world - line 45!
hello world - line 46!
hello world - line 47!
hello world - line 48!
hello world - line 49!
hello world - line 50!
hello world - line 51!
hello world - line 52!
hello world - line 53!
hello world - line 54!
hello world - line 55!
hello world - line 56!
hello world - line 57!
hello world - line 58!
hello world - line 59!
hello world - line 60!
hello world - line 61!
hello world - line 62!
hello world - line 63!
hello world - line 64!
hello world - line 65!
hello world - line 66!
hello world - line 67!
hello world - line 68!
hello world - line 69!
hello world - line 70!
hello world - line 71!
hello world - line 72!
hello world - line 73!
hello world - line 74!
hello world - line 75!
hello world - line 76!
hello world - line 77!
hello world - line 78!
hello world - line 79!
hello world - line 80!
hello world - line 81!
hello world - line 82!
hello world - line 83!
hello world - line 84!
hello world - line 85!
hello world - line 86!
hello world - line 87!
hello world - line 88!
hello world - line 89!
hello world - line 90!
hello world - line 91!
hello world - line 92!
hello world - line 93!
hello world - line 94!
hello world - line 95!
hello world - line 96!
hello world - line 97!
hello world - line 98!
******* Resetting core ********** 

However, if you run spike with standard output redirection, the RV64 program will crash:

$ spike -m256 pk helloworld.rv64 | cat
Requesting target memory 0x10000000
******* Resetting core ********** 
****Initializing the processor system****
******* Resetting core ********** 
******* Resetting core ********** 
****Initialization complete****
z  0000000000000000 ra 000000000001b7d4 sp 000000000ff6da80 gp 00000000000f6500
tp 00000000000fb9a0 t0 0000000000000001 t1 0000000200000001 t2 0000000000000000
s0 00000000000fb5f0 s1 0000000000000110 a0 0000000000000007 a1 00000000000f4028
a2 680a21303320656e a3 680a21303320658c a4 0000000000000111 a5 000000000000000f
a6 00000000000fb600 a7 7f2443cebe3590c0 s2 000000006c6c6568 s3 0000000000000004
s4 0000000000000001 s5 0000000000000001 s6 0000000000000000 s7 0000000000000002
s8 0000000000000000 s9 00000000000f40c0 sA 00000000000f65b0 sB 00000000000fa9c8
t3 0000000000000000 t4 0000000000000000 t5 000000000008e07e t6 0000000000010000
******* Resetting core ********** 
sr 00000000000000f9 pc 00000000000182ec va 680a21303320658c insn       ffffffff
User load segfault @ 0x680a21303320658c

$ spike -m256 pk helloworld.rv64 > stdout.txt                    
Requesting target memory 0x10000000
******* Resetting core ********** 
****Initializing the processor system****
******* Resetting core ********** 
******* Resetting core ********** 
****Initialization complete****
******* Resetting core ********** 

$ cat stdout.txt           
z  0000000000000000 ra 000000000001b7d4 sp 000000000ff6da80 gp 00000000000f6500
tp 00000000000fb9a0 t0 0000000000000001 t1 0000000200000001 t2 0000000000000000
s0 00000000000fb5f0 s1 0000000000000110 a0 0000000000000007 a1 00000000000f4028
a2 680a21303320656e a3 680a21303320658c a4 0000000000000111 a5 000000000000000f
a6 00000000000fb600 a7 7f2443cebe3590c0 s2 000000006c6c6568 s3 0000000000000004
s4 0000000000000001 s5 0000000000000001 s6 0000000000000000 s7 0000000000000002
s8 0000000000000000 s9 00000000000f40c0 sA 00000000000f65b0 sB 00000000000fa9c8
t3 0000000000000000 t4 0000000000000000 t5 000000000008e07e t6 0000000000010000
sr 00000000000000f9 pc 00000000000182ec va 680a21303320658c insn       ffffffff
User load segfault @ 0x680a21303320658c
s117 commented 10 months ago

If you examine the invalid memory address that the program is trying to read from (0x680a21303320658c), it is like a part of the output stream:

$ echo '0x680a21303320658c' | tac -r --separator='..' | xxd -r -p | xxd
00000000: 8c65 2033 3021 0a68                      .e 30!.h

Actually, by looking into the dumped execution stream, a3 was calculated from a2 plus an offset:

S/349826 C/349826 I/349786 PC/0x00000000000182d8 (0x00179693) slli    a3, a5, 1
    RS0/a5  0x0000000f
    RD/a3   0x0000001e

S/349827 C/349827 I/349787 PC/0x00000000000182dc (0x00d606b3) add     a3, a2, a3
    RS0/a2  0x680a21303320656e
    RS1/a3  0x0000001e
    RD/a3   0x680a21303320658c

S/349828 C/349828 I/349788 PC/0x00000000000182e0 (0x0006d583) lhu     a1, 0(a3)
    RS0/a3  0x680a21303320658c
    ADDR    0x680a21303320658c
    EXCEPTION   0x0000000000000001
    EVEC    0x0000000000004418
    ECAUSE  0x000000000000000a
    EPC 0x00000000000182e0
    SR  0x000000f9

And the value of a2 is really a part of the output stream: (... li)ne 30!\nh(ello ...)

$ echo '680a21303320656e' | tac -r --separator='..' | xxd -r -p | xxd 
00000000: 6e65 2033 3021 0a68                      ne 30!.h
s117 commented 10 months ago

TL;DR.

The problem is caused by the different behavior of sys_brk(0) in Linux and Proxy Kernel:

Analysis

%a3 is supposed to hold a pointer to tcache->counts[tc_idx], which can be observed from disassembly:

000000000001824c <_int_free>:
...
    if (tcache != NULL && tc_idx < mp_.tcache_bins)
   1828c:   02020793            add a5,tp,32 # 20 <thread_arena>
   18290:   0087b603            ld  a2,8(a5)
   18294:   03913c23            sd  s9,56(sp)
   18298:   03b13423            sd  s11,40(sp)
   1829c:   00050c93            mv  s9,a0
   182a0:   40060063            beqz    a2,186a0 <_int_free+0x454>
   182a4:   000dc597            auipc   a1,0xdc
   182a8:   d8458593            add a1,a1,-636 # f4028 <mp_>
   182ac:   0685b503            ld  a0,104(a1)
    size_t tc_idx = csize2tidx (size);
   182b0:   fef48793            add a5,s1,-17
   182b4:   0047d793            srl a5,a5,0x4
    if (__glibc_unlikely (e->key == tcache_key))
   182b8:   000e2d97            auipc   s11,0xe2
   182bc:   710d8d93            add s11,s11,1808 # fa9c8 <perturb_byte>
    if (tcache != NULL && tc_idx < mp_.tcache_bins)
   182c0:   02a7f463            bgeu    a5,a0,182e8 <_int_free+0x9c>
    if (__glibc_unlikely (e->key == tcache_key))
   182c4:   008db883            ld  a7,8(s11)
   182c8:   01843303            ld  t1,24(s0)
        if (cnt >= mp_.tcache_count)
   182cc:   0785b503            ld  a0,120(a1)
    tcache_entry *e = (tcache_entry *) chunk2mem (p);
   182d0:   01040813            add a6,s0,16
    if (__glibc_unlikely (e->key == tcache_key))
   182d4:   73130863            beq t1,a7,18a04 <_int_free+0x7b8>
    if (tcache->counts[tc_idx] < mp_.tcache_count)
   182d8:   00179693            sll a3,a5,0x1
   182dc:   00d606b3            add a3,a2,a3
   182e0:   0006d583            lhu a1,0(a3)
   182e4:   42a5e263            bltu    a1,a0,18708 <_int_free+0x4bc>
...

In the source, tcache is a thread-local variable defined as:static __thread tcache_perthread_struct *tcache = NULL;. The computation chain of its address also indicates it is a thread-local thing: it is loaded from a location pointed by the %tp register (thread pointer pointing to the TLS block):

S/349807 C/349807 I/349767 PC/0x000000000001828c (0x02020793) addi    a5, tp, 32
    RS0/tp  0x000fb9a0
    RD/a5   0x000fb9c0

S/349808 C/349808 I/349768 PC/0x0000000000018290 (0x0087b603) ld      a2, 8(a5)
    RS0/a5  0x000fb9c0
    RD/a2   0x680a21303320656e
    ADDR    0x000fb9c8

...

S/349827 C/349827 I/349787 PC/0x00000000000182dc (0x00d606b3) add     a3, a2, a3
    RS0/a2  0x680a21303320656e
    RS1/a3  0x0000001e
    RD/a3   0x680a21303320658c

S/349828 C/349828 I/349788 PC/0x00000000000182e0 (0x0006d583) lhu     a1, 0(a3)
    RS0/a3  0x680a21303320658c
    ADDR    0x680a21303320658c
    EXCEPTION   0x0000000000000001
    EVEC    0x0000000000004418
    ECAUSE  0x000000000000000a
    EPC 0x00000000000182e0
    SR  0x000000f9

So, definitely, the pointer value stored in the TLS block has been clobbered. To find who messed up this region, the trace was searched in the reserve order starting from the offending instruction (lhu a1,0(a3) @ 0x182e0) for all the memory accesses referencing the impacted region, i.e., the range [0x000fb9c8, 0x000fb9dd0). See below, I/<seq> is the sequence number of an instruction in the dynamic instruction stream:

S/349808 C/349808 I/349768 PC/0x0000000000018290 (0x0087b603) ld      a2, 8(a5)
    RS0/a5  0x000fb9c0
    RD/a2   0x680a21303320656e
    ADDR    0x000fb9c8
S/349783 C/349783 I/349743 PC/0x000000000001b788 (0x00873683) ld      a3, 8(a4)
    RS0/a4  0x000fb9c0
    RD/a3   0x680a21303320656e
    ADDR    0x000fb9c8
S/300383 C/300383 I/300343 PC/0x000000000001d148 (0xfed70fa3) sb      a3, -1(a4)
    RS0/a4  0x000fb9d0
    RS1/a3  0x00000068
    ADDR    0x000fb9cf
S/300065 C/300065 I/300025 PC/0x000000000001d1a4 (0x00c68023) sb      a2, 0(a3)
    RS0/a3  0x000fb9ce
    RS1/a2  0x0000000a
    ADDR    0x000fb9ce
S/300060 C/300060 I/300020 PC/0x000000000001d1a4 (0x00c68023) sb      a2, 0(a3)
    RS0/a3  0x000fb9cd
    RS1/a2  0x00000021
    ADDR    0x000fb9cd
S/299946 C/299946 I/299906 PC/0x000000000001d1a4 (0x00c68023) sb      a2, 0(a3)
    RS0/a3  0x000fb9cc
    RS1/a2  0x00000030
    ADDR    0x000fb9cc
S/299941 C/299941 I/299901 PC/0x000000000001d1a4 (0x00c68023) sb      a2, 0(a3)
    RS0/a3  0x000fb9cb
    RS1/a2  0x00000033
    ADDR    0x000fb9cb
S/299730 C/299730 I/299690 PC/0x000000000001d1a4 (0x00c68023) sb      a2, 0(a3)
    RS0/a3  0x000fb9ca
    RS1/a2  0x00000020
    ADDR    0x000fb9ca
S/299725 C/299725 I/299685 PC/0x000000000001d1a4 (0x00c68023) sb      a2, 0(a3)
    RS0/a3  0x000fb9c9
    RS1/a2  0x00000065
    ADDR    0x000fb9c9
S/299720 C/299720 I/299680 PC/0x000000000001d1a4 (0x00c68023) sb      a2, 0(a3)
    RS0/a3  0x000fb9c8
    RS1/a2  0x0000006e
    ADDR    0x000fb9c8
S/273680 C/273680 I/273641 PC/0x0000000000019770 (0x0087b783) ld      a5, 8(a5)
    RS0/a5  0x000fb9c0
    RD/a5   0x000fb250
    ADDR    0x000fb9c8
S/273632 C/273632 I/273593 PC/0x000000000001b07c (0x00843783) ld      a5, 8(s0)
    RS0/s0  0x000fb9c0
    RD/a5   0x000fb250
    ADDR    0x000fb9c8
...

So before I/273641 PC/0x0000000000019770 the tcache pointer kept in the TLS block still looks valid. The culprit is a bunch of byte store instructions with PC 0x1d1a4 and 0x1d148. The reconstructed call graph shows those instructions belong to the memcpy function called by fprintf, which was called by main to print lines to the standard output:

image

The call graph and instruction trace reveals the IO output buffer (for stdout and used by the fprintf) is [0x000fb710, 0x000fc710), which overlaps the region [0x000fb9c8, 0x000fb9dd0) for keeping the tcache pointer in the TLS block (related files:glibc/stdio-common/fprintf.c:32, glibc/stdio-common/vfprintf-internal.c:1522, glibc/stdio-common/printf_buffer_to_file.c:30, glibc/stdio-common/vfprintf-internal.c:1523). And this is why the tcache pointer stored in the TLB block was replaced by the strings to be printed to the stdout.

To dig out why the IO output buffer overlaps the TLS region, we need to locate the codes allocating the IO buffer and TLS block. Searching the first occurrence of 0x000fb710 (IO buffer for stdout) and 0x000fb9a0 (the TLS block pointed by register %tp) reveals the origin of those addresses:

image

image

Now, the real question is why _dl_early_allocate and malloc call will hand out memory regions that overlap with each other.

What happened here is _dl_early_allocate was called during the glibc start-up phase to allocate space for the TLS block. At this moment, __curbrk (a global variable used by glibc to track the current brk address of the main heap) has not been initialized yet. _dl_early_allocate in this case will not initialize __curbrk and will let the first call to brk in the future take care of the current brk variable initialization. It will just issue raw sys_brk(0) system call first to obtain the current brk address, then add it with the requested memory size, and finally make another sys_brk call to allocate the requested amount of space:

_dl_early_allocate (size_t size)
{
...
  if (__curbrk != NULL)
    ...
  else
    {
      /* If brk has not been invoked, there is no need to update
         __curbrk.  The first call to brk will take care of that.  */
      void *previous = __brk_call (0);
      result = __brk_call (previous + size);
      if (result == previous)
        result = NULL;
      else
        result = previous;
    }
...
  return result;
}

Later, when the glibc start-up routine begins to initialize the internal of malloc, __sbrk will be called the first time and initialize __curbrk.

image

This is done by by calling __brk function with argument 0. __brk is a glibc wrapper function for sys_brk system call. It updates __curbrk variable with the return value of sys_brk. Here comes the problem:

The previous toolchain port (GCC 9.2.0/glibc 2.29) did not experience a similar crash because this version of _dl_early_allocate function was implemented in 2022, and glibc 2.29 in the previous port will just initialize the TLS block with a direct call to __sbrk (which initializes __curbrk immediately).

Fix

One possible fix is making the proxy kernel's sys_brk behave like the Linux version, at least when the new brk is 0x0 it should do nothing and only return the current brk. Patch 5a31996 implemented this behavior and has been tested effective.

s117 commented 10 months ago

Fixed with 782daf4.