DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform
Other
2.61k stars 554 forks source link

32-bit xsave for signal frame update targets the wrong address and clobbers fpu state #3256

Open hgreving2304 opened 5 years ago

hgreving2304 commented 5 years ago

I am getting the following test failures on glinux for -m32, both for debug:

The following tests FAILED:
         74 - code_api|linux.bad-signal-stack (Failed)
        159 - code_api|client.signal (Failed)
        299 - code_api|api.static_sideline_FLAKY (Failed)
Errors while running CTest

... and the following ones for release:

The following tests FAILED:
          1 - unit_tests (Failed)
         38 - code_api|linux.signal0000 (Failed)
         39 - code_api|linux.signal0001 (Failed)
         40 - code_api|linux.signal0010 (Failed)
         41 - code_api|linux.signal0011 (Failed)
         42 - code_api|linux.signal0100 (Failed)
         43 - code_api|linux.signal0101 (Failed)
         44 - code_api|linux.signal0110 (Failed)
         45 - code_api|linux.signal0111 (Failed)
         46 - code_api|linux.signal1000 (Failed)
         47 - code_api|linux.signal1001 (Failed)
         48 - code_api|linux.signal1010 (Failed)
         49 - code_api|linux.signal1011 (Failed)
         50 - code_api|linux.signal1100 (Failed)
         51 - code_api|linux.signal1101 (Failed)
         52 - code_api|linux.signal1110 (Failed)
         53 - code_api|linux.signal1111 (Failed)
         54 - code_api|linux.sigplain000 (Failed)
         55 - code_api|linux.sigplain001 (Failed)
         56 - code_api|linux.sigplain010 (Failed)
         57 - code_api|linux.sigplain011 (Failed)
         58 - code_api|linux.sigplain100 (Failed)
         59 - code_api|linux.sigplain101 (Failed)
         60 - code_api|linux.sigplain110 (Failed)
         61 - code_api|linux.sigplain111 (Failed)
         89 - code_api|pthreads.ptsig (Failed)
        158 - code_api|client.signal (Failed)
        296 - code_api|api.static_sideline_FLAKY (Failed)
Errors while running CTest
hgreving2304 commented 5 years ago

client.signal also occasionally fails on Travis: https://travis-ci.com/DynamoRIO/dynamorio/jobs/188596852

<Application /home/travis/build/DynamoRIO/dynamorio/build_debug-internal-64/suite/tests/bin/client.signal (20888).  Internal Error: DynamoRIO debug check failure: /home/travis/build/DynamoRIO/dynamorio/core/utils.c:588 lock != &thread_initexit_lock || !is_self_couldbelinking()
(Error occurred @0 frags)
hgreving2304 commented 5 years ago

client.signal again: https://travis-ci.com/DynamoRIO/dynamorio/jobs/188867936

hgreving2304 commented 5 years ago

Recent client.signal failure on Travis: https://travis-ci.com/DynamoRIO/dynamorio/jobs/191940151

Carrotman42 commented 5 years ago

static_sideline again: http://dynamorio.org/CDash/testDetails.php?test=450996&build=43521

derekbruening commented 5 years ago

Why were unrelated Travis failures piled onto this issue? I am seeing signalNNNN 32-bit failures on a recent Linux distro and looked at them closely and they are from xsave problems. They have nothing to do with asserts like https://github.com/DynamoRIO/dynamorio/issues/3256#issuecomment-477744073 above nor with flakiness on Travis. I split the Travis failures off as #3813 . static_sideline is #3769 .

Please stop pasting Travis failures here!

derekbruening commented 5 years ago

The symptoms are fpu corruption after a signal. E.g., signal0000 prints out "nan" instead of "250006.749760".

We can see that everything looks like it's been shifted, like the xsave was done at the wrong spot:

execute_handler_from_dispatch for signal 11
original sigcontext 0x467a40e4:
<...>
        cw=0xffff037f
        sw=0xffff0000
        tag=0xffffffff
        ipoff=0x467458fd
        cssel=0x00000023
        dataoff=0x00000000
        datasel=0xffff002b
        st0 = 0000 0000 0000 0000 ^ 0000
        st1 = 0000 0000 0000 0000 ^ 0000
        st2 = 0000 0000 0000 0000 ^ 0000
        st3 = 0000 0000 0000 0000 ^ 0000
        st4 = 0000 0000 0000 0000 ^ 0000
        st5 = 0000 0000 0000 0000 ^ 0000
        st6 = 0000 0000 0000 0000 ^ 0000
        st7 = 9000 2be5 b601 d8e5 ^ 3ffe
        status=0x0000
        magic=0x0000
        fxsr_env[0] = 0x0000037f
        fxsr_env[1] = 0x00000000
        fxsr_env[2] = 0x467458fd
        fxsr_env[3] = 0x00000000
        fxsr_env[4] = 0x00000000
        fxsr_env[5] = 0x00000000
        mxcsr=0x00001f80
        reserved=0x0000ffff
        fxsr_st0 = 0000 0000 0000 0000 ^ 0000
        fxsr_st1 = 0000 0000 0000 0000 ^ 0000
        fxsr_st2 = 0000 0000 0000 0000 ^ 0000
        fxsr_st3 = 0000 0000 0000 0000 ^ 0000
        fxsr_st4 = 0000 0000 0000 0000 ^ 0000
        fxsr_st5 = 0000 0000 0000 0000 ^ 0000
        fxsr_st6 = 0000 0000 0000 0000 ^ 0000
        fxsr_st7 = 9000 2be5 b601 d8e5 ^ 3ffe
<...>
save_fpstate
ptr=0x467a4180
        temp=0x46794be0
new sigcontext 0x467a40e4:
<...>
        cw=0x0000037f
        sw=0x00000000
        tag=0x467458fd
        ipoff=0x00000000
        cssel=0x00000000
        dataoff=0x00000000
        datasel=0x00000000
        st0 = 0000 0000 0000 0000 ^ 0000
        st1 = 0000 0000 0000 0000 ^ 0000
        st2 = 0000 0000 0000 0000 ^ 0000
        st3 = 0000 0000 0000 0000 ^ 0000
        st4 = 0000 0000 0000 0000 ^ 0000
        st5 = 0000 0000 0000 0000 ^ 0000
        st6 = 0000 0000 0000 0000 ^ 0000
        st7 = 0000 0000 0000 0000 ^ 0000
        status=0x0000
        magic=0x0000
        fxsr_env[0] = 0x00000000
        fxsr_env[1] = 0x00000000
        fxsr_env[2] = 0x00000000
        fxsr_env[3] = 0x00000000
        fxsr_env[4] = 0x00000000
        fxsr_env[5] = 0x00000000
        mxcsr=0x00000000
        reserved=0x00000000
        fxsr_st0 = 9000 2be5 b601 d8e5 ^ 3ffe
        fxsr_st1 = 0000 0000 0000 0000 ^ 0000
        fxsr_st2 = 0000 0000 0000 0000 ^ 0000
        fxsr_st3 = 0000 0000 0000 0000 ^ 0000
        fxsr_st4 = 0000 0000 0000 0000 ^ 0000
        fxsr_st5 = 0000 0000 0000 0000 ^ 0000
        fxsr_st6 = 0000 0000 0000 0000 ^ 0000
        fxsr_st7 = 9000 2be5 b601 d8e5 ^ 3ffe

Xref #3812 investigations where we confirmed that for 32-bit the kernel has a prefixed fsave 0x70-byte region before an xsave region. DR already incorporates that into kernel_fpstate_t for 32-bit.

Xsave wants 64-byte alignment which is 0x40. Clearly the fpstate in a native signal frame is not aligned to that. Adding the 0x70 of the fsave prefix does get there:

0xffffc858  0xf7fd4089  __kernel_vsyscall + 9 in section .text of system-supplied DSO at 0xf7fd3000
0xffffc85c  0x00000023  No symbol matches (void *)$retaddr.
0xffffc860  0x00000282  No symbol matches (void *)$retaddr.
0xffffc864  0xffffcd5c  No symbol matches (void *)$retaddr.
0xffffc868  0x0000002b  No symbol matches (void *)$retaddr.
0xffffc86c  0xffffc890  No symbol matches (void *)$retaddr.

(gdb) p /x 0xffffc890+0x70
$1 = 0xffffc900

Yet DR aligns the start of kernel_fpstate_t and does the xsave there, incorrectly.

Note that in save_fpstate() DR is already doing "fxsave" and then convert_fxsave_to_fpstate() copies its fields in the prepended fsave region.

save_fpstate in execute_handler_from_dispatch is run on sigpending_t.xstate which is aligned.

The reason we never saw this problem before was that xstate_has_extra_fields was not set before, so xsave was never run. It's only set on newer machines.

We need to target the 0x70 offset. We need to align that portion in sigpending (or copy to a temp and then to the frame) and when we set the frame up on the stack: we do not want to align the actual start (fsave has no alignment restrictions right, if we or the kernel ever did that?).

derekbruening commented 5 years ago

Pasting from PR #3814: "For the AVX512F parts of the [linux.sigcontext] test, there are some ifdef X64 that excludes 32-bit, it doesn't print and test the state." We should see whether fixing this also resolves 32-bit issues with AVX-512 (#1312) and can let us enable more tests.