DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform
Other
2.65k stars 560 forks source link

HANG DrMem nudge test: private msvcrt entry crashes on thread exit invocation #1457

Closed derekbruening closed 9 years ago

derekbruening commented 9 years ago

From bruen...@google.com on June 13, 2014 14:02:14

The nudge test is hanging in release build only. This may be related to the FLS isolation, recently committed? But the test worked in that pre-commit suite, and it works on the bots. For me locally it is slightly non-det but release usually hangs and debug never does. Seems to be a DR bug even though this is a DrMem test.

The 2nd nudge hangs

% bin/drmemory.exe -dr d:/derek/dr/git/exports -batch -- tests/infloop.exe % bin/drmemory.exe -nudge drview -exe infloop.exe | awk '{print $2}' | sed 's/,//'

% bin/drmemory.exe -dr d:/derek/dr/git/exports -batch -- tests/infloop.exe Dr.M Dr. Memory version 1.6.1943 Dr.M Running "tests/infloop.exe" starting 8088 8088 Error #1: LEAK 42 direct bytes 0x01145608-0x01145632 + 17 indirect bytes 8088 # 0 replace_malloc [d:\derek\drmemory\git\src\common\alloc_replace.c:2333] 8088 # 1 foo [d:\derek\drmemory\git\src\tests\infloop.c:81] 8088 # 2 main [d:\derek\drmemory\git\src\tests\infloop.c:119] 8088 8088 Error #2: LEAK 160 direct bytes 0x011456f0-0x01145790 + 0 indirect bytes 8088 # 0 replace_malloc [d:\derek\drmemory\git\src\common\alloc_replace.c:2333] 8088 # 1 foo [d:\derek\drmemory\git\src\tests\infloop.c:92] 8088 # 2 main [d:\derek\drmemory\git\src\tests\infloop.c:119] 8088 8088 ERRORS FOUND: 8088 0 unique, 0 total unaddressable access(es) 8088 0 unique, 0 total uninitialized access(es) 8088 0 unique, 0 total invalid heap argument(s) 8088 0 unique, 0 total GDI usage error(s) 8088 0 unique, 0 total handle leak(s) 8088 0 unique, 0 total warning(s) 8088 2 unique, 21 total, 3259 byte(s) of leak(s) 8088 0 unique, 0 total, 0 byte(s) of possible leak(s) 8088 ERRORS IGNORED: 8088 1 potential error(s) (suspected false positives) 8088 (details: D:\derek\drmemory\git\build_x86_rel\logs\DrMemory-infloop.exe.9784.000\potential_errors.txt) 8088 3 unique, 3 total, 1115 byte(s) of still-reachable allocation(s) 8088 (re-run with "-show_reachable" for details) 8088 Details: D:\derek\drmemory\git\build_x86_rel\logs\DrMemory-infloop.exe.9784.000\results.txt

0:000> ~*kn

. 0 Id: 2638.53c Suspend: 1 Teb: 7efdd000 Unfrozen

ChildEBP RetAddr

WARNING: Frame IP not in any known module. Following frames may be wrong. 00 003bfe38 00801927 0x180e2adc 01 003bfe80 7636338a infloop!tmainCRTStartup+0x10b [f:\dd\vctools\crt_bld\self_x86\crt\src\crt0.c @ 278] 02 003bfe8c 77b59f72 kernel32!BaseThreadInitThunk+0xe 03 003bfecc 77b59f45 ntdll!RtlUserThreadStart+0x70 04 003bfee4 00000000 ntdll!_RtlUserThreadStart+0x1b

1 Id: 2638.1d68 Suspend: 1 Teb: 7efda000 Unfrozen

ChildEBP RetAddr

00 0026d844 77b58e44 ntdll!ZwWaitForSingleObject+0x15 01 0026d8a8 77b58d28 ntdll!RtlpWaitOnCriticalSection+0x13e 02 0026d8d0 774aa46a ntdll!RtlEnterCriticalSection+0x150 03 0026d8e0 774aaf5b msvcrt!_lock+0x30 04 0026d914 774aa501 msvcrt!_initptd+0x5b 05 0026d92c 774aa48c msvcrt!_core_crt_dll_init+0x24f 06 0026d938 683fe6f4 msvcrt!_CRTDLL_INIT+0x1b 07 0026d978 683a12f2 dynamorio!privload_call_entry+0xb4 [d:\derek\dr\git\src\core\win32\loader.c @ 1176] 08 0026d98c 6836f07e dynamorio!loader_thread_init+0x52 [d:\derek\dr\git\src\core\loader_shared.c @ 198] 09 0026d9c0 683e9ea2 dynamorio!dynamo_thread_init+0x1ee [d:\derek\dr\git\src\core\dynamo.c @ 2155] 0a 0026db48 683eb4f7 dynamorio!intercept_new_thread+0x82 [d:\derek\dr\git\src\core\win32\callback.c @ 3008] 0b 0026db58 6846a8a6 dynamorio!intercept_ldr_init+0x87 [d:\derek\dr\git\src\core\win32\callback.c @ 3258] 0c 0026db60 00000000 dynamorio!interception_code_array+0x8a6

2 Id: 2638.1558 Suspend: 1 Teb: 7efd7000 Unfrozen

ChildEBP RetAddr

00 016bf1b8 683f1232 ntdll!ZwWaitForSingleObject+0x15 01 016bf1c8 683e0225 dynamorio!nt_wait_event_with_timeout+0x12 [d:\derek\dr\git\src\core\win32\ntdll.c @ 3749] 02 016bf1f0 683e0435 dynamorio!os_wait_event+0x305 [d:\derek\dr\git\src\core\win32\os.c @ 8008] 03 016bf204 68386aba dynamorio!mutex_wait_contended_lock+0x95 [d:\derek\dr\git\src\core\win32\os.c @ 8056] 04 016bf210 683a12d4 dynamorio!acquire_recursive_lock+0x1a [d:\derek\dr\git\src\core\utils.c @ 994] 05 016bf224 6836f07e dynamorio!loader_thread_init+0x34 [d:\derek\dr\git\src\core\loader_shared.c @ 196] 06 016bf258 683e9ea2 dynamorio!dynamo_thread_init+0x1ee [d:\derek\dr\git\src\core\dynamo.c @ 2155] 07 016bf3e0 683eb4f7 dynamorio!intercept_new_thread+0x82 [d:\derek\dr\git\src\core\win32\callback.c @ 3008] 08 016bf3f0 6846a8a6 dynamorio!intercept_ldr_init+0x87 [d:\derek\dr\git\src\core\win32\callback.c @ 3258] 09 00000000 00000000 dynamorio!interception_code_array+0x8a6 0:000> ~1s eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=77540478 edi=00000000 eip=77b3f8d1 esp=0026d844 ebp=0026d8a8 iopl=0 nv up ei pl zr na pe nc cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246 ntdll!ZwWaitForSingleObject+0x15: 77b3f8d1 83c404 add esp,4 0:001> kb ChildEBP RetAddr Args to Child
0026d844 77b58e44 000000a4 00000000 00000000 ntdll!ZwWaitForSingleObject+0x15 0026d8a8 77b58d28 00000000 00000000 774aa472 ntdll!RtlpWaitOnCriticalSection+0x13e 0026d8d0 774aa46a 77540478 1875d51c 0026d914 ntdll!RtlEnterCriticalSection+0x150 0026d8e0 774aaf5b 0000000c c8b12bc8 00000000 msvcrt!_lock+0x30 0026d914 774aa501 1875d51c 00000000 18af0600 msvcrt!_initptd+0x5b 0026d92c 774aa48c 00000000 1809b500 683fe6f4 msvcrt!_core_crt_dll_init+0x24f 0026d938 683fe6f4 774a0000 00000002 00000000 msvcrt!_CRTDLL_INIT+0x1b 0026d978 683a12f2 1809b500 00000002 0026dcc4 dynamorio!privload_call_entry+0xb4 [d:\derek\dr\git\src\core\win32\loader.c @ 1176] 0026d98c 6836f07e 18af0600 18af0600 00000000 dynamorio!loader_thread_init+0x52 [d:\derek\dr\git\src\core\loader_shared.c @ 198] 0026d9c0 683e9ea2 00000000 0026da0c 00000000 dynamorio!dynamo_thread_init+0x1ee [d:\derek\dr\git\src\core\dynamo.c @ 2155] 0026db48 683eb4f7 0026dcc4 00000001 0000005c dynamorio!intercept_new_thread+0x82 [d:\derek\dr\git\src\core\win32\callback.c @ 3008] 0026db58 6846a8a6 0026db64 00000000 6846a903 dynamorio!intercept_ldr_init+0x87 [d:\derek\dr\git\src\core\win32\callback.c @ 3258] 0026db60 00000000 6846a903 00000001 0000005c dynamorio!interception_code_array+0x8a6 0:001> dds 0026d8d0 0026d8d0 0026d8e0 0026d8d4 774aa46a msvcrt!_lock+0x30 0026d8d8 77540478 msvcrt!_CrtLock_Setlocale 0026d8dc 1875d51c 0026d8e0 0026d914 0026d8e4 774aaf5b msvcrt!_initptd+0x5b 0:001> dt RTL_CRITICAL_SECTION 77540478 infloop!RTL_CRITICAL_SECTION +0x000 DebugInfo : 0x1809c238 _RTL_CRITICAL_SECTION_DEBUG +0x004 LockCount : 0n-6 +0x008 RecursionCount : 0n1 +0x00c OwningThread : 0x00001f98 Void +0x010 LockSemaphore : 0x000000a4 Void +0x014 SpinCount : 0xfa0 0:001> ~

0 Id: 2638.53c Suspend: 1 Teb: 7efdd000 Unfrozen

. 1 Id: 2638.1d68 Suspend: 1 Teb: 7efda000 Unfrozen 2 Id: 2638.1558 Suspend: 1 Teb: 7efd7000 Unfrozen 0:001> ? 0n8088 Evaluate expression: 8088 = 00001f98

So the 1st nudge thread exited while still holding msvcrt!_CrtLock_Setlocale

It doesn't repro every time, but I did get it w/ some info:

pre-entry 2: msvcrt!_CrtLock_Setlocale->OwningThread==0 post-entry 2: msvcrt!_CrtLock_Setlocale->OwningThread==0 10096 ... 10096 Details: D:\derek\drmemory\git\build_x86_rel\logs\DrMemory-infloop.exe.5844.000\results.txt pre-entry 3: msvcrt!_CrtLock_Setlocale->OwningThread==0 privload_call_entry: msvcrt.dll entry routine crashed! post-entry 3: msvcrt!_CrtLock_Setlocale->OwningThread==10096

define DLL_PROCESS_ATTACH 1

define DLL_THREAD_ATTACH 2

define DLL_THREAD_DETACH 3

define DLL_PROCESS_DETACH 0

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=1457

derekbruening commented 9 years ago

From bruen...@google.com on June 13, 2014 15:02:47

tidtable.c: _mlock(_SETLOCALE_LOCK);

    __try {
        if ( (ptloci = ptd->ptlocinfo) != NULL )
        {
            __removelocaleref(ptloci);
            if ( (ptloci != __ptlocinfo) &&
                 (ptloci != &__initiallocinfo) &&
                 (ptloci->refcount == 0) )
                __freetlocinfo(ptloci);
        }
    }
    __finally {
        _munlock(_SETLOCALE_LOCK);
    }

It's in a try, but it's also in a DR TRY, and we short-circuit that priv lib try. So that's one bug: we don't let a priv lib handle its own fault.

But why is ptloci 1:

1abdeea4 774aa7db msvcrt!__removelocaleref+0x1c 1abdeea8 00000001 ... 1abdeebc 774aa922 msvcrt!_freefls+0xfb 1abdeec0 00000001 ... 1abdeef0 774aa9e2 msvcrt!_freeptd+0x35 1abdeef4 683e1b78 dynamorio!null_fragment

So _freeptd got null_fragment here: ptd = crtFlsGetValue(flsindex); 0:001> dd msvcrt!__flsindex 77540628 00000000 00000000 77542bd0 775426e0

Natively msvcrt!__flsindex starts out as -1 and then is 1. Is FLS slot 0 reserved or sthg (I recall it was always "taken" but had no data in it)?

The culprit turns out the be the failure to initialize the FLS slots to 0, but I also found several more problems with my initial FLS implementation:

\ CANCELED FLS slot 0 is reserved? CLOSED: [2014-06-13 Fri 18:01]

whatever, just going to hand it out

\ DONE RtlFlsFree needs to hold a lock CLOSED: [2014-06-13 Fri 17:53]

\ DONE FlsFree should call the fls callback if slot is non-NULL CLOSED: [2014-06-13 Fri 17:53]

My test fibers.c does not have the callbacks called. But MSDN says:

"If the FLS slot is in use, FlsCallback is called on fiber deletion, thread exit, and when an FLS index is freed. "

And:

"Freeing an FLS index also causes the associated callback routine to be called for each fiber, if the corresponding FLS slot contains a non-NULL value."

The call is here:

ntdll!RtlFlsFree+0x9c: 77b6dd0d 51 push ecx 77b6dd0e ff55f8 call dword ptr [ebp-8]

\ DONE do not call fls callback if slot is NULL CLOSED: [2014-06-13 Fri 17:53]

\ DONE bug in index used to access fls slots CLOSED: [2014-06-13 Fri 17:53]

FLS_DATA_OFFS vs TEB_FLS_DATA_OFFS

\ DONE must set fls slots to 0 initially CLOSED: [2014-06-13 Fri 17:53]

derekbruening commented 9 years ago

From derek.br...@gmail.com on June 13, 2014 21:58:17

This issue was closed by revision r2698 .

Status: Fixed