DynamoRIO / dynamorio

Dynamic Instrumentation Tool Platform
Other
2.66k stars 562 forks source link

self-interpretation due to os_take_over_all_unknown_threads() on an already-caught early thread #1443

Open derekbruening opened 9 years ago

derekbruening commented 9 years ago

From bruen...@google.com on May 07, 2014 10:22:48

This is the DR bug causing https://code.google.com/p/drmemory/issues/detail?id=1541 Analyzing the ldmp there: http://build.chromium.org/p/chromium.fyi/builders/Windows%20Browser%20%28DrMemory%20full%29%20%282%29/builds/3534/steps/memory%20test%3A%20browser_tests/logs/stdio UsbApiTest.ZeroLengthTransfer

[ RUN      ] UsbApiTest.ZeroLengthTransfer
[284:816:0506/111048:ERROR:gpu_info_collector_win.cc(103)] Can't retrieve a valid WinSAT assessment.
[1344:1948:0506/111203:ERROR:renderer_main.cc(227)] Running without renderer sandbox
~~360~~ 
~~360~~ Error `#1`: UNINITIALIZED READ: reading register edi
~~360~~ # 0 dynamorio.dll!intercept_ldr_init      [e:\b\build\slave\win-builder\drmemory\dynamorio\core\win32\callback.c:3202]
~~360~~ # 1 dynamorio.dll!interception_code_array
~~360~~ Note: @0:00:09.421 in thread 360
~~360~~ Note: instruction: mov    $0x00000000 -> (%edi)
~~360~~ 
~~360~~ Error `#2`: UNINITIALIZED READ: reading register edi
~~360~~ # 0 dynamorio.dll!asynch_take_over [e:\b\build\slave\win-builder\drmemory\dynamorio\core\win32\callback.c:2833]
~~360~~ Note: @0:00:09.422 in thread 360
~~360~~ Note: instruction: cmp    (%edi) $0x00
~~360~~ 
~~360~~ Error `#3`: UNADDRESSABLE ACCESS: reading 0x00000246-0x0000024a 4 byte(s)
~~360~~ # 0 dynamorio.dll!dispatch_enter_fcache [e:\b\build\slave\win-builder\drmemory\dynamorio\core\dispatch.c:488]
~~360~~ Note: @0:00:09.435 in thread 360
~~360~~ Note: instruction: pop    %esp (%esp) -> %ebp %esp
~~360~~ Note: memory dump created at C:\Users\chrome-bot\AppData\LocalLow\drmemory.logs\testcase.2280_4.logs\dynamorio\browser_tests.exe.2292.00000000.ldmp

0:000> kn 30
 # ChildEBP RetAddr  
00 2199e504 679b1eb9 dynamorio!os_dump_core_live_dump+0x26a [e:\b\build\slave\win-builder\drmemory\dynamorio\core\win32\os.c @ 7064]
01 2199e520 679a3a62 dynamorio!os_dump_core_live+0x39 [e:\b\build\slave\win-builder\drmemory\dynamorio\core\win32\os.c @ 7312]
02 2199e534 738479a6 dynamorio!dr_create_memory_dump+0x22 [e:\b\build\slave\win-builder\drmemory\dynamorio\core\x86\instrument.c @ 2268]
03 2199e924 73849a39 drmemorylib!report_core_dump+0x216 [e:\b\build\slave\win-builder\drmemory\drmemory\report.c @ 2620]
04 2199e9b4 7384a0f3 drmemorylib!report_error+0x369 [e:\b\build\slave\win-builder\drmemory\drmemory\report.c @ 2804]
05 2199eb04 73809c1f drmemorylib!report_unaddressable_access+0x103 [e:\b\build\slave\win-builder\drmemory\drmemory\report.c @ 3025]
06 2199eb74 7380a3c0 drmemorylib!handle_mem_ref_internal+0x61f [e:\b\build\slave\win-builder\drmemory\drmemory\readwrite.c @ 4882]
07 2199ebac 7380b39a drmemorylib!check_mem_opnd+0x760 [e:\b\build\slave\win-builder\drmemory\drmemory\readwrite.c @ 5054]
08 2199ed44 7380b9e7 drmemorylib!slow_path_with_mc+0x63a [e:\b\build\slave\win-builder\drmemory\drmemory\readwrite.c @ 3486]
09 2199eeb0 21f5003a drmemorylib!slow_path+0x47 [e:\b\build\slave\win-builder\drmemory\drmemory\readwrite.c @ 3706]
WARNING: Frame IP not in any known module. Following frames may be wrong.
0a 2199eee4 6795f1cf 0x21f5003a
0b 2199eef4 679609db dynamorio!heap_free+0x3f [e:\b\build\slave\win-builder\drmemory\dynamorio\core\heap.c @ 3710]
0c 2199ef18 67994adf dynamorio!instrlist_clear_and_destroy+0x1b [e:\b\build\slave\win-builder\drmemory\dynamorio\core\instrlist.c @ 106]
0d 2199efa0 00000000 dynamorio!build_basic_block_fragment+0x3ff [e:\b\build\slave\win-builder\drmemory\dynamorio\core\x86\interp.c @ 4703]

0:000> kn
  **\* Stack trace for last set context - .thread/.cxr resets it
 # ChildEBP RetAddr  
00 00000246 00000000 dynamorio!dispatch_enter_fcache+0x1f4 [e:\b\build\slave\win-builder\drmemory\dynamorio\core\dispatch.c @ 488]
0:000> r
Last set context:
eax=2339e601 ebx=00000000 ecx=2339e6cc edx=2339e6cc esi=214d6d40 edi=0531f5e0
eip=67953864 esp=00000246 ebp=00000246 iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=0000  es=0000  fs=0000  gs=0000             efl=00000202
dynamorio!dispatch_enter_fcache+0x1f4:
67953864 5d              pop     ebp

So enter_fcache returned, and ebp is messed up which messes up esp: enter_fcache(dcontext, fcache_enter, FCACHE_ENTRY_PC(targetf)); ASSERT_NOT_REACHED();

CommandLine:  '"E:\b\build\slave\windows-browser-drm-full-2\build\src\build\Release\browser_tests.exe" --type=renderer --dom-automation --enable-logging=stderr --log-level=0 --no-sandbox --test-type=browser --enable-deferred-image-decoding --lang=en-US --force-fieldtrials=Prerender/PrerenderMulti/UMA-New-Install-Uniformity-Trial/Experiment/UMA-Session-Randomized-Uniformity-Trial-5-Percent/default/UMA-Uniformity-Trial-1-Percent/group_63/UMA-Uniformity-Trial-10-Percent/group_09/UMA-Uniformity-Trial-100-Percent/group_01/UMA-Uniformity-Trial-20-Percent/group_04/UMA-Uniformity-Trial-5-Percent/group_15/UMA-Uniformity-Trial-50-Percent/default/ --noerrdialogs --user-data-dir="C:\Users\CHROME~2\AppData\Local\Temp\scoped_dir856_6693\d856_10514" --extension-process --enable-threaded-compositing --enable-delegated-renderer --enable-impl-side-painting --channel="284.2.432510063\130175343" /prefetch:673131151'

Only 2 threads. Other thread is here -- but wrong syms!:

Here are the real symbols, but they seem suspect as this is not FtpBrowserTest!

$ bin/symquery.exe -e ../full-build-win32/browser_tests.exe -f -a 0x00dd84cc 0x00ddd932 0x00dda19f 0x00dde7a2 0x000684e2 0x019a8ac7
std::find_if<std::_Vector_iterator<std::_Vector_val<std::_Simple_types<testing::TestCase *> > >,testing::internal::TestCaseNameIs>+0x5c
e:\b\depot_tools\win_toolchain\vs2013_files\vc\include\algorithm:56+0x35
testing::internal::UnitTestImpl::GetTestCase+0xa2
e:\b\build\slave\drm-cr\build\src\testing\gtest\src\gtest.cc:3930+0x6c
testing::internal::UnitTestImpl::AddTestInfo+0x10f
e:\b\build\slave\drm-cr\build\src\testing\gtest\src\gtest-internal-inl.h:668+0x2b
testing::internal::MakeAndRegisterTestInfo+0x182
e:\b\build\slave\drm-cr\build\src\testing\gtest\src\gtest.cc:2140+0x0
`dynamic initializer for 'FtpBrowserTest_DirectoryListing_Test::test_info_''+0x42
e:\b\build\slave\drm-cr\build\src\chrome\browser\net\ftp_browsertest.cc:23+0x42
__tmainCRTStartup+0x9e
f:\dd\vctools\crt\crtw32\dllstuff\crtexe.c:550+0xf

the thread start addresses: bad one is in chrome_elf.dll (other is in browser_tests.exe): so not a nudge thread

unable to set thread start address to 681F53B0

0:000> ? 681F53B0-chrome_elf
Evaluate expression: 21424 = 000053b0
&#37; ~/drmemory/releases/DrMemory-Windows-1.6.1-2/bin/symquery.exe -e full-build-win32/chrome_elf.dll -f -a 0x53b0
google_breakpad::ExceptionHandler::ExceptionHandlerThreadMain+0x0
e:\b\build\slave\drm-cr\build\src\breakpad\src\client\windows\handler\exception_handler.cc:383+0x0

why are there so few threads? I thought renderers always had at least a dozen threads (xref issue #130 data). this must be early on, with the process just starting up? also in that issue #130 data: 2nd thread is in base.dll, not chrome_elf.

In a local run of UsbApiTest.ZeroLengthTransfer there are 2 renderer processes: one has --extension-process and the other doesn't. The ldmp is from the extension thread. For that one locally:

&#37; grep ^Thread DrMemory-browser_tests.exe.5980.000/global.5980.log 
Thread `#2` @0:00:14.774 #bbs=35507 start=0x0223ffc0  browser_tests.exe!sandbox::BrokerServicesBase::TargetEventsThread+0x0
Thread `#3` @0:00:15.652 #bbs=41763 start=0x71b3d990  base.dll!base::`anonymous namespace'::ThreadFunc+0x0
Thread`#4`@0:00:15.858 #bbs=42882 start=0x77b72e65  ntdll.dll!TppWaiterpThread+0x0
Thread`#5`@0:00:15.900 #bbs=43134 start=0x77b73e85  ntdll.dll!TppWorkerThread+0x0
Thread`#6`@0:00:21.596 #bbs=64926 start=0x6ef0bfb4  MSVCR120.dll!_threadstartex+0x0
Thread`#7`@0:00:21.605 #bbs=64975 start=0x6ef0bfb4  MSVCR120.dll!_threadstartex+0x0
Thread`#8`@0:00:21.611 #bbs=65024 start=0x6ef0bfb4  MSVCR120.dll!_threadstartex+0x0
Thread`#9`@0:00:21.617 #bbs=65073 start=0x6ef0bfb4  MSVCR120.dll!_threadstartex+0x0
Thread`#10`@0:00:21.623 #bbs=65101 start=0x6ef0bfb4  MSVCR120.dll!_threadstartex+0x0
Thread`#11`@0:00:22.397 #bbs=67684 start=0x71b3d990  base.dll!base::`anonymous namespace'::ThreadFunc+0x0
Thread `#12` @0:00:36.785 #bbs=154239 start=0x71b3d990  base.dll!base::`anonymous namespace'::ThreadFunc+0x0
Thread`#13` @0:01:01.742 #bbs=318511 start=0x111de189 

0:000> dt -b dynamorio!stats
&#37; grep -E 'single_stat_t|value' ~/x
   +0x2f4 num_threads_pair : _single_stat_t
      +0x034 value            : 0n2
   +0x32c peak_num_threads_pair : _single_stat_t
      +0x034 value            : 0n2
   +0x364 num_threads_created_pair : _single_stat_t
      +0x034 value            : 0n2
   +0x39c num_callbacks_pair : _single_stat_t
      +0x034 value            : 0n0
   +0x3d4 num_APCs_pair    : _single_stat_t
      +0x034 value            : 0n0
   +0x40c num_exceptions_pair : _single_stat_t
      +0x034 value            : 0n6
   +0x444 pre_syscall_pair : _single_stat_t
      +0x034 value            : 0n118
   ...

Original issue: http://code.google.com/p/dynamorio/issues/detail?id=1443

derekbruening commented 9 years ago

From bruen...@google.com on May 07, 2014 08:52:39

Solutions:

It seems impractical to figure out just from the thread context that it's waiting, as it can be in the vsyscall or ntdll. We could perhaps walk the callstack and look for DR but that seems fragile.

If we store info about the looping thread: where? DR may not be init so can't use heap. Could have static array or sthg and assume there are never more than a few of these at once. Thread id array perhaps, with atomic inc index into it when adding.

The takeover code iterates until fixed point which is as good as we'll get for handling races w/ new ones coming in. If one comes in after takeover but before fully init, should work fine.

There's still a race though: if a new thread hits the wait loop but hasn't yet put id into array, takeover will take it over and hit this self-interp bug. So it seems that we want a flag combined w/ a context check for being in dynamorio.dll.

derekbruening commented 9 years ago

From derek.br...@gmail.com on May 07, 2014 11:59:10

This issue was closed by revision r2686 .

Status: Fixed

derekbruening commented 9 years ago

From bruen...@google.com on May 14, 2014 16:02:57

May need to re-open this: http://build.chromium.org/p/chromium.fyi/builders/Windows%20Browser%20%28DrMemory%20full%29%20%281%29/builds/3967/steps/memory%20test%3A%20browser_tests/logs/stdio

UNINITIALIZED READ: reading register eax
# 0 dynamorio.dll!nt_create_event           [e:\b\build\slave\win-builder\drmemory\dynamorio\core\win32\ntdll.c:3716]
# 1 dynamorio.dll!mutex_wait_contended_lock [e:\b\build\slave\win-builder\drmemory\dynamorio\core\win32\os.c:8047]
# 2 dynamorio.dll!mutex_lock                [e:\b\build\slave\win-builder\drmemory\dynamorio\core\utils.c:886]
# 3 dynamorio.dll!dynamo_thread_init        [e:\b\build\slave\win-builder\drmemory\dynamorio\core\dynamo.c:2024]
# 4 dynamorio.dll!intercept_new_thread      [e:\b\build\slave\win-builder\drmemory\dynamorio\core\win32\callback.c:2988]
# 5 dynamorio.dll!intercept_ldr_init        [e:\b\build\slave\win-builder\drmemory\dynamorio\core\win32\callback.c:3238]
# 6 dynamorio.dll!interception_code_array
Note: @0:00:07.407 in thread 3888
Note: instruction: test   %eax %eax

The report came from the ContentAutofillDriverBrowserTest.TestPageNavigationHidingAutofillPopup test.

Status: Accepted

derekbruening commented 9 years ago

From bruen...@google.com on May 14, 2014 16:35:26

My theory is that a thread is pointing right at the start of LdrInitializeThunk which will not be caught by r2686 .

derekbruening commented 9 years ago

From derek.br...@gmail.com on May 14, 2014 20:25:52

This issue was closed by revision r2690 .

Status: Fixed

derekbruening commented 9 years ago

From bruen...@google.com on July 17, 2014 21:34:39

Re-opening as it seems there are still instances of this:

This report observed at http://build.chromium.org/p/chromium.memory.fyi/builders/Windows%20Browser%20%28DrMemory%20full%29%20%283%29/builds/173 didn't match any suppressions: Suppression (error hash=#129BAB70584A3771#): { UNINITIALIZED READ name= dynamorio.dll!nt_create_event dynamorio.dll!mutex_wait_contended_lock dynamorio.dll!mutex_lock dynamorio.dll!dynamo_thread_init dynamorio.dll!intercept_new_thread dynamorio.dll!intercept_ldr_init dynamorio.dll!interception_code_array }

This report observed at http://build.chromium.org/p/chromium.memory.fyi/builders/Windows%20Browser%20%28DrMemory%20full%29%20%283%29/builds/173 didn't match any suppressions: Suppression (error hash=#5554065D1FEBA5D6#): { UNADDRESSABLE ACCESS name= dynamorio.dll!dynamorio_syscall_wow64 dynamorio.dll!close_file dynamorio.dll!mutex_wait_contended_lock dynamorio.dll!mutex_lock dynamorio.dll!dynamo_thread_init dynamorio.dll!intercept_new_thread dynamorio.dll!intercept_ldr_init dynamorio.dll!interception_code_array }

Status: Started

derekbruening commented 4 years ago

Xref #2904