Closed by derekbruening 8 years ago
From bruen...@google.com on July 21, 2014 15:42:15
Pasting my notes in FTR:
It seems that there is a bug in beyond-reservation reachability handling (xref issue #1133 where we think it should work and are only worried about OOM when it fails to find anything reachable b/c it's trying too hard). It sounds like the code that estimates whether we need to change a rip-rel instr to absolute through a register gets it wrong for the case where a code cache unit is outside of the initial reservation. By increasing the initial reservation to 512M you avoided that scenario.
Status: Accepted
Labels: -Priority-Medium Priority-High OpSys-x64 Bug-Assert
From bruen...@google.com on August 20, 2014 10:23:18
I wonder if you could provide us with a core dump. We have a custom core dump format. Could you run with the DR option "-dumpcore_mask 0x10" and then reproduce the encoding failure CLIENT_ASSERT? That should create a file with a name like "
We would also need the DR build and symbols.
Status: NeedInfo
Owner: bruen...@google.com
From kai.stam...@gmail.com on August 25, 2014 12:11:48
Hi, I made a dumpfile (4gb), but I'm not too comfortable sharing it (sorry for not being too helpful).
I dumped module addresses from the asserting application and made a little tool that virtualallocs in those ranges, puts some dummy code, and runs it. It also places the faulting riprel instruction in its proper place (and also runs it). While I'm not able to see the same assert, it reproducibly runs into "(decode) exception in last area, probably app race condition: dr pc=0x0000000071389552, app pc=0x0000000062aa9000", always after placing the riprel instruction. Maybe it's related?
I use latest SVN dyn and run the repro.cpp (compiled as X64 DEBUG MTd) with: C:\dev\dyn\libs\dynamorio-read-only\build_debug\bin64\drrun.exe -debug -no_hide -disable_traces -- C:\Users\pro\Desktop\repro\repro\x64\Debug\repro.exe
Attachment: repro.cpp
From kai.stam...@gmail.com on October 01, 2014 12:16:22
I was able to reproduce it on UE4. I made a coredump and uploaded it together with dynamorio.dll/.pdb to: https://www.dropbox.com/s/jg8oykzuju8584d/logs.rar?dl=0
xref #1326
Pasting my prior notes on trying to repro:
I tried running the decode test, which has a rip-rel test.
bin64/drrun -debug -vm_size 4M -- suite/tests/bin/common.decode
bin64/drrun -debug -vm_size 1M -- suite/tests/bin/common.decode
bin64/drrun -debug -vm_size 1M -no_enable_reset -- suite/tests/bin/common.decode
bin64/drrun -debug -vm_size 1M -cache_bb_unit_init 256K -cache_bb_unit_max 256K -no_enable_reset -- suite/tests/bin/common.decode
bin64/drrun -debug -vm_size 1M -cache_bb_unit_init 256K -cache_bb_unit_quadruple 256K -cache_bb_unit_max 256K -no_enable_reset -- suite/tests/bin/common.decode
bin64/drrun -debug -vm_size 1M -cache_bb_unit_init 256K -cache_bb_unit_quadruple 256K -cache_bb_unit_upgrade 256K -cache_bb_unit_max 256K -no_enable_reset -- suite/tests/bin/common.decode
bin64/drrun -debug -vm_size 1M -no_enable_reset -cache_bb_unit_init 256K -cache_bb_unit_quadruple 256K -cache_bb_unit_upgrade 256K -cache_bb_unit_max 256K -no_enable_reset -- suite/tests/bin/common.decode
bin64/drrun -debug -vm_size 0 -no_enable_reset -cache_bb_unit_init 256K -cache_bb_unit_quadruple 256K -cache_bb_unit_upgrade 256K -cache_bb_unit_max 256K -no_enable_reset -- suite/tests/bin/common.decode
bin64/drrun -debug -vm_size 16K -no_enable_reset -cache_bb_unit_init 256K -cache_bb_unit_quadruple 256K -cache_bb_unit_upgrade 256K -cache_bb_unit_max 256K -no_enable_reset -- suite/tests/bin/common.decode
bin64/drrun -debug -vm_size 128K -no_enable_reset -cache_bb_unit_init 256K -cache_bb_unit_quadruple 256K -cache_bb_unit_upgrade 256K -cache_bb_unit_max 256K -no_enable_reset -- suite/tests/bin/common.decode
But no dice.
I think this repro requires very particular circumstances.
The request_region_be_heap_reachable() for beyond-VMM allocs must accumulate some error or something.
rel32_reachable_from_vmcode() just looks at the start and end of vmcode.
This happened once on the 64-bit drmemory tests:
http://dynamorio.org/CDash/testDetails.php?test=9004&build=449
STDOUT:
STDERR: WARNING: 64-bit non-pattern modes are experimental
<Starting application /work/drmemory/nightly/run/build_drmemory-dbg-64/tests/fuzz_buffer.cpp (13263)>
<Paste into GDB to debug DynamoRIO clients:
set confirm off
add-symbol-file '/work/drmemory/nightly/run/build_drmemory-dbg-64/bin64/debug/libdrmemorylib.so' 0x000000007381b000
add-symbol-file '/work/drmemory/nightly/run/build_drmemory-dbg-64/dynamorio/lib64/debug/libdynamorio.so' 0x000000007102c900
add-symbol-file '/usr/lib64/libstdc++.so.6' 0x00007f17d25b9fa0
add-symbol-file '/usr/lib64/libm.so.6' 0x00007f17d222c550
add-symbol-file '/usr/lib64/libc.so.6' 0x00007f17d1e854d0
add-symbol-file '/usr/lib64/ld-linux-x86-64.so.2' 0x00007f17d1c42ad0
add-symbol-file '/usr/lib64/libgcc_s.so.1' 0x00007f17d1a2caf0
>
<Initial options = -no_dynamic_options -logdir '/work/drmemory/nightly/run/build_drmemory-dbg-64/logs/dynamorio' -client_lib '/work/drmemory/nightly/run/build_drmemory-dbg-64/bin64/debug/libdrmemorylib.so;0;`-batch` `-callstack_style` `0x27` -no_results_to_stderr `-fuzz_target` `fuzz_buffer.cpp!_ZN13BufferPrinter8repeatmeEPjm|3|1|2|10` -logdir `/work/drmemory/nightly/run/build_drmemory-dbg-64/logs` -symcache_dir `/work/drmemory/nightly/run/build_drmemory-dbg-64/logs/symcache` -resfile 13263 ' -code_api -stack_size 56K -disable_traces -no_enable_traces -max_elide_jmp 0 -max_elide_call 0 -max_bb_instrs 256 -no_shared_traces -bb_ibl_targets -bb_single_restore_prefix -no_shared_trace_ibl_routine -no_enable_reset -no_reset_at_switch_to_os_at_vmm_limit -reset_at_vmm_percent_free_limit 0 -no_reset_at_vmm_full -reset_at_commit_free_limit 0K -reset_every_nth_pending 0 -vm_size 262144K -early_inject -emulate_brk -no_inline_ignored_syscalls -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct >
~~Dr.M~~ WARNING: 64-bit non-pattern modes are experimental
~~Dr.M~~ Dr. Memory version 1.9.16734
~~Dr.M~~ Not tested - Save and restore shadow registers @/work/drmemory/nightly/src/drmemory/fuzzer.c:1421
~~Dr.M~~ Not tested - Registration of fuzz target on module load @/work/drmemory/nightly/src/drmemory/fuzzer.c:981
<(1+x) Handling our fault in a TRY at 0x000000007392a0e1>
<Application /work/drmemory/nightly/run/build_drmemory-dbg-64/tests/fuzz_buffer.cpp (13263) DynamoRIO usage error : encoding failed re-relativizing rip-relative address whose target is unreachable>
<Usage error: encoding failed re-relativizing rip-relative address whose target is unreachable (/work/drmemory/nightly/src/dynamorio/core/arch/x86/encode.c, line 2296)
1.9.16734-1-(Oct 27 2015 05:29:07)0
-no_dynamic_options -logdir '/work/drmemory/nightly/run/build_drmemory-dbg-64/logs/dynamorio' -client_lib '/work/drmemory/nightly/run/build_drmemory-dbg-64/bin64/debug/libdrmemorylib.so;0;`-batch` `-callstack_style` `0x27` -no_results_to_stderr `-fuzz_target` `fuzz_buffer.cpp!_ZN13BufferPrinter8repeatmeEPjm|3|1|2|10` -lo
0x0000000051371850 0x000056154eb2afb7
0x00000000513718a0 0x000056154ec8194c
0x0000000051371910 0x000056154ec63069
0x0000000051371970 0x000056154ec63fd7
0x00000000513719c0 0x000000007389c0aa
0x0000000051371e50 0x000000007382e96b
0x0000000051372130 0x00000000739a9c94
0x0000000051372990 0x000056154ec2553d
0x0000000051372a20 0x000056154ec93e7f
0x0000000051372ad0 0x000056154ec99498
0x0000000051372d30 0x000056154ec9de82
0x0000000051372f10 0x000056154eb15037
0x0000000051372ff0 0x0000000051326ecd
/work/drmemory/nightly/run/build_drmemory-dbg-64/bin64/debug/libdrmemorylib.so=0x0000000073800000
/usr/lib64/libstdc++.so.6=0x00007f17d2530000
/usr/lib64/libgcc_s.so.1=0x00007f17d1a2a000
/usr/lib64/libm.so.6=0x00007f17d2227000
/usr/lib64/libc.so.6=0x00007f17d1e66000
/usr/lib64/ld-linux-x86-64.so.2=0x00007f17d1c42000>
This also shows up when running 64-bit Chromium unit_tests on Windows:
[ RUN ] ShortcutsProviderTest.Extension
<Application z:\derek\chromium\src\out\Release_x64\unit_tests.exe (7112) DynamoRIO usage error : encoding failed re-relativizing rip-relative address whose target is unreachable>
<Usage error: encoding failed re-relativizing rip-relative address whose target is unreachable (D:\derek\dr\git\src\core\arch\x86\encode.c, line 2296)
It is non-deterministic on unit_tests. I got an instance into windbg (-batch is what thwarted it the first time):
04 00000000`c70fd6b0 00000000`15325adf dynamorio!external_error+0x16c [c:\src\dr\git\src\core\utils.c @ 205]
05 00000000`c70fd730 00000000`15333b93 dynamorio!copy_and_re_relativize_raw_instr+0x5bf [c:\src\dr\git\src\core\arch\x86\encode.c @ 2296]
06 00000000`c70fd810 00000000`153250d4 dynamorio!instr_encode_arch+0xd3 [c:\src\dr\git\src\core\arch\x86\encode.c @ 2348]
07 00000000`c70fda70 00000000`1532507c dynamorio!instr_encode_to_copy+0x44 [c:\src\dr\git\src\core\arch\encode_shared.c @ 122]
08 00000000`c70fdac0 00000000`15171afb dynamorio!instr_encode+0x2c [c:\src\dr\git\src\core\arch\encode_shared.c @ 128]
09 00000000`c70fdaf0 00000000`15178ce8 dynamorio!set_linkstub_fields+0x31db [c:\src\dr\git\src\core\emit.c @ 370]
0a 00000000`c70fde00 00000000`1516e089 dynamorio!emit_fragment_common+0x6a68 [c:\src\dr\git\src\core\emit.c @ 669]
0b 00000000`c70feba0 00000000`153435f3 dynamorio!emit_fragment_ex+0x59 [c:\src\dr\git\src\core\emit.c @ 999]
0c 00000000`c70febf0 00000000`1515be65 dynamorio!build_basic_block_fragment+0x843 [c:\src\dr\git\src\core\arch\interp.c @ 5084]
0:000> .frame 5
05 00000000`c70fd730 00000000`15333b93 dynamorio!copy_and_re_relativize_raw_instr+0x5bf [c:\src\dr\git\src\core\arch\x86\encode.c @ 2296]
0:000> dv
addr32 = 0n0 ''
target = 0x00000000`76beda48 "???"
new_offs = 0n-2148291049
rip_rel_pos = 3
ok = 0n1 ''
dcontext = 0x00000000`c7099280
instr = 0x00000000`d3368f00
dst_pc = 0x00000000`f6cb2c2a "???"
final_pc = 0x00000000`f6cb2c2a "???"
orig_dst_pc = 0x00000000`f6cb2c2a "???"
0:000> U @@(start)
kernel32!GetLocaleInfoW+0x8:
00000000`76b68968 ff25da500800 jmp qword ptr [kernel32!UnhandledExceptionFilter+0x2218 (00000000`76beda48)]
The instr is a mov, the load into rcx for IBL. This cache unit is positioned to be just a little too far away.
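The numbers in the `dv` output can be checked by hand. Below is a minimal sketch, assuming the re-relativized instruction is 7 bytes long (inferred from the numbers, not stated in the dump) and that the displacement is measured from the end of the instruction. The helper names `riprel_disp` and `fits_in_rel32` are hypothetical and are not DynamoRIO functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a DynamoRIO API): the displacement a rip-relative
 * operand would need, measured from the end of the instruction.
 */
static int64_t
riprel_disp(uint64_t instr_pc, unsigned instr_len, uint64_t target)
{
    /* x86 instructions are at most 15 bytes long. */
    assert(instr_len <= 15);
    return (int64_t)target - (int64_t)(instr_pc + instr_len);
}

/* True if the displacement can be encoded in a signed 32-bit field. */
static bool
fits_in_rel32(int64_t disp)
{
    return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

With dst_pc = 0xf6cb2c2a and target = 0x76beda48, `riprel_disp(0xf6cb2c2a, 7, 0x76beda48)` gives -2148291049, which matches new_offs above. That is below INT32_MIN (-2147483648), so `fits_in_rel32` returns false and the encoding fails.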
rel32_reachable_from_vmcode(byte *tgt)
{
    byte *vmcode_start = vmcode_get_start();
    byte *vmcode_end = vmcode_get_end();
    ptr_int_t new_offs = (tgt > vmcode_start) ? (tgt - vmcode_start) : (vmcode_end - tgt);
    /* Beyond-vmm-reservation allocs are handled b/c those are subject to the
     * reachability constraints we set up on every new reservation, including
     * the initial vm_reserve.
     */
    return REL32_REACHABLE_OFFS(new_offs);
}
0:000> ?? heapmgt->vmheap
struct vm_heap_t
+0x000 start_addr : 0x00000000`c7020000 "--- memory read error at address 0x00000000`c7020000 ---"
+0x008 end_addr : 0x00000000`d7020000 "--- memory read error at address 0x00000000`d7020000 ---"
0:000> ?? heapmgt->vmheap.end_addr - target
int64 0n1615013304
0:000> x dynamorio!heap_allowable*
00000000`15562398 dynamorio!heap_allowable_region_end = 0x00000001`4685ffff "--- memory read error at address 0x00000001`4685ffff ---"
00000000`1556e450 dynamorio!heap_allowable_region_start = 0x00000000`c685efff "--- memory read error at address 0x00000000`c685efff ---"
0:000> ?? dynamorio!heap_allowable_region_end - target
int64 0n3485935031
It seems like rel32_reachable_from_vmcode()'s comment is wrong: it needs to use the heap_allowable_region_start and heap_allowable_region_end bounds and not the vmcode bounds.
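To illustrate the difference between the two sets of bounds, here is a rough sketch of a region-based check. It is not DynamoRIO's implementation: `rel32_reachable_from_region` is a hypothetical name, the bounds are passed in as parameters rather than read from DR globals, and instruction length is ignored. It also assumes the vmcode bounds equal the vmheap bounds printed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sketch: a target is treated as reachable only if it is within
 * signed 32-bit range of both ends of the region in which a code cache unit
 * may be placed.
 */
static bool
rel32_reachable_from_region(uint64_t region_start, uint64_t region_end, uint64_t tgt)
{
    assert(region_start <= region_end);
    /* The worst-case distances are to the two ends of the region. */
    int64_t from_start = (int64_t)tgt - (int64_t)region_start;
    int64_t from_end = (int64_t)tgt - (int64_t)region_end;
    return from_start >= INT32_MIN && from_start <= INT32_MAX &&
        from_end >= INT32_MIN && from_end <= INT32_MAX;
}
```

For target 0x76beda48, the vmheap bounds (0xc7020000 to 0xd7020000) report it as reachable, since the larger distance is 1615013304. The heap_allowable_region bounds (0xc685efff to 0x14685ffff) report it as unreachable, since the distance from the region end is 3485935031. The cache unit at 0xf6cb2c2a lies outside vmheap but inside the allowable region, which is consistent with the failure seen here.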
From kai.stam...@gmail.com on July 21, 2014 12:32:44
I'm using SVN dynamorio on NTAMD64 using DRRUN:
C:\dev\dyn\libs\dynamorio-read-only\build_debug\bin64\drrun.exe -debug -no_follow_children -disable_traces -syntax_intel -no_hide -max_bb_instrs 512 -- I:...application.exe
Not too long after application startup, it runs into: "encoding failed re-relativizing rip-relative address whose target is unreachable"
The crashing instruction is rip-rel:
This code is part of MSVCR100.DLL!_freefls, which is used during _threadstartex. The instructions themselves don't seem to be the problem; rather, it is their location or the location of their replacements.
I failed to make a stripped down version of the problem, but I'm able to reproduce it any time.
It works with -vm_size 512M
Original issue: http://code.google.com/p/dynamorio/issues/detail?id=1479