DynamoRIO / drmemory

Memory Debugger for Windows, Linux, Mac, and Android
Other
2.46k stars 263 forks source link

ASSERT FAILURE common\alloc_replace.3153: found (inconsistent default Heap) #1882

Open zhaoqin opened 8 years ago

zhaoqin commented 8 years ago

When run 64-bit handle.exe in Windows cmd, an ASSERT Failure happens at exit.

.\bin64\drmemory.exe -debug -verbose 4 -- .\tests\handle.exe
...
~~Dr.M~~ ASSERT FAILURE (thread 12832): ...Workspace\DrMemory\drmemory.git\common\alloc_replace.c:3153: found (inconsistent default Heap)

It does not happen on cygwin.

zhaoqin commented 8 years ago

Attaching windbg, the callstack:

0:000> kp
Child-SP          RetAddr           Call Site
00000000`bfa4c298 00000000`15412506 ntdll!ZwRaiseHardError+0xa
00000000`bfa4c2a0 00000000`152ac2a6 dynamorio!nt_messagebox(wchar_t * msg = 0x00000000`bfa4cb70 "ASSERT FAILURE (thread 12832): D:\src\cygwin\home\zhaoqin\Workspace\DrMemory\drmemory.git\common\alloc_replace.c:3153: found (inconsistent default Heap)", wchar_t * title = 0x00000000`1558e560 "Dr. Memory Notice: D:\src\cygwin\home\zhaoqin\Workspace\DrMemory\builds\build_x64_dbg.git\tests\handle.exe(45944)")+0x176 [d:\src\cygwin\home\zhaoqin\workspace\drmemory\drmemory.git\dynamorio\core\win32\ntdll.c @ 3805]
00000000`bfa4c340 00000000`bf2e1a65 dynamorio!dr_messagebox(char * fmt = 0x00000000`bf6c8c88 "ASSERT FAILURE (thread %I64d): %s:%d: %s (%s)")+0x126 [d:\src\cygwin\home\zhaoqin\workspace\drmemory\drmemory.git\dynamorio\core\lib\instrument.c @ 4211]
00000000`bfa4db90 00000000`bf2efb78 drmemorylib!destroy_Rtl_heap(struct _arena_header_t * arena = 0x00000000`00021000, struct _dr_mcontext_t * mc = 0x00000000`00000000, char free_chunks = 0n0 '')+0x695 [d:\src\cygwin\home\zhaoqin\workspace\drmemory\drmemory.git\common\alloc_replace.c @ 3153]
00000000`bfa4dd30 00000000`bf274e5f drmemorylib!alloc_replace_exit(void)+0x16f8 [d:\src\cygwin\home\zhaoqin\workspace\drmemory\drmemory.git\common\alloc_replace.c @ 5438]
00000000`bfa4dfe0 00000000`bf32c7f6 drmemorylib!alloc_exit(void)+0x5f [d:\src\cygwin\home\zhaoqin\workspace\drmemory\drmemory.git\common\alloc.c @ 3146]

From the log, we can see the process heap is process heap=0x00000000003f0000. However, at exit, in destroy_Rtl_heap, the heap becomes 0x21000

0:000> dt arena
Local var @ 0xbfa4dd30 Type _arena_header_t*
0x00000000`00021000 
   +0x000 start_chunk      : 0x687a5c65`6d6f685c  "--- memory read error at address 0x687a5c65`6d6f685c ---"
   +0x008 next_chunk       : 0x6f775c6e`69716f61  "--- memory read error at address 0x6f775c6e`69716f61 ---"
   +0x010 commit_end       : 0x5c656361`70736b72  "--- memory read error at address 0x5c656361`70736b72 ---"
   +0x018 reserve_end      : 0x79726f6d`656d7264  "--- memory read error at address 0x79726f6d`656d7264 ---"
   +0x020 free_list        : 0x726f6d65`6d72645c _free_lists_t
   +0x028 dr_lock          : 0x65745c74`69672e79 Void
   +0x030 lock             : 0x646e6168`5c737473 Void
   +0x038 flags            : 0x632e656c
   +0x03c prev_free_sz     : 0x333a7070
   +0x040 magic            : 0xd5d3332
   +0x048 alloc_set_member : 0x00002525`f6000ef8  "--- memory read error at address 0x00002525`f6000ef8 ---"
   +0x050 modbase          : 0x00000000`00020158  "???"
   +0x058 handle           : 0x00000000`00020158 Void
   +0x060 next_arena       : 0x20202020`20202020 _arena_header_t

In caller alloc_replace_exit,

    /* Free any pre-us heaps that are still around */
    for (i = 0; i < HASHTABLE_SIZE(crtheap_handle_table.table_bits); i++) {
        hash_entry_t *he, *next;
        for (he = crtheap_handle_table.table[i]; he != NULL; he = next) {
            arena_header_t *arena = (arena_header_t *) he->payload;
            next = he->next;
            destroy_Rtl_heap(arena, NULL, false/*do not free indiv chunks*/);
        }
    }
0:000> dv
             he = 0x00000000`bfbf3130
           next = 0x00000000`bfbf3148
              i = 0
0:000> dt he
Local var @ 0xbfa4dd98 Type _hash_entry_t*
0x00000000`bfbf3130 
   +0x000 key              : 0x00000000`00020000 Void
   +0x008 payload          : 0x00000000`00021000 Void
   +0x010 next             : 0x00000000`bfbf3148 _hash_entry_t
0:000> dt next
Local var @ 0xbfa4dda0 Type _hash_entry_t*
0x00000000`bfbf3148 
   +0x000 key              : 0x00000000`00010000 Void
   +0x008 payload          : 0x00000000`00011000 Void
   +0x010 next             : (null) 
0:000> ?? crtheap_handle_table.table[0]
struct _hash_entry_t * 0x00000000`bfbf3130
   +0x000 key              : 0x00000000`00020000 Void
   +0x008 payload          : 0x00000000`00021000 Void
   +0x010 next             : 0x00000000`bfbf3148 _hash_entry_t
zhaoqin commented 8 years ago

This seems a transparency issue. The arena 0x21000 is first initialized, but later is corrupted at:

0:000> kp
Child-SP          RetAddr           Call Site
00000000`bfd5ce18 00000000`01d712fa ntdll!memcpy+0xda
00000000`bfd5ce20 00000000`01d712a3 KERNEL32_1d50000!ConsoleCaptureMessageBuffer+0x3a
00000000`bfd5ce50 00000000`01d7203f KERNEL32_1d50000!WriteConsoleInternal+0x145
00000000`bfd5cf60 00000000`bf6bf5dc KERNEL32_1d50000!WriteFileImplementation+0x45
00000000`bfd5cfa0 00000000`bf677294 drmemorylib!print_to_cmd(char * buf = 0x00000000`c0a4adc0 "~~Dr.M~~ ..~~Dr.M~~ Error #1: HANDLE LEAK: KERNEL handle ...
00000000`bfd5d000 00000000`bf676a75 drmemorylib!report_error_from_buffer(void * f = 0x00000000`0000000b, char * buf = 0x00000000`bfdf4eb0  "..."

The memcpy code:

00000000`76fba8c2 488941f0        mov     qword ptr [rcx-10h],rax
00000000`76fba8c6 4c8951f8        mov     qword ptr [rcx-8],r10
00000000`76fba8ca 75d4            jne     ntdll!memcpy+0xb0 (00000000`76fba8a0) [br=1]
0:000> ? rax
Evaluate expression: 7593470431804269344 = 69616d20`34202320
0:000> ? rcx
Evaluate expression: 135256 = 00000000`00021058
0:000> ? r10
Evaluate expression: 2314885530818453614 = 20202020`2020206e
0:000> dd 21000
00000000`00021000  64645c3a 7463765c 736c6f6f 7472635c
00000000`00021010  646c625f 6c65735c 34365f66 646d615f
00000000`00021020  635c3436 735c7472 6f5c6372 6978656e
00000000`00021030  3a632e74 5d323631 7e7e0a0d 4d2e7244
00000000`00021040  4e207e7e 3a65746f 34202320 69616d20
00000000`00021050  2020206e 20202020 00020000 00000000

So we either mis-identify 0x21000 as heap, or the heap is corrupted by ConsoleCaptureMessageBuffer

zhaoqin commented 8 years ago

It seems most likely print_to_cmd calls to WriteFile, where WriteFileImplementation allocates some memory from heap starting at 0x20000 and corrupted with DrM's replaced malloc.

derekbruening commented 8 years ago

Are you sure this is related to the process heap? Is this the CsrPortHeap shared memory heap instead? Xref #1221, #1823, #1829. Is moving to the next page not enough? Was there a pre-existing alloc, or did we compute the end incorrectly?

derekbruening commented 8 years ago

Being CsrPortHeap would explain why this only happens in a cmd window where there's console activity talking to csrss.

derekbruening commented 8 years ago

When I move 2 pages away for our start, the handle.exe assert goes away. Does that fix handle.exe and calc.exe problems on your end?

I see these initial allocs:

There seem to be 2 mmapped heaps:

walking individual heap 0x0000000000010000
heap 2 0x0000000000010000-0x0000000000010000-0x0000000000010a80 0 0x0000000000010aa0,0x0000000000020000 a80
 0 0
heap 0 0x0000000000010aa0-0x0000000000010ac0-0x0000000000010fe0 0 0x0000000000010aa0,0x0000000000020000 520
 0 0
heap 100 0x0000000000011000-0x0000000000011000-0x0000000000020000 0 0x0000000000010000,0x0000000000020000 f
000 0 0
new arena inside mmapped pre-us Heap 0x0000000000010000 is 0x0000000000012000-0x0000000000020000-0x0000000000020000
walking heap 2 0x0000000000020000
walking individual heap 0x0000000000020000
heap 2 0x0000000000020000-0x0000000000020000-0x0000000000020a80 0 0x0000000000020aa0,0x0000000000030000 a80 0 0
heap 0 0x0000000000020aa0-0x0000000000020ac0-0x000000000002ffe0 0 0x0000000000020aa0,0x0000000000030000 f520 0 0
heap 100 0x0000000000030000-0x0000000000030000-0x0000000000030000 0 0x0000000000020000,0x0000000000030000 0 0 0
new arena inside mmapped pre-us Heap 0x0000000000020000 is 0x0000000000022000-0x0000000000030000-0x0000000000030000

destroy_Rtl_heap heap=0x0000000000021000
ASSERT FAILURE (thread 3576): D:\derek\drmemory\git\src\common\alloc_replace.c:3153: found (inconsistent default Heap)

Perhaps one of those allocs is open-ended and we have the end point computed incorrectly or sthg?

derekbruening commented 8 years ago

I added an extra page in 19baf9e Leaving open for better understanding

zhaoqin commented 8 years ago

The latest ToT runs fine without assert.