Open ScottTodd opened 1 year ago
Found some docs:
Alignment could be a red herring / unrelated to the memory corruption that is affecting buffer handles though.
Some of the alignment warnings go away if I switch from `iree_allocator_malloc` to `iree_allocator_malloc_aligned`.
I might try replacing all mallocs with `malloc_aligned`, since my application code can't control runtime code that produces errors like this:
iree/runtime/src/iree/vm/stack.c:575:17: runtime error: member access within misaligned address 0x005c181c for type 'iree_vm_stack_frame_header_t' (aka 'struct iree_vm_stack_frame_header_t'), which requires 8 byte alignment
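To make the alignment arithmetic behind these warnings concrete, here is a minimal sketch in standard C11. It does not use the IREE allocator API: `is_aligned`, `round_up`, and `alloc_aligned_zeroed` are hypothetical helper names, and `aligned_alloc` is only a stand-in for an aligned allocator, not a description of how `iree_allocator_malloc_aligned` is implemented. The address `0x005c181c` in the report above is a multiple of 4 but not of 8, which is why an access through a type requiring 8 byte alignment gets flagged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: true if `address` is a multiple of `alignment`.
// `alignment` must be a power of two.
static bool is_aligned(uintptr_t address, size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (address & (alignment - 1)) == 0;
}

// Hypothetical helper: rounds `size` up to the next multiple of `alignment`.
// `alignment` must be a power of two.
static size_t round_up(size_t size, size_t alignment) {
  return (size + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: returns zeroed storage whose address is a multiple of
// `alignment`, or NULL on failure. The size is padded to a multiple of the
// alignment because the original C11 text requires that for aligned_alloc.
// Release the result with free().
static void* alloc_aligned_zeroed(size_t size, size_t alignment) {
  size_t padded_size = round_up(size, alignment);
  void* ptr = aligned_alloc(alignment, padded_size);
  if (ptr != NULL) {
    memset(ptr, 0, padded_size);
  }
  return ptr;
}
```

For example, `is_aligned(0x005c181c, 8)` is false while `is_aligned(0x005c181c, 4)` is true, and a pointer returned by `alloc_aligned_zeroed(40, 16)` satisfies `is_aligned((uintptr_t)ptr, 16)`. An allocator that only guarantees a smaller alignment than the type requires can legitimately return addresses like the one in the warning.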
I'm trying to get more programs running through WebGPU on the `webgpu` branch with the application code at `experimental/web/sample_webgpu`, and the latest errors appear to be related to memory corruption.
Symptoms
The specific symptom I'm seeing is that, for programs with more than one output, buffer handles are incorrect during `iree_hal_create_transfer_command_buffer`, triggering an assert:
Assert callstack
```
iree_api_webgpu.js:76 (C) Aborted(Assertion failed)
printErr @ iree_api_webgpu.js:76
abort @ web-sample-webgpu.js:899
assert @ web-sample-webgpu.js:465
Manager.get @ web-sample-webgpu.js:1425
_wgpuCommandEncoderCopyBufferToBuffer @ web-sample-webgpu.js:1774
imports.
```
This program works as expected:
This program crashes there:
Debugging
The WebGPU HAL and web/Emscripten runtime code are both still works in progress, so I'm not ruling out some configuration or code issue there. The WebGPU application code (`sample_webgpu/main.c`) is new too, and I've been iterating on it.
Sanitizers can be useful for these sorts of issues, so I'd like to use them if possible: https://emscripten.org/docs/debugging/Sanitizers.html
ASan
`heap-buffer-overflow` at `iree_hal_create_transfer_command_buffer` (same location as the Emscripten assert):
ASan logs
```
(C) =================================================================
iree_api_webgpu.js:76 (C) ==42==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x14c00ec4 at pc 0x001cf7b8 bp 0x12e66510 sp 0x12e6651c
iree_api_webgpu.js:76 (C) READ of size 4 at 0x14c00ec4 thread T0
iree_api_webgpu.js:76 (C)     #0 0x1cf7b8 in __asan_report_load4+0x1cf7b8 (http://localhost:8000/web-sample-webgpu.wasm+0x1cf7b8)
iree_api_webgpu.js:76 (C)
iree_api_webgpu.js:76 (C) 0x14c00ec4 is located 4 bytes to the right of 48-byte region [0x14c00e90,0x14c00ec0)
iree_api_webgpu.js:76 (C) allocated by thread T0 here:
iree_api_webgpu.js:76 (C)     #0 0x1f6e1e in __sanitizer::StackTrace::GetCurrentPc()+0x1f6e1e (http://localhost:8000/web-sample-webgpu.wasm+0x1f6e1e)
iree_api_webgpu.js:76 (C)     #1 0x1c77df in calloc+0x1c77df (http://localhost:8000/web-sample-webgpu.wasm+0x1c77df)
iree_api_webgpu.js:76 (C)     #2 0x16f349 in iree_allocator_system_ctl+0x16f349 (http://localhost:8000/web-sample-webgpu.wasm+0x16f349)
iree_api_webgpu.js:76 (C)     #3 0x16e208 in iree_allocator_malloc+0x16e208 (http://localhost:8000/web-sample-webgpu.wasm+0x16e208)
iree_api_webgpu.js:76 (C)     #4 0x141b53 in iree_hal_buffer_subspan+0x141b53 (http://localhost:8000/web-sample-webgpu.wasm+0x141b53)
iree_api_webgpu.js:76 (C)     #5 0x37729 in iree_hal_module_buffer_view_create+0x37729 (http://localhost:8000/web-sample-webgpu.wasm+0x37729)
iree_api_webgpu.js:76 (C)     #6 0x1218c1 in iree_vm_shim_rIIiiCID_r+0x1218c1 (http://localhost:8000/web-sample-webgpu.wasm+0x1218c1)
iree_api_webgpu.js:76 (C)
iree_api_webgpu.js:76 (C) SUMMARY: AddressSanitizer: heap-buffer-overflow (http://localhost:8000/web-sample-webgpu.wasm+0x1cf7b7) in __asan_report_load4+0x1cf7b7
iree_api_webgpu.js:76 (C) Shadow bytes around the buggy address:
iree_api_webgpu.js:76 (C)   0x02980180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
iree_api_webgpu.js:76 (C)   0x02980190: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
iree_api_webgpu.js:76 (C)   0x029801a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
iree_api_webgpu.js:76 (C)   0x029801b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
iree_api_webgpu.js:76 (C)   0x029801c0: fa fa fa fa fa fa fa fa fa fa 00 00 00 00 00 00
iree_api_webgpu.js:76 (C) =>0x029801d0: fa fa 00 00 00 00 00 00[fa]fa 00 00 00 00 04 fa
iree_api_webgpu.js:76 (C)   0x029801e0: fa fa 00 00 00 00 04 fa fa fa 00 00 00 00 00 04
iree_api_webgpu.js:76 (C)   0x029801f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
iree_api_webgpu.js:76 (C)   0x02980200: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
iree_api_webgpu.js:76 (C)   0x02980210: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
iree_api_webgpu.js:76 (C)   0x02980220: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
iree_api_webgpu.js:76 (C) Shadow byte legend (one shadow byte represents 8 application bytes):
iree_api_webgpu.js:76 (C)   Addressable: 00
iree_api_webgpu.js:76 (C)   Partially addressable: 01 02 03 04 05 06 07
iree_api_webgpu.js:76 (C)   Heap left redzone: fa
iree_api_webgpu.js:76 (C)   Freed heap region: fd
iree_api_webgpu.js:76 (C)   Stack left redzone: f1
iree_api_webgpu.js:76 (C)   Stack mid redzone: f2
iree_api_webgpu.js:76 (C)   Stack right redzone: f3
iree_api_webgpu.js:76 (C)   Stack after return: f5
iree_api_webgpu.js:76 (C)   Stack use after scope: f8
iree_api_webgpu.js:76 (C)   Global redzone: f9
iree_api_webgpu.js:76 (C)   Global init order: f6
iree_api_webgpu.js:76 (C)   Poisoned by user: f7
iree_api_webgpu.js:76 (C)   Container overflow: fc
iree_api_webgpu.js:76 (C)   Array cookie: ac
iree_api_webgpu.js:76 (C)   Intra object redzone: bb
iree_api_webgpu.js:76 (C)   ASan internal: fe
iree_api_webgpu.js:76 (C)   Left alloca redzone: ca
iree_api_webgpu.js:76 (C)   Right alloca redzone: cb
iree_api_webgpu.js:76 (C) ==42==ABORTING
?function=multiple_r…lts_webgpu.vmfb:315 Function call error: '[object Object]'
```
UBSan and `-sSAFE_HEAP`
More logs
```
iree_api_webgpu.js:76 (C) D:/dev/projects/iree/runtime/src/iree/hal/drivers/webgpu/webgpu_device.c:136:34: runtime error: member access within misaligned address 0x005a24e8 for type 'iree_hal_webgpu_device_t' (aka 'struct iree_hal_webgpu_device_t'), which requires 16 byte alignment
iree_api_webgpu.js:76 (C) 0x005a24e8: note: pointer points here
iree_api_webgpu.js:76 (C)  4b 12 02 00 01 00 00 00 f4 7f 02 00 00 00 00 00 05 00 00 00 18 37 5c 00 11 00 00 00 00 00 00 00
iree_api_webgpu.js:76 (C)  ^
iree_api_webgpu.js:76 (C) D:/dev/projects/iree/runtime/src/iree/hal/drivers/webgpu/webgpu_device.c:136:59: runtime error: member access within misaligned address 0x005a24e8 for type 'iree_hal_webgpu_device_t' (aka 'struct iree_hal_webgpu_device_t'), which requires 16 byte alignment
iree_api_webgpu.js:76 (C) 0x005a24e8: note: pointer points here
iree_api_webgpu.js:76 (C)  4b 12 02 00 01 00 00 00 f4 7f 02 00 00 00 00 00 05 00 00 00 18 37 5c 00 11 00 00 00 00 00 00 00
iree_api_webgpu.js:76 (C)  ^
?function=multiple_r…lts_webgpu.vmfb:187 IREE initialized, ready to load programs.
iree_api_webgpu.js:76 (C) D:/dev/projects/iree/runtime/src/iree/hal/drivers/webgpu/webgpu_device.c:270:15: runtime error: member access within misaligned address 0x005a24e8 for type 'iree_hal_webgpu_device_t' (aka 'struct iree_hal_webgpu_device_t'), which requires 16 byte alignment
iree_api_webgpu.js:76 (C) 0x005a24e8: note: pointer points here
iree_api_webgpu.js:76 (C)  4b 12 02 00 03 00 00 00 f4 7f 02 00 00 00 00 00 05 00 00 00 18 37 5c 00 11 00 00 00 00 00 00 00
iree_api_webgpu.js:76 (C)  ^
iree_api_webgpu.js:76 (C) D:/dev/projects/iree/runtime/src/iree/hal/drivers/webgpu/webgpu_device.c:270:49: runtime error: member access within misaligned address 0x005a24e8 for type 'iree_hal_webgpu_device_t' (aka 'struct iree_hal_webgpu_device_t'), which requires 16 byte alignment
iree_api_webgpu.js:76 (C) 0x005a24e8: note: pointer points here
iree_api_webgpu.js:76 (C)  4b 12 02 00 03 00 00 00 f4 7f 02 00 00 00 00 00 05 00 00 00 18 37 5c 00 11 00 00 00 00 00 00 00
iree_api_webgpu.js:76 (C)  ^
iree_api_webgpu.js:76 (C) D:/dev/projects/iree/runtime/src/iree/vm/bytecode/verifier.c:1127:5: runtime error: load of misaligned address 0x005c4472 for type 'const uint32_t' (aka 'const unsigned int'), which requires 4 byte alignment
iree_api_webgpu.js:76 (C) 0x005c4472: note: pointer points here
iree_api_webgpu.js:76 (C)  00 00 79 0d 02 00 00 00 00 00 10 02 80 0d 0d 00 00 00 01 00 0d 1c 00 00 00 02 00 0d 03 00 00 00
iree_api_webgpu.js:76 (C)  ^
iree_api_webgpu.js:76 (C) D:/dev/projects/iree/runtime/src/iree/vm/bytecode/verifier.c:1141:5: runtime error: load of misaligned address 0x005c4479 for type 'const uint16_t' (aka 'const unsigned short'), which requires 2 byte alignment
iree_api_webgpu.js:76 (C) 0x005c4479: note: pointer points here
iree_api_webgpu.js:76 (C)  00 00 00 10 02 80 0d 0d 00 00 00 01 00 0d 1c 00 00 00 02 00 0d 03 00 00 00 03 00 0d 11 00 00 00
iree_api_webgpu.js:76 (C)  ^
iree_api_webgpu.js:76 (C) Aborted(alignment fault)
web-sample-webgpu.js:750 Uncaught RuntimeError: Aborted(alignment fault)
    at abort (web-sample-webgpu.js:750:10)
    at alignfault (web-sample-webgpu.js:429:2)
    at imports.
```
Fixes attempted