llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.3k stars 11.69k forks source link

[ASAN] [RFC] For Asan instrumented global, emit two symbols, one with actual size and other with instrumented size. #70684

Open skc7 opened 11 months ago

skc7 commented 11 months ago

Consider Below HIP sample test from ROCm repo upstream. Test case: module_api_global

Kernel : vcpy_kernel.cpp .... device float myDeviceGlobal; ....

Host : runKernel.cpp .... /LINE/ hipModuleGetGlobal((void**)&deviceGlobal, &deviceGlobalSize, Module, "myDeviceGlobal") ....

Issue: With asan enabled, myDeviceGlobal size is read as 32 bytes(actual + redzone), it should have been 4 bytes(float).

Here is the issue in detail: AddressSanitizer pass replaces the global variable with an instrumented global variable which will have additional redzone size included. This instrumented global will have the same name as actual global.

Symbol table entry for the instrumented global would report the padded size (actual size+ redzone size) but has the name of actual global variable.

AMD language runtimes provide queries for the size of device global symbols and functions to copy data to and from device global variables. Runtime gets the needed information form the ELF symbol table. So, when it querires the size of device global variable, it gets the padded size rather than actual size.

To fix this, https://github.com/llvm/llvm-project/pull/66666 was created, where actual size of global was emitted for the symbol. But @hctim pointed out that the approach would have issue with relinkers.

As a solution, we are proposing the below approach in #70166

  1. One symbol with actual global variable name and actual size.
  2. Other symbol will have different name and size of instrumented global variable.
llvmbot commented 11 months ago

@llvm/issue-subscribers-clang-codegen

Author: Chaitanya (skc7)

Asan instrumented global variable size includes actual size plus redzone size. Symbol table entry for the instrumented global would report the padded size (actual size+ redzone size) but has the name of actual global variable. AMD language runtimes provide queries for the size of device global symbols and functions to copy data to and from device global variables. Runtime gets the needed information form the ELF symbol table. So, when it querires the size of device global variable, it gets the padded size rather than actual size. To fix this, https://github.com/llvm/llvm-project/pull/66666 was created, where actual size of global was emitted for the symbol. But @hctim pointed out that the approach would have issue with relinkers. **As a solution, we are proposing the below approach in #70166** - Identify instrumented global variable by attaching "asan_instrumented" attribute to the global. (https://github.com/llvm/llvm-project/pull/68865 under review). - For the instrumented global, emit two symbols at the same offset but with different sizes. One with actual global variable name and actual size. Other symbol will have different name and size of instrumented variable.
AdvenamTacet commented 11 months ago

Hey, it would be nice to have an example showing current value and more detailed explanation of proposed changes. It's hard to understand the RFC without reading issues and PRs, and honestly I'm not sure what exactly you are suggesting.

Could you create snippets showing how it works now and how it would work after the change? And example showing when current values are problematic?

Also, I think RFC would gain much more traction on LLVM discourse.