iree-org / iree

A retargetable MLIR-based machine learning compiler and runtime toolkit.
http://iree.dev/
Apache License 2.0

Convert stack allocas to memories. #17685

Open lialan opened 1 week ago

lialan commented 1 hour ago

So far I'm hitting an issue in VM execution. Specifically, compare the IR dumps of this minimal test before and after this PR:

Before

%7 = llvm.alloca %6 x f32 {alignment = 64 : i64} : (i64) -> !llvm.ptr
...
llvm.store %24, %7 : f32, !llvm.ptr
...
%33 = llvm.load %7 : !llvm.ptr -> f32
...
llvm.store %39, %7 : f32, !llvm.ptr

After

%7 = llvm.load %arg2 : !llvm.ptr -> !llvm.struct<"iree_hal_executable_workgroup_state_v0_t", (i32, i32, i16, i16, i32, ptr, i32)>
%8 = llvm.extractvalue %7[5] : !llvm.struct<"iree_hal_executable_workgroup_state_v0_t", (i32, i32, i16, i16, i32, ptr, i32)>
...
llvm.store %25, %8 : f32, !llvm.ptr
...
%34 = llvm.load %8 : !llvm.ptr -> f32
...
llvm.store %40, %8 : f32, !llvm.ptr

Notice that the alloca has been replaced by the pointer stored in the 6th field (index 5) of the workgroup state, which points to local memory: https://github.com/iree-org/iree/blob/9da0309b0491df57629a2177ab1dbec4aa73ae6e/runtime/src/iree/hal/local/executable_library.h#L346

According to the comments there, the local memory allocation may not exist at all (in which case the pointer is nullptr), or it may be smaller than we expect. This information has to be queried at runtime.
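
To make the runtime check concrete, here is a rough C sketch of the guard the generated code would need before using the local memory pointer. The struct is only a stand-in that mirrors the LLVM type in the dump above, (i32, i32, i16, i16, i32, ptr, i32). The field names, the helper name `try_get_local_scratch`, and the reading of the last field as a size in bytes are my assumptions, not something taken from the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Stand-in mirroring the LLVM struct type in the dump:
// (i32, i32, i16, i16, i32, ptr, i32). Field names are assumptions.
typedef struct workgroup_state_sketch_t {
  uint32_t field0;
  uint32_t field1;
  uint16_t field2;
  uint16_t field3;
  uint32_t field4;
  void* local_memory;          // index 5: may be NULL
  uint32_t local_memory_size;  // index 6: assumed to be the bytes available
} workgroup_state_sketch_t;

// Hypothetical helper: returns the local memory pointer only when it exists
// and can hold `required_bytes`; otherwise returns NULL so the caller can
// fall back to something else (e.g. a stack allocation).
static void* try_get_local_scratch(const workgroup_state_sketch_t* state,
                                   size_t required_bytes) {
  assert(state != NULL);
  if (state->local_memory == NULL) return NULL;
  if ((size_t)state->local_memory_size < required_bytes) return NULL;
  return state->local_memory;
}
```

The "After" IR above does the equivalent of using `state->local_memory` unconditionally, which is why a missing or undersized allocation would break execution.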

@benvanik question: is there a way to determine at compile time whether local memory will be allocated, and how large it will be? Specifically, this happens inside the ConvertToLLVM pass.