intel / mlir-extensions

Intel® Extension for MLIR. A staging ground for MLIR dialects and tools for Intel devices using the MLIR toolchain.
Other
126 stars 45 forks source link

The GPUBarrierConversion is not align to the SPIRV memory model #649

Open chengjunlu opened 1 year ago

chengjunlu commented 1 year ago

The MLIR GPUBarrierConversion is not align to the SPIRV memory model.

https://github.com/llvm/llvm-project/blob/592199c8fe58b2b102b3ca4019d17e25f8090be4/mlir/lib/Conversion/GPUToSPIRV/GPUToSPIRV.cpp#L413

https://www.khronos.org/blog/comparing-the-vulkan-spir-v-memory-model-to-cs

Please help to double confirm whether the logic is consistent in lowering the mlir::gpu::BarrierOp to SPIRV's barrier op.

Jianhui-Li commented 1 year ago

The description is unclear. What exactly you mean that the LLVM GPUBarrierConversion? How it is not aligned to SPIRV memory model?

chengjunlu commented 1 year ago

Justification C++ evolved on systems with coherent CPU caches, and thus it is natural for cache maintenance operations not to be explicitly exposed. When C++ is compiled to a system with noncoherent caches, the appropriate cache maintenance operations can be folded into release and acquire operations.

Vulkan and SPIR-V evolved on systems with many different execution units where each execution unit’s cache may not be coherent with any others, and thus exposing cache maintenance is a natural part of the API/language and can be a source of optimizations.

Without explicit availability and visibility operations, hazards that don’t need certain cache maintenance operations (e.g. Write-after-Write needs none) would still be bound to perform them, which would incur a performance penalty.

image

I assume the GPU barrier has the semantic of acquire-release ordering in memory too.

Jianhui-Li commented 1 year ago

The spirv barrier from reduction example has the following form : OpControlBarrier %uint_2 %uint_2 %uint_264

264 is 0x108, which translated to acquire-release + workgroup scope, as defined by the spec:

https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#_memory_semantics_id

Is that what you ask for?

chengjunlu commented 1 year ago

From the doc, it seems the VulKan/SPIR-V requiring the MakeAvailable | MakeVisible | AcquireRelease.

I am not sure whether the GPU barrier should follow this.