utterances-bot opened 2 years ago
“Changing MLIR IR itself needs to meet a very lofty bar nowadays because of its center-of-gravity nature. All the tools are expected to process it, and so many transformations and different organizations' workflows go through it. Even if you don’t need to plumb through the full flow that frequently, tweaking a small aspect of the IR can still trigger a surprising ripple effect. So that naturally means changes are slow and require extensive discussions and sign-offs from many stakeholders. That is all necessary to guarantee the quality of MLIR IR; but if I just have a very isolated need, it would be quite hard to motivate a change and justify its landing.”
"MLIR" -> "LLVM" in this part?
@basicmi: Yes, good catch! I've fixed it. Thanks for pointing it out!
Wonderful post, as ever, Lei! In terms of evolution, using LLVM for GPU compilation will require moving away from LLVM IR itself, I think. The tradeoffs and semantics start diverging quickly, and even different GPU-based IRs may well have slightly different semantics. So, a container IR would be ideal to allow different middle/back-ends to express their own semantic needs. Do you see MLIR as that container IR? How would different backends use MLIR to further describe their semantics (say, SPIR-V vs. DXIL vs. CUDA)?
MLIR is also suitable for circuit generation, which means a program could be partially compiled into hardware. Or MLIR could be compiled into a bunch of microservices and a database structure. Don't stop your imagination.
@dnovillo: Yeah, I think MLIR can be the infrastructure for such purposes. It provides consistent infra and scaffolding for different variants of GPU IRs; what ops and what semantics to have is up to each GPU IR to define. Actually, as of right now we already have multiple GPU-related dialects: the GPU dialect for vendor-agnostic common host/device abstractions, the NVGPU/AMDGPU dialects for NVIDIA/AMD-specific stuff, the SPIR-V dialect for lowering to and exiting the MLIR system, etc.
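To make that concrete, here is a minimal sketch of the vendor-agnostic layer (the module, kernel, and function names are made up for illustration; the ops are from upstream MLIR's gpu/func/arith/memref dialects):

```mlir
// A host function launching a vendor-agnostic GPU kernel. From here, the
// gpu.module body can be lowered to NVVM/ROCDL or to the SPIR-V dialect.
module attributes {gpu.container_module} {
  gpu.module @kernels {
    // Elementwise add: each thread handles one element.
    gpu.func @add(%a: memref<16xf32>, %b: memref<16xf32>,
                  %out: memref<16xf32>) kernel {
      %tid = gpu.thread_id x
      %x = memref.load %a[%tid] : memref<16xf32>
      %y = memref.load %b[%tid] : memref<16xf32>
      %sum = arith.addf %x, %y : f32
      memref.store %sum, %out[%tid] : memref<16xf32>
      gpu.return
    }
  }
  func.func @main(%a: memref<16xf32>, %b: memref<16xf32>,
                  %out: memref<16xf32>) {
    %c1 = arith.constant 1 : index
    %c16 = arith.constant 16 : index
    // One block of 16 threads; the same IR can target multiple GPU backends.
    gpu.launch_func @kernels::@add
        blocks in (%c1, %c1, %c1) threads in (%c16, %c1, %c1)
        args(%a : memref<16xf32>, %b : memref<16xf32>, %out : memref<16xf32>)
    return
  }
}
```

The dialects give each backend a place to attach its own semantics: the abstract launch above stays target-neutral, while NVIDIA-, AMD-, or SPIR-V-specific ops only appear after lowering into the corresponding dialect.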
This post has been extremely helpful in understanding XLA and IREE, the ML compilers for TPU and Android/iOS, respectively. The best way to understand a field is to learn how people thought and worked back then. The path from LLVM and SPIR to SPIR-V and MLIR clearly reveals the future direction. Thank you @antiagainst!
Beautiful post! I enjoyed the narrative / historical part (the unavoidable necessity, now more than ever, of having a new IR), as much as the philosophical digression.
You might want to change
"An IR can have three forms: an in-memory form for efficient analysis and transformation, an bytecode form for persistence and exchange, and a textual form for human inspection and debugging."
to
"An IR can have three forms: an in-memory form for efficient analysis and transformation, a bytecode form for persistence and exchange, and a textual form for human inspection and debugging."
mjs
@erl4ng: Thanks for pointing it out! Fixed now. :)
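For readers curious what those three forms look like concretely in MLIR: the snippet below is the textual form of a trivial module, and the comments sketch how a stock `mlir-opt` round-trips it through the other two forms (a minimal sketch; file names are illustrative):

```mlir
// Textual form: human-readable, good for inspection and debugging.
// Persist/exchange it as bytecode and read it back:
//   mlir-opt example.mlir --emit-bytecode -o example.mlirbc
//   mlir-opt example.mlirbc -o roundtripped.mlir
// While running, mlir-opt holds the module in the in-memory form,
// which is what analyses and transformation passes operate on.
module {
  func.func @identity(%x: i32) -> i32 {
    return %x : i32
  }
}
```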
Compilers and IRs: LLVM IR, SPIR-V, and MLIR | Lei.Chat()
Overall discussion of compilers and IRs (LLVM IR, SPIR-V, and MLIR): why they are the way they are and how they might evolve
https://www.lei.chat/posts/compilers-and-irs-llvm-ir-spirv-and-mlir/