Closed dkolsen-pgi closed 3 months ago
After inlining and constant folding optimizations have happened, the switch statement can be optimized away
Yea, I've seen many cases internally where even after the opt pipeline does its job we might still have the indirection layer.
Regarding the overall approach: it might be worth trying to change the existing atomic operations to accept a "value or mem_order" (we have something similar with calls, for supporting both indirect and direct calls within the same op; the LLVM IR dialect also has examples) and only break this into a switch in LoweringPrepare. This leaves some room for CIR passes to constant propagate / idiom recognize before we expand it, and future fold methods could then switch to the "mem_order" version, avoiding the early expansion.
it might be worth trying to change existing atomic operations to accept a "value or mem_order"
I already have it implemented doing the expansion during CodeGen. I am cleaning up the change and getting ready to commit it. I don't want to re-implement it right now based on speculation that it might be better in the long run. I do like the idea and it has merit, but I think we should revisit it when the ClangIR optimizations and transformations are further along.
I think it has merit even if we don't consider the opt speculation, because it keeps CIR simple to read and reason about at the first level out of CIRGen, but at this point I do care more about getting this down to LLVM and the incremental approach sounds good. Once you put the PR up I'll file another issue to track this type of extra abstraction (we might find volunteers to tackle it).
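As a rough illustration of the "value or mem_order" idea discussed above, here is a small C++ model of an operand that holds either a constant memory order or a runtime value. Every name in it (`ConstOrder`, `RuntimeOrder`, `OrderOperand`, `needsSwitchExpansion`, `foldOrder`) is hypothetical and invented for this sketch; none of them is an actual CIR or MLIR type or API. It only shows the shape of the proposal: the switch expansion is needed at lowering time only if the operand is still a runtime value, and a fold step can rewrite the operand to the constant form first.

```cpp
#include <optional>
#include <variant>

// Hypothetical model, not a real CIR data structure.
struct ConstOrder { int order; };           // memory order known at compile time
struct RuntimeOrder { unsigned valueId; };  // memory order held in some runtime value

using OrderOperand = std::variant<ConstOrder, RuntimeOrder>;

// At lowering time, only a runtime operand would need the switch expansion.
inline bool needsSwitchExpansion(const OrderOperand &op) {
  return std::holds_alternative<RuntimeOrder>(op);
}

// Sketch of a fold: if an earlier pass proved the runtime value is a known
// constant, rewrite the operand to the constant form; otherwise keep it.
inline OrderOperand foldOrder(const OrderOperand &op, std::optional<int> known) {
  if (std::holds_alternative<RuntimeOrder>(op) && known)
    return ConstOrder{*known};
  return op;
}
```

With this shape, an operand that has been folded to `ConstOrder` before lowering never pays for the switch.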
RFE: Support atomic built-ins with memory order arguments that are runtime values rather than compile-time values.
When the compiler sees
result = __atomic_fetch_add(ptr, val, order);
where the value of `order` is not a constant, CodeGen must generate CIR that switches on the runtime value of `order` and performs the atomic operation with a constant memory order in each case.

While this seems wasteful and unnecessary at first, it is needed to support `std::atomic`, which is an extremely important use case. Because `std::atomic` puts an extra function in between the user and the atomic built-in, the memory order looks like a runtime value even though it is a constant in the source code. For example, the user might call `fetch_add` on a `std::atomic` object and pass a constant memory order. While the memory order is a constant in the user code, inside the definition of `std::atomic<T>::fetch_add` the memory order is just a function parameter whose value isn't known at compile time, at least during CodeGen. After inlining and constant folding optimizations have happened, the switch statement can be optimized away. But it needs to exist during CodeGen and will survive all the way through the compilation if no optimization happens.
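To make the shape of that switch concrete, here is a source-level C++ sketch of the expansion, using the GCC/Clang `__atomic_fetch_add` built-in and the predefined `__ATOMIC_*` macros. It is not the CIR that CodeGen emits. The helper `fetch_add_runtime_order` and the wrapper class `Counter` are hypothetical names made up for this example, and the grouping of cases (consume handled as acquire, unknown values falling back to relaxed) is an assumption chosen for illustration.

```cpp
// Hypothetical helper: dispatch a runtime memory order to calls that each
// use a constant memory order.
inline int fetch_add_runtime_order(int *ptr, int val, int order) {
  switch (order) {
  case __ATOMIC_CONSUME:  // assumption: treat consume like acquire
  case __ATOMIC_ACQUIRE:
    return __atomic_fetch_add(ptr, val, __ATOMIC_ACQUIRE);
  case __ATOMIC_RELEASE:
    return __atomic_fetch_add(ptr, val, __ATOMIC_RELEASE);
  case __ATOMIC_ACQ_REL:
    return __atomic_fetch_add(ptr, val, __ATOMIC_ACQ_REL);
  case __ATOMIC_SEQ_CST:
    return __atomic_fetch_add(ptr, val, __ATOMIC_SEQ_CST);
  case __ATOMIC_RELAXED:
  default:                // assumption: unknown values fall back to relaxed
    return __atomic_fetch_add(ptr, val, __ATOMIC_RELAXED);
  }
}

// Hypothetical stand-in for a std::atomic-style wrapper: inside fetch_add,
// `order` is an ordinary function parameter, so it is not a constant until
// the call is inlined and constant folded.
class Counter {
  int value_ = 0;

public:
  int fetch_add(int val, int order) {
    return fetch_add_runtime_order(&value_, val, order);
  }
  int load() { return __atomic_load_n(&value_, __ATOMIC_SEQ_CST); }
};
```

A caller writes something like `c.fetch_add(1, __ATOMIC_RELAXED)`, which is a constant at the call site. Once `fetch_add` is inlined and the constant is propagated, only the matching case remains and the switch disappears; without optimization, the whole switch stays in the generated code.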