llvm / clangir

A new (MLIR based) high-level IR for clang.
https://clangir.org
Other
383 stars 100 forks source link

[CIR][CIRGen] Add complex type and its CIRGen support #513

Closed Lancern closed 4 months ago

Lancern commented 7 months ago

This PR adds !cir.complex type to model the _Complex type in C. It also contains support for its CIRGen.

In detail, this patch adds the following CIR types, ops, and attributes:

CIRGen support for some of the fundamental complex number operations is also included. ~Note the implementation diverges from the original clang CodeGen, where expressions of complex types are handled differently from scalars and aggregates. Instead, this patch treats expressions of complex types as scalars, as such expressions can be simply lowered to a CIR value of !cir.complex type.~

This PR addresses #445 .

Lancern commented 7 months ago

@bcardosolopes Hi Bruno, thanks for your review! Let me explain in more details about why I'm not following the skeleton here.

The original clang CodeGen divide expressions into three categories according to their types: 1) scalar, 2) complex, and 3) aggregate. CodeGen handles these three categories of expressions differently:

So the CodeGen of scalar expressions and complex expressions are actually very similar. I believe the reason why CodeGen distinguishes them is that CodeGen of scalar expressions yields llvm::Value * while CodeGen of complex expressions yields std::pair<llvm::Value *, llvm::Value *>.

OK, that's enough of background story. In CIR, what makes a difference is that instead of using something like std::pair<mlir::Value, mlir::Value> to represent a complex value, we now have the !cir.complex type which allows us to use a single mlir::Value to represent a complex value. In other words, in CIR, if we keep following the skeleton:

If we follow the skeleton, the code in CIRGenExprScalar.cpp and CIRGenExprComplex.cpp (which hasn't been invented) would be very very similar to each other, resulting in unnecessary redundancy. Thus I decided to treat expressions of complex types as scalar expressions in CIRGen. This eases the implementation greatly and it still follow the intuition.

I hope this explanation is clear enough for you! If you are still confused please let me know.

bcardosolopes commented 7 months ago

Thanks for the detailed explanation (very clarifying) and for thinking about code duplication!

If we follow the skeleton, the code in CIRGenExprScalar.cpp and CIRGenExprComplex.cpp (which hasn't been invented) would be very very similar to each other, resulting in unnecessary redundancy. Thus I decided to treat expressions of complex types as scalar expressions in CIRGen. This eases the implementation greatly and it still follow the intuition.

clang/lib/CodeGen/CGExprComplex.cpp is quite big. Are you confident that current missing complex codegen will just apply with all scalar stuff? I have the feeling that as soon as something differs, we'll need to change scalar emitters to take complex types into account, and because we're assuming scalar handles it all, we won't have asserts to catch what's not currently implemented - so might end up with bad codegen which will take time to investigate and debug.

I believe duplication will lead us to converge faster and have less issues. How about you add a FIXME(cir): unlike traditional CodeGen, this shares common code with ScalarExprEmitter to every method that is the same between ComplexExprEmitter and ScalarExprEmitter, so that when we complete complex support, we can decide if there's a refactoring approach we could take?

Lancern commented 7 months ago

How about you add a FIXME(cir): unlike traditional CodeGen, this shares common code with ScalarExprEmitter to every method that is the same between ComplexExprEmitter and ScalarExprEmitter, so that when we complete complex support, we can decide if there's a refactoring approach we could take?

OK I believe this is also acceptable and it allows us to evaluate this divergence more deeply before we can make further decisions. I'll move the complex CIRGen code to a new CIRGenComplex.cpp file and keep the upstream code structure for now.

Lancern commented 7 months ago

Rebased onto the latest main, but not yet started to work on the reviews.

bcardosolopes commented 7 months ago

@Lancern sorry, I was out on vacation. Let me know when you address all reviews and update the PR.

Lancern commented 7 months ago

@bcardosolopes Hi Bruno, I have updated the implementation so that it now follows the upstream skeleton.

bcardosolopes commented 6 months ago

Let me know once this is ready for review again!

bcardosolopes commented 6 months ago

@lanza rebased over the weekend, sorry for the churn. Can you rebase this PR again?

Lancern commented 6 months ago

Rebased onto the latest main.

Lancern commented 5 months ago

@bcardosolopes Hi Bruno, I believe this PR is now ready for another round of review.

I decide to keep this PR simpler (but the size of the PR is still quite big). In the latest version I made the following adjustments:

All supported operations are tested in clang/test/CIR/CodeGen/complex.c . Support for more complex number operations can be added in future PRs.

Lancern commented 4 months ago

After looking the PR, I'm not sure we need #cir.imag, it should just be an int or fp attribute directly.

I prefer keeping the #cir.complex to be used directly with cir.const, it composes well with the rest of CIR consts.

Hi Bruno, thanks for your reviews here. The problem with #cir.complex is that there is no literal expression in C that corresponds to it directly. One may think that the initializer list expression {real, imag} should be lowered to a #cir.complex but this could only happen when real and imag are both constant expressions. Otherwise you have no choice but fallback to cir.complex.create. Of course we could make {real, imag} lower to cir.const #cir.complex by trying to constant-evaluate the two elements during CIRGen.

#cir.imag, on the other hand, corresponds to the literal expression 1i, 2i, etc. In clang these literal expressions are of complex type and this is exactly what I'm trying to model in CIR here.

Lancern commented 4 months ago

Rebased onto the latest main.

Lancern commented 4 months ago

I agree that #cir.imag looks inconsistent and not promising. Removed it from this PR, and now imaginary literals such as 1i is lowered to three ops: two cir.const and one cir.complex.create:

int _Complex c = 3i;
// %0 = cir.const #cir.int<0> : !s32i
// %1 = cir.const #cir.int<3> : !s32i
// %c = cir.complex.create %0, %1 : !s32i -> !cir.complex<!s32i>

Besides, updated #cir.zero so that it can be used for creating zero values of complex type:

int _Complex c;
// cir.global external @c = #cir.zero : !cir.complex<!s32i>
Lancern commented 4 months ago

Ok, I was thinking we could have a fold operation of sorts to canonicalize this

This looks nice. I'll add this in another PR once this PR lands!

bcardosolopes commented 4 months ago

This looks nice. I'll add this in another PR once this PR lands!

Sounds good!

Lancern commented 4 months ago

Rebased onto the latest main.

Lancern commented 4 months ago

Rebased onto the latest main.

bcardosolopes commented 4 months ago

Pending replies to the questions asked.

Lancern commented 4 months ago

Pending replies to the questions asked.

It seems to me that all questions should be resolved? Am I missing some questions? GitHub does not show my response to your latest question in the thread, it's here: https://github.com/llvm/clangir/pull/513#discussion_r1649904257

bcardosolopes commented 4 months ago

It seems to me that all questions should be resolved? Am I missing some questions? GitHub does not show my response to your latest question in the thread, it's here: #513 (comment)

Perfect, got confused by github! All seems good then, thanks for all the hard work here!