llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Union and parameter lowering in Clang #57842

Closed sidtver closed 1 year ago

sidtver commented 1 year ago

Let's consider a simple example in C.

#include <stdint.h>
typedef struct { uint64_t a; uint32_t b; uint64_t c; } S1;
typedef struct { uint32_t a; uint64_t b; uint32_t c; } S2;
typedef union { S1 x; S2 y; } S3;

extern void g( S3);
extern S3 A, B;

void f() {
    S3 v;
    B = A;
    v.x.a = 3;
    v.y.b = 5LL << 32LL;
    v.x.c = 7;
    g( v);
}

There are two circumstances. (1) LLVM IR has no unions in its type system; it has only structs. (2) Load and store operations on structs copy only the struct's fields, not the whole memory area of the struct as in C/C++.

So let's look at the LLVM IR of the example for x86.

%union.S3 = type { %struct.S1 }
%struct.S1 = type { i64, i32, i64 }

Clang lowers the union S3 to a struct named %union.S3. Only the first member, S3.x, is kept; the second member, S3.y, is dropped.

tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 8 dereferenceable(24) bitcast (%union.S3* @B to i8*), i8* noundef nonnull align 8 dereferenceable(24) bitcast (%union.S3* @A to i8*), i64 24, i1 false), !tbaa.struct !3

The copy from A to B is emitted as a call to the intrinsic @llvm.memcpy.*. It is impossible to load the value of A and store the loaded value to B, because in that case (in the x86 backend) the field S3.x.b would be copied as a 32-bit value, so the high 32 bits of S3.y.b would be lost. This is easy to check.
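The loss can be reproduced in plain C. The following is a minimal sketch, assuming a typical 64-bit layout in which S1 has four padding bytes after b (checked by the static assertions). The helpers copy_s1_fields_only, y_b_after_field_copy and y_b_after_memcpy are hypothetical names introduced only for illustration; the byte-wise copy merely imitates a field-by-field struct copy and is not what any backend actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint64_t a; uint32_t b; uint64_t c; } S1;
typedef struct { uint32_t a; uint64_t b; uint32_t c; } S2;
typedef union { S1 x; S2 y; } S3;

/* Assumption: S1.b is followed by 4 padding bytes that overlap S2.b. */
_Static_assert(offsetof(S1, c) - offsetof(S1, b) == 8, "expected padding after S1.b");
_Static_assert(offsetof(S2, b) == offsetof(S1, b), "expected S2.b to start at S1.b");

/* Hypothetical helper: copy only the bytes of S1's named fields, skipping padding. */
void copy_s1_fields_only(S3 *dst, const S3 *src) {
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    memcpy(d + offsetof(S1, a), s + offsetof(S1, a), sizeof(uint64_t));
    memcpy(d + offsetof(S1, b), s + offsetof(S1, b), sizeof(uint32_t));
    memcpy(d + offsetof(S1, c), s + offsetof(S1, c), sizeof(uint64_t));
}

/* Store `value` in S3.y.b, copy field by field, and read S3.y.b back. */
uint64_t y_b_after_field_copy(uint64_t value) {
    S3 src, dst;
    memset(&src, 0, sizeof src);
    memset(&dst, 0, sizeof dst);
    src.y.b = value;
    copy_s1_fields_only(&dst, &src);
    return dst.y.b;
}

/* Same round trip, but copying the whole memory area as memcpy does. */
uint64_t y_b_after_memcpy(uint64_t value) {
    S3 src, dst;
    memset(&src, 0, sizeof src);
    memset(&dst, 0, sizeof dst);
    src.y.b = value;
    memcpy(&dst, &src, sizeof dst);
    return dst.y.b;
}
```

Under these assumptions, on little-endian x86-64 y_b_after_field_copy(5ULL << 32) should return 0, because the bytes holding the high half of S3.y.b sit in S1's padding, while y_b_after_memcpy returns the value unchanged.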

The x86 ABI passes such a struct in memory; in IR this appears as a pointer argument with the byval attribute:

tail call void @g(%union.S3* nonnull byval(%union.S3) align 8 %1) #4

That is why the IR contains no explicit copy of the argument value for the call to g.

Now let's look at the LLVM IR for MIPS (mips64el).

; Function Attrs: nounwind
define void @f() local_unnamed_addr #0 {
  tail call void @llvm.memcpy.p0i8.p0i8.i64(i8* noundef nonnull align 8 dereferenceable(24) bitcast (%union.S3* @B to i8*), i8* noundef nonnull align 8 dereferenceable(24) bitcast (%union.S3* @A to i8*), i64 24, i1 false), !tbaa.struct !3
  tail call void @g(i64 inreg 3, i64 inreg 21474836480, i64 inreg 7) #3
  ret void
}

declare void @g(i64 inreg, i64 inreg, i64 inreg) local_unnamed_addr #2

The Clang frontend splits the union S3 into three i64 values, which become the parameters of function g. Accesses to the struct fields are likewise transformed into bit operations.
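The three i64 arguments in the MIPS IR correspond to the three 8-byte words of the union's storage. Here is a minimal C sketch of that correspondence, assuming the 24-byte layout that the IR implies. The helpers make_v and s3_to_words are hypothetical and only illustrate the mapping; they are not how Clang performs the coercion internally.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint64_t a; uint32_t b; uint64_t c; } S1;
typedef struct { uint32_t a; uint64_t b; uint32_t c; } S2;
typedef union { S1 x; S2 y; } S3;

/* Assumption: the union occupies exactly three 64-bit words. */
_Static_assert(sizeof(S3) == 3 * sizeof(uint64_t), "expected a 24-byte union");

/* Hypothetical helper: build the value that f() passes to g().
   The storage is zeroed first so that every byte is well defined. */
void make_v(S3 *v) {
    memset(v, 0, sizeof *v);
    v->x.a = 3;
    v->y.b = 5LL << 32;
    v->x.c = 7;
}

/* Hypothetical helper: view the union's storage as three 64-bit words. */
void s3_to_words(const S3 *v, uint64_t words[3]) {
    memcpy(words, v, sizeof(S3));
}
```

With this layout the three words should match the constants in the call above: 3, 21474836480 (that is, 5 << 32) and 7.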

Together, (1) and (2) force Clang to apply very low-level, target-dependent transformations (at the level of the stack and registers) in the frontend. This produces unnatural IR for targets that can pass struct values directly. It also amounts to premature optimization in a single frontend (Clang), even though the target backends are able to handle parameter passing correctly themselves. What is needed is a uniform optimization (at the MachineInstr or LLVM IR stage) for all frontends, not only for Clang.

As an alternative, I see a possible change to union lowering, as follows. It could be an early LLVM pass or part of Clang's AST-to-IR lowering. Suppose that additional fields are inserted into every structure that has implicit padding. For example, let's replace

%struct.S1 = type { i64, i32, i64 }

with

%struct.S1 = type { i64, i32, i32, i64 }
                               ^
                               |
                               inserted field

After such a transformation it becomes possible to use load/store operations for copying and to pass struct-typed parameters directly in IR.
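The idea can be sketched in C as well, assuming the same 64-bit layout as in the example at the top. S1Padded, S3Padded and the two helper functions are hypothetical names used only for this illustration: the padding after b becomes an explicit field, so a field-by-field copy covers every byte that S2.b occupies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* S1 with its implicit padding turned into an explicit field. */
typedef struct { uint64_t a; uint32_t b; uint32_t pad; uint64_t c; } S1Padded;
typedef struct { uint32_t a; uint64_t b; uint32_t c; } S2;
typedef union { S1Padded x; S2 y; } S3Padded;

/* Assumptions: no implicit padding remains, and S2.b overlaps b and pad. */
_Static_assert(sizeof(S1Padded) == 8 + 4 + 4 + 8, "S1Padded should have no implicit padding");
_Static_assert(offsetof(S2, b) == offsetof(S1Padded, b), "expected S2.b to start at S1Padded.b");

/* Field-by-field copy; with the explicit pad field it touches all 24 bytes. */
void copy_padded_fields(S3Padded *dst, const S3Padded *src) {
    dst->x.a = src->x.a;
    dst->x.b = src->x.b;
    dst->x.pad = src->x.pad;
    dst->x.c = src->x.c;
}

/* Store `value` in y.b, copy field by field, and read y.b back. */
uint64_t y_b_after_padded_field_copy(uint64_t value) {
    S3Padded src, dst;
    memset(&src, 0, sizeof src);
    memset(&dst, 0, sizeof dst);
    src.y.b = value;
    copy_padded_fields(&dst, &src);
    return dst.y.b;
}
```

Here the value stored through y.b should survive the field-by-field copy, because both of its halves are now carried by named fields of the struct.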

llvmbot commented 1 year ago

@llvm/issue-subscribers-clang-codegen

efriedma-quic commented 1 year ago

clang has code to fill in padding of structs, if we want to... we just avoid it when possible for the sake of making the IR more readable. See CGRecordLowering::insertPadding.

We avoid loads and stores of struct types for other reasons. Mostly related to optimizations. The SelectionDAG doesn't really handle them efficiently. And as a consequence of us avoiding them, IR optimizations don't really handle them well either.

If you have other questions about the current design, you can start a thread on Discourse (https://discourse.llvm.org). A bugtracker isn't really a good place for that sort of thing.

sidtver commented 1 year ago

Thank you for the information provided. This "issue" is now complete, I didn't know about Discourse before. I will think about it and maybe ask a revised question in Discourse.