llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.87k stars 11.93k forks source link

Missed optimization - over-strict alias analysis? #54646

Open OfekShilon opened 2 years ago

OfekShilon commented 2 years ago

This code:

struct S { long int a; long int b; };
void f(S& s1, const S& s2 ) {
    s1.a = s2.a;
    s1.b = s2.b;
}

optimizes well into vectorized assembly:

        movups  xmm0, xmmword ptr [rsi]
        movups  xmmword ptr [rdi], xmm0

Surprisingly, wrapping the primitive long ints in structs seems to prevent optimizations due to aliasing concerns. This:

struct Wrapper { long int t; };
struct S1 { Wrapper a; Wrapper b; };

void f(S1& s1, const S1& s2 ) {
    s1.a = s2.a;
    s1.b = s2.b;
}

produces suboptimal assembly:

        movq    (%rsi), %rax
        movq    %rax, (%rdi)
        movq    8(%rsi), %rax
        movq    %rax, 8(%rdi)

And an optimization-remark, produced by the gvn pass - typically on suspected aliasing:

load of type i64 not eliminated because it is clobbered by store

There was no aliasing detected on the un-wrapped source, it seems certainly there shouldn't be one on the wrapped source.

OfekShilon commented 2 years ago

This seems to be a very common pattern, obviously in copy ctors but also in many other contexts. Fixing this might have measurable impact.

Seems like the added layer of abstraction (Wrapper) somehow makes the compiler "forget" that the different fields do not alias.

OfekShilon commented 2 years ago

Note: this persists even when using distinct types to try and inform the compiler that the arguments can't alias by strict-aliasing:

struct Wrapper1 { long int t; };
struct Wrapper2 { long int t; };
struct S1 { Wrapper1 a; Wrapper2 b; };
struct S2 { Wrapper1 a; Wrapper2 b; };

void f(S1& s1, const S2& s2 ) {
    s1.a = s2.a;
    s1.b = s2.b;
}

still gives the suboptimal assembly -

f(S1&, S2 const&):                          # @f(S1&, S2 const&)
        movq    (%rsi), %rax
        movq    %rax, (%rdi)
        movq    8(%rsi), %rax
        movq    %rax, 8(%rdi)
        retq
jdoerfert commented 2 years ago

I think this is something the memory region intrinsics might be able to solve (https://reviews.llvm.org/D115274). @LebedevRI

OfekShilon commented 2 years ago

I think this is something the memory region intrinsics might be able to solve (https://reviews.llvm.org/D115274). @LebedevRI

@jdoerfert Shouldn't TBAA have kicked in at the last example?

struct Wrapper1 { long int t; };
struct Wrapper2 { long int t; };
struct S1 { Wrapper1 a; Wrapper2 b; };
struct S2 { Wrapper1 a; Wrapper2 b; };

void f(S1& s1, const S2& s2 ) {
    s1.a = s2.a;
    s1.b = s2.b;
}
davidbolvansky commented 2 years ago

cc @fhahn @nikic

OfekShilon commented 2 years ago

Also cc @manman-ren @kosarev Some more info bits for the single-wrapper-type case here:

struct Wrapper { long int t; };   // more surprises if you try 'int' instead..
struct S1 { Wrapper a; Wrapper b; };

// Good code generated:
void f1(S1& s1, const S1& s2 ) {
    s1=s2;
}

// Bad code generated:
void f2(S1& s1, const S1& s2 ) {
    s1.a = s2.a;
    s1.b = s2.b;
}

// Back to good code:
void f3(S1& s1, const S1& s2 ) {
    s1.a.t = s2.a.t;
    s1.b.t = s2.b.t;
}

About two wrapper types: I thought C++ strict-aliasing should kick in and prevent aliasing, but checking the source I'm not sure whether it's even implemented:

// If [the two locations] have different roots, they're part of different potentially // unrelated type systems, so we return Alias to be conservative.

The implementation indeed does that. Could it be that strict aliasing isn't implemented for compound types? Is it to serve some non-C++ needs?

OfekShilon commented 1 year ago

@kosarev I learnt today that the switch -new-struct-path-tbaa (your work) solves these alias-analysis problems: https://godbolt.org/z/zETbG8cdK I suspect just disabling this check would be enough for my narrow purposes (i.e., calling mergeTBAAInfoForMemoryTransfer unconditioned on the presence of -new-struct-path-tbaa) but can't say for certain.

What is the status of this switch? Can it be used in production? Is it on its way to be the new default behavior? I'm aware of problems it had in the past but AFAICT they're all solved. Your work finally makes strong-typedefs (i.e., wrappers for other types) really a zero-cost abstraction, and makes them as efficient as the underlying type. This can have a measurable impact on projects like ours, which have lots of usage of strong-typedef libraries. I hope it will be the default behavior soon.