microsoft / llvm-mctoll

Fix SSE stack promotion #142

Closed martin-fink closed 3 years ago

martin-fink commented 3 years ago

This PR consists of three related commits:

  1. Update reinterpretSSERegValue to output more efficient code. Previously, converting double -> <4 x i32> produced the following IR:

    ; previously
    %0 = bitcast double %arg1 to i64
    %1 = zext i64 %0 to i128
    %2 = bitcast i128 %1 to <4 x i32>
    ; now
    %0 = insertelement <2 x double> zeroinitializer, double %arg1, i64 0
    %1 = bitcast <2 x double> %0 to <4 x i32>

    Another example, for the <4 x i32> -> double direction:

    ; previously
    %0 = bitcast <4 x i32> %arg1 to i128
    %1 = trunc i128 %0 to i64
    %2 = bitcast i64 %1 to double
    ; now
    %0 = bitcast <4 x i32> %arg1 to <2 x double>
    %1 = extractelement <2 x double> %0, i64 0
  2. Pass variadic arguments as double. Discovered variadic arguments should always be a double or a float, so when we discover a vector type for an argument, we assume it should be a double.

  3. Save SSE values in a <4 x i32> stack slot instead of i128. This changes the type of stack slots for xmm registers to <4 x i32>, which fixes an issue where values loaded from stack slots were not treated as SSE values.

I've added tests for commits 2 and 3, but not for commit 1, as it does not add any functionality. I've checked that all existing tests still pass with these changes.
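
Since commit 1 has no dedicated test, the following is a rough sanity check that the old and new sequences from commit 1 should produce the same bits. It is only a sketch in Python, not part of the PR or of llvm-mctoll: the function names are hypothetical, the IR operations are modelled with the struct module, and a little-endian layout (as on x86-64) is assumed.

```python
import struct

MASK64 = (1 << 64) - 1


def double_to_v4i32_old(x):
    """Model: bitcast double -> i64, zext -> i128, bitcast -> <4 x i32>."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    # zext leaves the upper 64 bits zero, so lanes 2 and 3 are 0.
    return [(bits >> (32 * i)) & 0xFFFFFFFF for i in range(4)]


def double_to_v4i32_new(x):
    """Model: insertelement into a zero <2 x double>, bitcast -> <4 x i32>."""
    raw = struct.pack("<2d", x, 0.0)  # element 0 = x, element 1 = +0.0
    return list(struct.unpack("<4I", raw))


def v4i32_to_double_old(lanes):
    """Model: bitcast <4 x i32> -> i128, trunc -> i64, bitcast -> double."""
    value = sum(lane << (32 * i) for i, lane in enumerate(lanes))
    (x,) = struct.unpack("<d", struct.pack("<Q", value & MASK64))
    return x


def v4i32_to_double_new(lanes):
    """Model: bitcast <4 x i32> -> <2 x double>, extractelement index 0."""
    raw = struct.pack("<4I", *lanes)
    return struct.unpack("<2d", raw)[0]
```

For example, double_to_v4i32_old(1.5) and double_to_v4i32_new(1.5) both return the same four lanes, and both extraction helpers read back 1.5 from those lanes regardless of what the upper two lanes hold.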