jbush001 / NyuziToolchain

Port of LLVM/Clang C compiler to Nyuzi parallel processor architecture

Loop vectorizer bloats code in some cases #103

Open jbush001 opened 5 years ago

jbush001 commented 5 years ago

The following code:

void* memset(void *dst, int c, unsigned int n)
{
    int i;
    for (i = 0; i < n; i++)
        ((char*) dst)[i] = c;

    return dst;
}

With the "Loop Vectorization" pass enabled, the code above compiles to a very large unrolled loop containing 1024 store instructions:

    and s3, s2, 1023
    sub_i s3, s2, s3
    move s4, 0
    b .LBB0_3
.LBB0_3:                                # %vector.body
                                        # =>This Inner Loop Header: Depth=1
    add_i s5, s0, s4
    store_8 s1, 1(s5)
    store_8 s1, (s5)
    store_8 s1, 2(s5)
    store_8 s1, 3(s5)
    store_8 s1, 4(s5)
    store_8 s1, 5(s5)
    store_8 s1, 6(s5)
    store_8 s1, 7(s5)
...
    store_8 s1, 1023(s5)
    add_i s4, s4, 1024
    cmpne_i s5, s4, s3
    bnz s5, .LBB0_3
    b .LBB0_4
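
For readers less familiar with the assembly, the listing can be read roughly as the C below. This is only a sketch: the name `memset_as_emitted` and the `CHUNK` macro are made up for illustration, the inner `for` loop stands in for the fully unrolled run of `store_8` instructions, and the remainder loop is an assumption, since the code after `.LBB0_4` is not shown.

```c
/* Hypothetical C rendering of the loop structure in the listing above.
   CHUNK mirrors the 1024 byte stores emitted per loop iteration. */
#define CHUNK 1024u

void *memset_as_emitted(void *dst, int c, unsigned int n)
{
    char *p = (char *)dst;
    /* "and" + "sub_i": n rounded down to a multiple of CHUNK */
    unsigned int main_count = n - (n & (CHUNK - 1));
    unsigned int i = 0;

    /* vector.body: each pass writes CHUNK bytes, then i advances by CHUNK */
    for (; i != main_count; i += CHUNK) {
        /* stands in for store_8 s1, 0(s5) ... store_8 s1, 1023(s5) */
        for (unsigned int k = 0; k < CHUNK; k++)
            p[i + k] = (char)c;
    }

    /* Assumed scalar remainder loop for the last n % CHUNK bytes
       (not visible in the listing) */
    for (; i < n; i++)
        p[i] = (char)c;

    return dst;
}
```

The point is that every byte still goes through its own scalar store; the transformation only increases code size.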

Also, it doesn't seem to be using vector instructions at all here: every store in the loop is a scalar store_8. We probably need to tweak the cost model to discourage this.
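
Until the cost model is adjusted, one possible per-loop workaround is Clang's loop hint pragma. The sketch below is untested on Nyuzi and may or may not avoid the bloat there; the function name `memset_novec` is hypothetical (chosen to avoid clashing with the C library), and the index is declared `unsigned int` to match `n`, which differs from the original.

```c
/* Same loop as in the report, with Clang loop hints added.
   memset_novec is a hypothetical name used only for this example. */
void *memset_novec(void *dst, int c, unsigned int n)
{
    unsigned int i; /* unsigned to match n; the original used int */

    /* Ask Clang not to vectorize or interleave this loop.
       Guarded so other compilers do not warn about an unknown pragma. */
#ifdef __clang__
#pragma clang loop vectorize(disable) interleave(disable)
#endif
    for (i = 0; i < n; i++)
        ((char *)dst)[i] = (char)c;

    return dst;
}
```

Clang also accepts `-fno-vectorize` on the command line, which turns the loop vectorizer off for a whole translation unit rather than a single loop.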

jbush001 commented 5 years ago

The pass is implemented in NyuziToolchain/lib/Transforms/Vectorize/LoopVectorize.cpp.

jbush001 commented 5 years ago

Loop vectorization is currently disabled by default in tools/clang/lib/Driver/ToolChains/Clang.cpp:

@@@ -4447,6 -4852,6 +4854,10 @@@ void Clang::ConstructJob(Compilation &C
    // selected. For optimization levels that want vectorization we use the alias
    // option to simplify the hasFlag logic.
    bool EnableVec = shouldEnableVectorizerAtOLevel(Args, false);
++
++  // XXX Nyuzi
++  EnableVec = false;
++
    OptSpecifier VectorizeAliasOption =
        EnableVec ? options::OPT_O_Group : options::OPT_fvectorize;
    if (Args.hasFlag(options::OPT_fvectorize, VectorizeAliasOption,