llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.05k stars 11.98k forks source link

[LoopVectorize] Clang fails to vectorize simple polynomial hash #56999

Open xortator opened 2 years ago

xortator commented 2 years ago

Here is example of a simple polynomial hash that could not be auto-vectorized with SSE 4.1 (LV could not prove legality):

https://godbolt.org/z/86o9zPT51 Original test:

unsigned rabin_karp_naive(unsigned *s, unsigned len) {
    unsigned hash = 0;
    for (unsigned i = 0; i < len; i++) {
        hash *= 31;
        hash += s[i];
    }
    return hash;
}

This is a very typical example of polynomial hash, used, for example, in Rabin-Karp algorithm (https://en.wikipedia.org/wiki/Rabin%E2%80%93Karp_algorithm). Another iconic application is Java String hashcode.

When we restructure the code into its equivalent, LV can vectorize it:

unsigned rabin_karp_vector(unsigned *s, unsigned len) {
    unsigned hash = 0;
    unsigned k[4] = {31 * 31 * 31, 31 * 31, 31, 1};
    const unsigned vector_factor = 4;
    unsigned i = 0;
    for (i = 0; i + 3 < len; i += 4) {
        for (unsigned j = 0; j < 4; j++) {
            hash += k[j] * s[i + j];
        }
        for (unsigned j = 0; j < 4; j++) {
            k[j] *= 31 * 31 * 31 * 31;
        }
    }
    for (; i < len; i++) {
        hash *= 31;
        hash += s[i];
    }
    return hash;
}

It would be nice if the original example was vectorizable.

xortator commented 2 years ago

Another (also pretty straightforward) version of same algorithm doesn't vectorize either:

unsigned rabin_karp_naive_v2(unsigned *s, unsigned len) {
    unsigned hash = 0;
    unsigned mul = 1;
    for (unsigned i = 0; i < len; i++) {
        hash += mul * s[len - 1 - i];
        mul *= 31;
    }
    return hash;
}