agenium-scale / boost.simd

Boost SIMD

boost::simd::pow performance (MSVC) #532

SuperflyJon closed this issue 5 years ago

SuperflyJon commented 7 years ago

I'm new to SIMD programming and was comparing the performance of pow() between the standard library version:

    for (int i = 0; i < SIZE; i++)
        for (int j = 0; j < SIZE; j++)
            C[i][j] = pow(A[i][j], B[i][j]);

and using AVX with this library:

    for (int i = 0; i < SIZE; i++)
    {
        double *pA = A[i];
        double *pB = B[i];
        double *pC = C[i];
        for (int j = 0; j < SIZE; j += 4)
        {
            boost::simd::pack<double> sa(pA);
            boost::simd::pack<double> sb(pB);
            boost::simd::store(boost::simd::pow(sa, sb), pC);

            pA += 4;
            pB += 4;
            pC += 4;
        }
    }

The test uses 10 sets of 1024 x 1024 arrays. The timings are 350 ms for the standard library version vs 7000 ms for the Boost.SIMD version.
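For anyone trying to reproduce timings like these, here is a minimal sketch of a timing harness for the scalar baseline. It is not the code used for the numbers above: `time_ms` and `pow_scalar` are hypothetical helper names, and the arrays are assumed to be flat `std::vector<double>` buffers rather than the 2D arrays in the original test.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: run `work` `repeats` times and return the total
// wall-clock time in milliseconds, measured with a monotonic clock.
template <typename F>
double time_ms(F&& work, int repeats)
{
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    for (int r = 0; r < repeats; ++r)
        work();
    const std::chrono::duration<double, std::milli> elapsed = clock::now() - start;
    return elapsed.count();
}

// Hypothetical scalar baseline: c[i] = pow(a[i], b[i]) over flat arrays
// of equal length (an assumption; the original test uses 2D arrays).
void pow_scalar(const std::vector<double>& a,
                const std::vector<double>& b,
                std::vector<double>& c)
{
    assert(a.size() == b.size() && b.size() == c.size());
    for (std::size_t i = 0; i < a.size(); ++i)
        c[i] = std::pow(a[i], b[i]);
}
```

A call such as `time_ms([&] { pow_scalar(a, b, c); }, 10)` would time ten passes over the data, and the SIMD loop could be wrapped in a lambda and timed the same way. Both versions should be measured in the same optimized build configuration for the comparison to be meaningful.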

Note that a similar test with multiplication showed the expected speedup (68 ms -> 28 ms).

I notice there are a few performance-related issues with MSVC (I'm using the 2017 version), but in general, is it reasonable to expect the SIMD pow function to be faster than the standard library version?

jfalcou commented 7 years ago

Hi,

This is related to #479 and to the fact that MSVC has some pre-vectorized standard library functions. We'll investigate this.