manodeep / Corrfunc

⚡️⚡️⚡️Blazing fast correlation functions on the CPU.
https://corrfunc.readthedocs.io
MIT License
164 stars 50 forks source link

Compiler error with icc15 #100

Closed manodeep closed 7 years ago

manodeep commented 7 years ago

@lgarrison

The latest Corrfunc produces this compiler error on Stampede:

$ icc -DDOUBLE_PREC  -DVERSION=\"2.0.0\" -DUSE_UNICODE -std=c99 -m64 -g -Wsign-compare -Wall -Wextra -Wshadow -Wunused -fPIC -D_POSIX_SOURCE=200809L -D_GNU_SOURCE -D_DARWIN_C_SOURCE -O3  -xhost -opt-prefetch -opt-prefetch-distance=16  -openmp -I../../io -I../../utils  -c countpairs_impl_double.c -o countpairs_impl_double.o
": internal error: ** The compiler has encountered an unexpected problem.
** Segmentation violation signal raised. **
Access violation or stack overflow. Please contact Intel Support for assistance.

$ icc --version
icc (ICC) 15.0.2 20150121
Copyright (C) 1985-2015 Intel Corporation.  All rights reserved.

The error goes away if I remove the optimization options -xhost -opt*. However, adding in -march=native breaks the compilation again! Currently, there seems to be no way of adding in AVX code-generation on Stampede.

Achievement unlocked?

manodeep commented 7 years ago

Similar issue with icc13

$ icc --version
icc (ICC) 13.1.0 20130121
Copyright (C) 1985-2013 Intel Corporation.  All rights reserved.

$ icc -DDOUBLE_PREC  -DVERSION=\"2.0.0\" -DUSE_UNICODE -std=c99 -m64 -g -Wsign-compare -Wall -Wextra -Wshadow -Wunused -fPIC -D_POSIX_SOURCE=200809L -D_GNU_SOURCE -D_DARWIN_C_SOURCE -O3  -xhost -opt-prefetch -opt-prefetch-distance=16  -openmp -I../../io -I../../utils  -c countpairs_impl_double.c -o countpairs_impl_double.o
": internal error: backend signals

compilation aborted for countpairs_impl_double.c (code 4)
lgarrison commented 7 years ago

I confirm I can reproduce this on icc 14. The problem goes away when I remove -xhost, but then comes back as soon as I add either -mavx or -msse4.2.

This is a strange problem, because the weights implementation really parallels the rpavg implementation. There shouldn't be any surprises for the compiler in there!

lgarrison commented 7 years ago

The mocks build and run fine with icc, but none of the theory functions do. It seems likely that there's some shared file or code pattern that shows up in the theory functions, but not the mocks.

lgarrison commented 7 years ago

The problem somehow seems related to weight_union_DOUBLE in weight_functions.h.src, which is defined as

typedef union {
#ifdef __AVX__
    AVX_FLOATS a;
#endif
#ifdef __SSE4_2__
    SSE_FLOATS s;
#endif
    DOUBLE d;
} weight_union_DOUBLE;

Even without -xhost (so __AVX__ is never defined and no AVX code is built, and the a field is never referenced), you can change the definition (in the double precision case) to

#include <immintrin.h>
typedef union {
    __m256d a;
    double d;
} weight_union_double;

and get the same compiler failure.

Even with this insight, I've been unable to produce a minimal working example to demonstrate the failure, although I'll keep trying.

If all else fails, we can probably remove the weight_union entirely and create pair_struct_avx_DOUBLE and pair_struct_sse_DOUBLE. This actually wouldn't even be that ugly of a solution.

lgarrison commented 7 years ago

Found the problem and the fix! The last ingredient to trigger the failure is the inline struct initialization. Here's a MWE:

#include <immintrin.h>
#include <stdio.h>
#include <stdint.h>

#define MAX_NUM_WEIGHTS 10

typedef union {
    __m256d a;
    double d;
} weights_union;

typedef struct
{
    int64_t num_weights;
    weights_union weights0[MAX_NUM_WEIGHTS];
    weights_union dx;
} pair_struct;

int main(void){
    int64_t num_weights = 0;
    pair_struct pair = {.num_weights = num_weights,
                       .dx.d=0.
                       };

    return 0;
}

Compile with icc -std=c99 -Wall test.c -o test to get the failure.

There are two ways to fix this: swap __m256d a and double d in weights_union, or move int64_t num_weights to the last field in pair_struct. I've done the latter and will check it into the repo.

This problem is surely something to do with alignment requirements, but even so, icc probably shouldn't segfault!

manodeep commented 7 years ago

Since the variable with the highest alignment requirement should appear first anyway - swapping num_weights with weights0 is a good idea.

Curious though, do you get the failure if you change MAX_NUM_WEIGHTS to 8?

lgarrison commented 7 years ago

Yes, the failure seems insensitive to the value of MAX_NUM_WEIGHTS.

On Fri, Nov 11, 2016 at 5:21 PM, Manodeep Sinha notifications@github.com wrote:

Since the variable with the highest alignment requirement should appear first anyway - swapping num_weights with weights0 is a good idea.

Curious though, do you get the failure if you change MAX_NUM_WEIGHTS to 8?

— You are receiving this because you modified the open/close state. Reply to this email directly, view it on GitHub https://github.com/manodeep/Corrfunc/issues/100#issuecomment-260069337, or mute the thread https://github.com/notifications/unsubscribe-auth/AFYPy4M9UJ5Ewpi_-WBQ6jcXDnZtn4qFks5q9Op8gaJpZM4KvNbc .

manodeep commented 7 years ago

Thanks!

manodeep commented 7 years ago

Reported to Intel Dev Forums: https://software.intel.com/en-us/forums/intel-c-compiler/topic/703112