chapel-lang / chapel

a Productive Parallel Programming Language
https://chapel-lang.org

GCC vs. Clang (Chpl compiler) Optimization Issue #22312

Open kwaters4 opened 1 year ago

kwaters4 commented 1 year ago

Summary of Problem

I was working on another issue when I ran into a performance difference between Clang (Chpl) and GCC. It may be out of scope for the Chapel compiler, but the compiler writers may be interested in it.

While writing a sin function calculator, I saw very different performance between the GCC compiler and the Clang (Chpl) compiler. I first noticed this while working with C-extern functions, and then reproduced it separately with standalone Clang (versions 11.0.0 and 14.0.6). I am not sure how much the Chapel compiler does when it comes to external C code, or whether the issue lies entirely in the LLVM back-end.

Please feel free to close this ticket if you think this should be submitted further upstream.

Steps to Reproduce

Adding a return statement is what triggers the performance difference. I noticed this when returning a value from a C-extern function.

Source Code: Optimized/Fast Code without Return Statement:

#include <stdint.h>
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <time.h>

int main() {

    time_t t;
    srand((unsigned) time(&t));
    int size = 256;
    int iterations = 100000000;
    float answer = 0;

    for (int i = 0; i < iterations; i++) {
        int random_number = rand() % (size-1);
        answer = 2 * 6.28 * sin(2.0 * 3.1415927 * (float) random_number / (float) size);
    }
}

With Return Statement

#include <stdint.h>
#include <stdio.h>
#include <math.h>
#include <stdlib.h>
#include <time.h>

int main() {

    time_t t;
    srand((unsigned) time(&t));
    int size = 256;
    int iterations = 100000000;
    float answer = 0;

    for (int i = 0; i < iterations; i++) {
        int random_number = rand() % (size-1);
        answer = 2 * 6.28 * sin(2.0 * 3.1415927 * (float) random_number / (float) size);
    }

    return answer;
}

Compile commands:

gcc sin.c -O3 -lm
clang sin.c -O3 -lm

Execution command: time ./sin.x

Time for GCC w/o return: 0m0.441s
Time for Clang w/o return: 0m0.537s

Time for GCC with return: 0m0.445s
Time for Clang with return: 0m2.317s

Configuration Information

mppf commented 1 year ago

Especially since a call to sin is involved, the issue might have to do with vectorization. One thing that I ran into recently is this:

https://github.com/llvm/llvm-project/blob/870eb04f1005da8278673f3cd1d1a640d16b63e6/llvm/include/llvm/Analysis/TargetLibraryInfo.h#L85-L100

AFAIK we aren't selecting a vector library today in the Chapel compiler, which would mean that we won't vectorize calls to functions like sin.