llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.44k stars 11.75k forks source link

incorrect values of variable templates #61028

Open Taw3e8 opened 1 year ago

Taw3e8 commented 1 year ago

Bug occurs at current x86_64 Clang (trunk) but not on x86_64 Clang 15.0.0. Relevant code on Godbolt: https://godbolt.org/z/7v8jaM8r6 (templated variables have incorrect values eg: fib<4> should be 3 like in other compilers). If fib<10>; is uncommented issue disappears.

#include <iostream>

template<int N>
long long fib = fib<N-1> + fib<N-2>;

template<>
long long fib<1> = 1;

template<>
long long fib<0> = 0;

int main()
{
    std::cout << "0 1 2 3 4 5\n";
    //[[maybe_unused]] int i = fib<10>;
    std::cout << fib<0> << ' ';
    std::cout << fib<1> << ' ';
    std::cout << fib<2> << ' ';
    std::cout << fib<3> << ' ';
    std::cout << fib<4> << ' ';
    std::cout << fib<5> << ' ';

    return 0;
}
llvmbot commented 1 year ago

@llvm/issue-subscribers-c-1

llvmbot commented 1 year ago

@llvm/issue-subscribers-clang-codegen

AaronBallman commented 1 year ago

Erich, feel free to reassign as you see fit -- I mostly just don't want this to fall off anyone's plate, because this seems like a release blocker to me.

erichkeane commented 1 year ago

I messed around with this a bit, and discovered that the problem is the global-constructor order.

@llvm.global_ctors = appending global [4 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @__cxx_global_var_init, ptr @_Z3fibILi2EE }, { i32, ptr, ptr } { i32 65535, ptr @__cxx_global_var_init.2, ptr @_Z3fibILi4EE }, { i32, ptr, ptr } { i32 65535, ptr @__cxx_global_var_init.1, ptr @_Z3fibILi3EE }, { i32, ptr, ptr } { i32 65535, ptr @__cxx_global_var_init.3, ptr @_Z3fibILi5EE }]

Note that is init'ing fib<2>, then fib<4>, then fib<3> and finally fib<5>.

I have no idea what could have caused that, but it is at least not a template failure (whew!). Perhaps @efriedma-quic or @rjmccall is around and can look into this? This is ABSOLUTELY a release blocker IMO.

erichkeane commented 1 year ago

This is caused by https://github.com/llvm/llvm-project/commit/f9969a3d28e738e9427e371aac06d71269220123 @yuanfang-chen : I'm VERY tempted to have you revert that patch, unless you have a simple fix we can cherry-pick to 16.0.

The problem is that:

fib<2>: Lex-order: 4
fib<3>: Lex-Order: 7
fib<4>: Lex-Order:6
fib<5>: Lex-order: 8

So whatever the source of that lex-ordering is incorrect.

I SEE that this issue came up during review of the follow-up patch to that: https://reviews.llvm.org/D126341 which was never submitted. Do you have a patch to fix this based on @zygoloid and @rnk 's feedback that you can contribute SOON? Or should we just revert?

yuanfang-chen commented 1 year ago

Isn't this cwg362? (which is unspecified behavior). However, https://github.com/llvm/llvm-project/commit/f9969a3d28e738e9427e371aac06d71269220123 is to fix a correctness issue.

erichkeane commented 1 year ago

Isn't this cwg362? (which is unspecified behavior). However, f9969a3 is to fix a correctness issue.

That seems to refer to class template static data members, or has that language since been expanded? It would be a REAL shame to mess this up. I'll leave it to @AaronBallman to decide what we want to do with it. What is the 'correctness' issue that the patch fixes? The test doesn't seem to be particularly motivating comparatively.

yuanfang-chen commented 1 year ago

This test case should be fixed by https://reviews.llvm.org/D127259. Feel free to revert and we could reland the fix together with D127259 once it's approved.

yuanfang-chen commented 1 year ago

But then we have a regression on https://github.com/llvm/llvm-project/issues/55804.

erichkeane commented 1 year ago

This test case should be fixed by https://reviews.llvm.org/D127259. Feel free to revert and we could reland the fix together with D127259 once it's approved.

D127259 seems to be waiting on you. Reading it, I don't see there being much compile time impact of note, and I think it is acceptable as is, so perhaps the solution is to just merge that as the fix for this, and cherry-pick it to 16.0?

But then we have a regression on #55804.

Got it, thanks!

AaronBallman commented 1 year ago

I think this is https://eel.is/c++draft/basic.start.dynamic#1 and Clang is not technically incorrect with its behavior because the initialization is unordered. fib<0> and fib<1> are fine, but fib<2> and higher values cause an implicit instantiation that is unordered. You can work around the issue by using constexpr because that stops the dynamic initialization from happening. So I'm going to remove the miscompilation tag I added earlier as I don't think this is a miscompile. I'm leaving the regression tag for the moment because this still has the potential to silently change program behavior in unexpected ways.

shafik commented 1 year ago

Just for clarification fib<0> and fib<1> are statically initialized but when initializing fib<2> it implicitly instantiates fib<3> and fib<4> and those initializations are dynamic and unordered.

tstellar commented 1 year ago

What's the status of https://reviews.llvm.org/D127259, is this going to be merged?

AaronBallman commented 1 year ago

What's the status of https://reviews.llvm.org/D127259, is this going to be merged?

That landed in e423885e272c0e57c6d240d07bc934e0a8649417. However, I fetched the changes, rebuilt Clang, and tried the example from this report and it continues to give the same behavior as before:

F:\source\llvm-project>cat "C:\Users\aballman\OneDrive - Intel Corporation\Desktop\test.cpp"
#include <iostream>

template<int N>
long long fib = fib<N-1> + fib<N-2>;

template<>
long long fib<1> = 1;

template<>
long long fib<0> = 0;

int main()
{
        std::cout << "0 1 2 3 4 5\n";
        //[[maybe_unused]] int i = fib<10>;
        std::cout << fib<0> << ' ';
        std::cout << fib<1> << ' ';
        std::cout << fib<2> << ' ';
        std::cout << fib<3> << ' ';
        std::cout << fib<4> << ' ';
        std::cout << fib<5> << ' ';

        return 0;
}

F:\source\llvm-project>llvm\out\build\x64-Debug\bin\clang.exe "C:\Users\aballman\OneDrive - Intel Corporation\Desktop\test.cpp"

F:\source\llvm-project>a.exe
0 1 2 3 4 5
0 1 1 2 1 3

so it does not appear to have fixed this issue. That makes me lean towards reverting the original patch. @yuanfang-chen -- thoughts?

yuanfang-chen commented 1 year ago

Hi @AaronBallman, thanks for catching this. It looks like the variable templates are handled slightly differently. I'm looking at this today. If it's not something that could be fixed soon, I'll revert the original patch.

Taw3e8 commented 1 year ago

@AaronBallman I'm afraid reverting might not help: https://godbolt.org/z/xEKTb16Go

#include <iostream>

template<int N>
long long fib = fib<N-1> + fib<N-2>;

template<>
long long fib<1> = 1;

template<>
long long fib<0> = 0;

int main()
{
    std::cout << "fib<10> = " << fib<10> << '\n';

    return 0;
}

this code prints:

gcc (trunk): // correct fib<10> = 55

clang (trunk): fib<10> = 34

clang 3.5: fib<10> = 0

clang 15.0.0: fib<10> = 0

So it seems the issue dates way back to clang 3.5 (c++14).

The code from first example https://godbolt.org/z/7v8jaM8r6 works since clang 3.7

Sorry for the confusion...

rjmccall commented 1 year ago

I mean, the code is wrong. It's certainly regrettable that C++ works this way, but the order of initialization of templated global variables is (mostly) not guaranteed, and given the language model for them (eager initialization before the start of main) I can't really imagine a way to guarantee the desired behavior. If constexpr isn't good enough, I suggest using static locals, which have a much better language model (on-demand initialization prior to use) that ensures initialization in dependency order.

Taw3e8 commented 1 year ago

Maybe a warning then? I think it would be very beneficial since it is not obvious what the initialization order is. I think simple suggestion to make a variable constexpr would suffice in this case.

rjmccall commented 1 year ago

We could probably figure out a way to warn here, yeah. Maybe on any reference from the dynamic initializer of one global variable to another variable that we can see has dynamic initialization but where there isn't a required initialization inter-ordering?

AaronBallman commented 1 year ago

I mean, the code is wrong.

Agreed, the code has unspecified behavior. That said, https://reviews.llvm.org/D127259 was an experiment that doesn't seem to have panned out, so I think it does make sense to revert that. I thought I had read a comment in https://reviews.llvm.org/D127233 that suggested it was also an experiment, but I think I must have confused it with the other review.

Adding a warning for this case would be great (for Clang 17+).

llvmbot commented 1 year ago

@llvm/issue-subscribers-clang-frontend

tru commented 1 year ago

should this be picked over to the release branch as well?

AaronBallman commented 1 year ago

should this be picked over to the release branch as well?

I don't think there's a strong reason to do so, but if someone feels otherwise, they're welcome to make a case.

tstellar commented 1 year ago

Moving this out to 16.0.1 rather than closing to give people more time to advocate for the backport.

tru commented 1 year ago

Do we still want to do something about this issue in 16.x or can I drop it from our queue?

AaronBallman commented 1 year ago

I think it can be dropped from the queue.

cor3ntin commented 2 months ago

FYI the original behavior is still observed in trunk https://godbolt.org/z/TjWYde9eq @yuanfang-chen