llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Accidental equality of classes templated by pointer to local static constant of templated function #47861

Open beef2f8d-76f6-467d-9088-334ee1ff5567 opened 3 years ago

beef2f8d-76f6-467d-9088-334ee1ff5567 commented 3 years ago
Bugzilla Link 48517
Version 5.0
OS Windows NT
CC @dwblaikie,@pogo59,@zygoloid
Fixed by commit(s) 6b760a50f52142e401a6380ff71f933cda22a909

Extended Description

Based on https://stackoverflow.com/q/65306562/147192.

The behavior of Clang (and GCC) is inconsistent between compile time and run time; however, it is unclear to me whether this inconsistency conforms to the C++17 standard or not.

Furthermore, in -O0 mode, Clang generates an unused symbol (create()::I) which causes the linker to fail, see https://godbolt.org/z/M4T7f3.

The following reduced program is expected to return 0 (invoking clang++ with -std=c++17), but it does not (https://godbolt.org/z/6r6vK3):

template <typename, typename>
struct is_same { static constexpr bool value = false; };

template <typename T>
struct is_same<T, T> { static constexpr bool value = true; };

template <typename T, typename U>
static constexpr bool is_same_v = is_same<T, U>::value;

using uintptr_t = unsigned long long;

template <int const* I>
struct Parameterized { int const* member; };

template <typename T>
auto create() {
    static constexpr int const I = 2;

    return Parameterized<&I>{ &I };
}

int main() {
    auto one = create<short>();
    auto two = create<int>();

    if (is_same_v<decltype(one), decltype(two)>) {
        return reinterpret_cast<uintptr_t>(one.member) == reinterpret_cast<uintptr_t>(two.member) ? 1 : 2;
    }

    return 0;
}

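Yet, on all versions of Clang where it compiles (from 5.0.0 onwards), and for all optimization levels (from -O1 to -O3), it returns 2, indicating:

- That one and two have the same type, which according to 17.4 [temp.type] should mean that they point to the same object.
- Yet they point to different objects: there are two instances of create<T>()::I, one for T = short and one for T = int.

The assembly listing clearly contains 2 different instances of create<T>()::I.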
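To make the two observations easier to tell apart, here is a minimal sketch that records the compile-time answer (are the types the same?) and the run-time answer (are the addresses the same?) separately instead of folding them into one exit code. The names Observation, observe and is_consistent are hypothetical helpers introduced only for this illustration; Parameterized and create are taken from the reduced program above, and std::is_same_v stands in for the hand-written trait.

```cpp
#include <cassert>
#include <type_traits>

template <int const* I>
struct Parameterized { int const* member; };

template <typename T>
auto create() {
    // One distinct object per instantiation of create<T>.
    static constexpr int I = 2;
    return Parameterized<&I>{ &I };
}

// Hypothetical helper type: the two facts the report compares.
struct Observation {
    bool same_type;     // decided at compile time
    bool same_address;  // decided from the actual pointer values
};

inline Observation observe() {
    auto one = create<short>();
    auto two = create<int>();
    return { std::is_same_v<decltype(one), decltype(two)>,
             one.member == two.member };
}

// The behavior is self-consistent when "same type" implies "same address".
inline bool is_consistent(Observation o) {
    return !o.same_type || o.same_address;
}
```

Calling is_consistent(observe()) from main and returning its negation as the exit code gives a nonzero status exactly when a compiler reports identical types for distinct objects, which is the situation described in this report.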

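Notes:

- If I is initialized with = sizeof(T) instead, then with -O1 to -O3 the program returns 0 as expected.
- Even with I initialized with = sizeof(T), with -O0 Clang still generates an unused symbol which causes the linker to fail: auto create<short>()::I and auto create<int>()::I, whereas the declared symbols do not have the leading auto.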

dwblaikie commented 3 years ago

So clang creates the GlobalVariable first at the behest of debug info emission here:

llvm::GlobalVariable::GlobalVariable
      at llvm/lib/IR/Globals.cpp:365
clang::CodeGen::CodeGenModule::GetOrCreateLLVMGlobal
      at clang/lib/CodeGen/CodeGenModule.cpp:3712
clang::CodeGen::CodeGenModule::GetAddrOfGlobalVar
      at clang/lib/CodeGen/CodeGenModule.cpp:3928
clang::CodeGen::CGDebugInfo::CollectTemplateParams
      at clang/lib/CodeGen/CGDebugInfo.cpp:1904
clang::CodeGen::CGDebugInfo::CollectCXXTemplateParams
      at clang/lib/CodeGen/CGDebugInfo.cpp:2023
clang::CodeGen::CGDebugInfo::CreateLimitedType
      at clang/lib/CodeGen/CGDebugInfo.cpp:3415
clang::CodeGen::CGDebugInfo::getOrCreateLimitedType
      at clang/lib/CodeGen/CGDebugInfo.cpp:3322
clang::CodeGen::CGDebugInfo::CreateTypeDefinition
      at clang/lib/CodeGen/CGDebugInfo.cpp:2402
clang::CodeGen::CGDebugInfo::CreateType
      at clang/lib/CodeGen/CGDebugInfo.cpp:2387
clang::CodeGen::CGDebugInfo::CreateTypeNode
      at clang/lib/CodeGen/CGDebugInfo.cpp:3260
clang::CodeGen::CGDebugInfo::getOrCreateType
      at clang/lib/CodeGen/CGDebugInfo.cpp:3178
clang::CodeGen::CGDebugInfo::EmitDeclare
      at clang/lib/CodeGen/CGDebugInfo.cpp:4185
clang::CodeGen::CGDebugInfo::EmitDeclareOfAutoVariable
      at clang/lib/CodeGen/CGDebugInfo.cpp:4306
clang::CodeGen::CodeGenFunction::EmitAutoVarAlloca
      at clang/lib/CodeGen/CGDecl.cpp:1604
clang::CodeGen::CodeGenFunction::EmitAutoVarDecl
      at clang/lib/CodeGen/CGDecl.cpp:1308
clang::CodeGen::CodeGenFunction::EmitVarDecl
      at clang/lib/CodeGen/CGDecl.cpp:208
clang::CodeGen::CodeGenFunction::EmitDecl
      at clang/lib/CodeGen/CGDecl.cpp:153
clang::CodeGen::CodeGenFunction::EmitDeclStmt
      at clang/lib/CodeGen/CGStmt.cpp:1250
clang::CodeGen::CodeGenFunction::EmitSimpleStmt
      at clang/lib/CodeGen/CGStmt.cpp:386
clang::CodeGen::CodeGenFunction::EmitStmt
      at clang/lib/CodeGen/CGStmt.cpp:55
clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope
      at clang/lib/CodeGen/CGStmt.cpp:476
clang::CodeGen::CodeGenFunction::EmitFunctionBody
      at clang/lib/CodeGen/CodeGenFunction.cpp:1181
clang::CodeGen::CodeGenFunction::GenerateCode
      at clang/lib/CodeGen/CodeGenFunction.cpp:1351
clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition
      at clang/lib/CodeGen/CodeGenModule.cpp:4731
clang::CodeGen::CodeGenModule::EmitGlobalDefinition
      at clang/lib/CodeGen/CodeGenModule.cpp:3081
clang::CodeGen::CodeGenModule::EmitGlobal
      at clang/lib/CodeGen/CodeGenModule.cpp:2833
clang::CodeGen::CodeGenModule::EmitTopLevelDecl
      at clang/lib/CodeGen/CodeGenModule.cpp:5546
#27 0x000000000a789442 in
      at clang/lib/CodeGen/ModuleBuilder.cpp:170
clang::BackendConsumer::HandleTopLevelDecl
      at clang/lib/CodeGen/CodeGenAction.cpp:218
clang::ParseAST
      at clang/lib/Parse/ParseAST.cpp:162
clang::ASTFrontendAction::ExecuteAction
      at clang/lib/Frontend/FrontendAction.cpp:1056
clang::CodeGenAction::ExecuteAction
      at clang/lib/CodeGen/CodeGenAction.cpp:1082
clang::FrontendAction::Execute
      at clang/lib/Frontend/FrontendAction.cpp:949
clang::CompilerInstance::ExecuteAction
      at clang/lib/Frontend/CompilerInstance.cpp:957
clang::ExecuteCompilerInvocation
      at clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:278
cc1_main 
      at clang/tools/driver/cc1_main.cpp:240
ExecuteCC1Tool
      at clang/tools/driver/driver.cpp:330
main  
      at clang/tools/driver/driver.cpp:407

And then seems to try to create it again at:

llvm::GlobalVariable::GlobalVariable 
  at llvm/lib/IR/Globals.cpp:365
clang::CodeGen::CodeGenModule::getOrCreateStaticVarDecl
  at clang/lib/CodeGen/CGDecl.cpp:266
clang::CodeGen::CodeGenFunction::EmitStaticVarDecl
  at clang/lib/CodeGen/CGDecl.cpp:398
clang::CodeGen::CodeGenFunction::EmitVarDecl
  at clang/lib/CodeGen/CGDecl.cpp:201
clang::CodeGen::CodeGenFunction::EmitDecl
  at clang/lib/CodeGen/CGDecl.cpp:153
clang::CodeGen::CodeGenFunction::EmitDeclStmt 
  at clang/lib/CodeGen/CGStmt.cpp:1250
clang::CodeGen::CodeGenFunction::EmitSimpleStmt
  at clang/lib/CodeGen/CGStmt.cpp:386
clang::CodeGen::CodeGenFunction::EmitStmt 
  at clang/lib/CodeGen/CGStmt.cpp:55
clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope
  at clang/lib/CodeGen/CGStmt.cpp:476
clang::CodeGen::CodeGenFunction::EmitFunctionBody
  at clang/lib/CodeGen/CodeGenFunction.cpp:1181
clang::CodeGen::CodeGenFunction::GenerateCode
  at clang/lib/CodeGen/CodeGenFunction.cpp:1351
clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition
  at clang/lib/CodeGen/CodeGenModule.cpp:4731
clang::CodeGen::CodeGenModule::EmitGlobalDefinition
  at clang/lib/CodeGen/CodeGenModule.cpp:3081
clang::CodeGen::CodeGenModule::EmitDeferred
  at clang/lib/CodeGen/CodeGenModule.cpp:2337
clang::CodeGen::CodeGenModule::Release
  at clang/lib/CodeGen/CodeGenModule.cpp:447
(anonymous namespace)::CodeGeneratorImpl::HandleTranslationUnit
  at clang/lib/CodeGen/ModuleBuilder.cpp:267
clang::BackendConsumer::HandleTranslationUnit
  at clang/lib/CodeGen/CodeGenAction.cpp:292
clang::ParseAST
  at clang/lib/Parse/ParseAST.cpp:171
clang::ASTFrontendAction::ExecuteAction
  at clang/lib/Frontend/FrontendAction.cpp:1056
clang::CodeGenAction::ExecuteAction
  at clang/lib/CodeGen/CodeGenAction.cpp:1082
clang::FrontendAction::Execute
  at clang/lib/Frontend/FrontendAction.cpp:949
clang::CompilerInstance::ExecuteAction
  at clang/lib/Frontend/CompilerInstance.cpp:957
clang::ExecuteCompilerInvocation
  at clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:278
cc1_main
  at clang/tools/driver/cc1_main.cpp:240
ExecuteCC1Tool
  at clang/tools/driver/driver.cpp:330
main
  at clang/tools/driver/driver.cpp:407

I haven't tested much of why this would be noteworthy for this new feature and not in many other cases of debug info and C++ constructs - perhaps Richard knows what might be at work here.

Very naively it looks like getOrCreateStaticVarDecl should be using GetOrCreateLLVMGlobal to ensure this sort of duplicate creation doesn't happen?
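The difference between "create" and "get or create" can be sketched with a toy symbol table. This is not the LLVM or Clang API: ToyModule, ToyGlobal, emitWithBlindCreate and emitWithLookup are hypothetical names made up for the illustration. The only behavior borrowed from LLVM is that creating a global under a name that is already taken yields a uniquified name with a numeric suffix.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy stand-in for a global variable. NOT llvm::GlobalVariable.
struct ToyGlobal {
    std::string name;
    bool defined = false;
};

// Toy stand-in for a module symbol table. NOT llvm::Module.
class ToyModule {
    std::map<std::string, ToyGlobal> globals_;
    unsigned last_unique_ = 0;  // shared counter for collision suffixes

public:
    // Always creates a new global; a name collision is resolved by
    // appending ".N", so the caller may get a different name back.
    ToyGlobal& create(const std::string& name) {
        std::string unique = name;
        while (globals_.count(unique) != 0)
            unique = name + "." + std::to_string(++last_unique_);
        ToyGlobal& g = globals_[unique];
        g.name = unique;
        return g;
    }

    // Reuses an existing global with this name if there is one.
    ToyGlobal& getOrCreate(const std::string& name) {
        auto it = globals_.find(name);
        return it != globals_.end() ? it->second : create(name);
    }

    std::size_t size() const { return globals_.size(); }
};

// Debug info asks for the symbol first, then the definition is created blindly.
inline std::string emitWithBlindCreate(ToyModule& m, const std::string& name) {
    m.getOrCreate(name);              // declaration requested by debug info
    ToyGlobal& def = m.create(name);  // second creation collides and is renamed
    def.defined = true;
    return def.name;
}

// Same sequence, but the definition path looks the symbol up first.
inline std::string emitWithLookup(ToyModule& m, const std::string& name) {
    m.getOrCreate(name);
    ToyGlobal& def = m.getOrCreate(name);  // reuses the earlier declaration
    def.defined = true;
    return def.name;
}
```

In this model the blind path leaves an undefined entry under the original name plus a defined entry under a suffixed name, while the lookup path ends with a single defined entry.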

dwblaikie commented 3 years ago

Fascinating. Seems to happen in the frontend/IR generation.

With debug info (but -Xclang -disable-llvm-passes):

$_ZZ6createIsEDavE1I.1 = comdat any

$_ZZ6createIiEDavE1I.2 = comdat any

@_ZZ6createIsEDavE1I = external dso_local constant i32, align 4
@_ZZ6createIiEDavE1I = external dso_local constant i32, align 4
@_ZZ6createIsEDavE1I.1 = linkonce_odr dso_local constant i32 2, comdat, align 4, !dbg !0
@_ZZ6createIiEDavE1I.2 = linkonce_odr dso_local constant i32 2, comdat, align 4, !dbg !23

Compared to (without debug info):

$_ZZ6createIsEDavE1I = comdat any

$_ZZ6createIiEDavE1I = comdat any

@_ZZ6createIsEDavE1I = linkonce_odr dso_local constant i32 2, comdat, align 4
@_ZZ6createIiEDavE1I = linkonce_odr dso_local constant i32 2, comdat, align 4

dwblaikie commented 3 years ago

On my list to take a look - if anyone else is poking around with this before me (few days at least) please update the bug with any findings and we can collaborate here.

ec04fc15-fa35-46f2-80e1-5d271f2ef708 commented 3 years ago

The remaining issue looks like a bug in debug information generation. If you build with -g0, the problem goes away (and, disturbingly, if you build with -g, the debug info symbols get the proper mangling and the actually-referenced-from-the-code symbols get the wrong .1 / .2 manglings). It's not clear to me if the problem goes deeper than debug info generation, though.

beef2f8d-76f6-467d-9088-334ee1ff5567 commented 3 years ago

Hello Richard,

The situation did improve with your patch (Thanks!), though the problem is not completely solved.

Selecting "Clang (trunk)" on godbolt (https://godbolt.org/z/9Ybqq5), which is based on 4c8c6368710 (2020-12-16 15:38:58 -0800) and so postdates 6b760a50f52 (2020-12-15 13:23:08 -0800), we can see:

The specific linker error:

/opt/compiler-explorer/gcc-snapshot/lib/gcc/x86_64-linux-gnu/11.0.0/../../../../x86_64-linux-gnu/bin/ld:
/tmp/example-a44fce.o:(.debug_info+0xc2): undefined reference to `create<short>()::I'
/opt/compiler-explorer/gcc-snapshot/lib/gcc/x86_64-linux-gnu/11.0.0/../../../../x86_64-linux-gnu/bin/ld:
/tmp/example-a44fce.o:(.debug_info+0xf1): undefined reference to `create<int>()::I'

Those symbols look exactly like what was generated in the assembly in Clang 11.0; however, in trunk the symbols generated in the assembly are now:

_ZZ6createIsEDavE1I.1:
        .long   2                               # 0x2

_ZZ6createIiEDavE1I.2:
        .long   2                               # 0x2

These extra .1 and .2 are different from before.

It appears that the linker still expects to find _ZZ6createIsEDavE1I (no .1) and _ZZ6createIiEDavE1I (no .2), for some reason.
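To relate the mangled names in the assembly to the names in the linker error, a small demangling sketch may help. It assumes an Itanium C++ ABI toolchain (for example GCC or Clang on Linux), where abi::__cxa_demangle from <cxxabi.h> is available; demangle is a hypothetical wrapper written for this illustration. The exact text returned can differ between demanglers (some print a leading auto for the enclosing function), so the sketch does not rely on one exact string.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <memory>
#include <string>

// Hypothetical helper: returns the demangled form of an Itanium-mangled
// name, or the input unchanged if demangling fails.
inline std::string demangle(const char* mangled) {
    int status = 0;
    // __cxa_demangle returns a malloc'ed buffer that the caller must free.
    std::unique_ptr<char, void (*)(void*)> text(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), std::free);
    return (status == 0 && text) ? std::string(text.get())
                                 : std::string(mangled);
}

// The two names the linker is looking for (no .1 / .2 suffix):
//   demangle("_ZZ6createIsEDavE1I")  names I inside create<short>()
//   demangle("_ZZ6createIiEDavE1I")  names I inside create<int>()
```

Read piece by piece, _ZZ6createIsEDavE1I is a local entity (Z ... E) inside the function create with template argument s (short), return type Da (auto) and no parameters (v), followed by the entity name I. The suffixed symbols in the assembly are different names as far as the linker is concerned, which matches the undefined references above.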

ec04fc15-fa35-46f2-80e1-5d271f2ef708 commented 3 years ago

Thanks for the report, fixed in Clang trunk.

llvmbot commented 3 months ago

@llvm/issue-subscribers-debuginfo

Author: None (beef2f8d-76f6-467d-9088-334ee1ff5567)

| | |
| --- | --- |
| Bugzilla Link | [48517](https://llvm.org/bz48517) |
| Version | 5.0 |
| OS | Windows NT |
| CC | @dwblaikie,@pogo59,@zygoloid |
| Fixed by commit(s) | 6b760a50f52142e401a6380ff71f933cda22a909 |

## Extended Description

Based on https://stackoverflow.com/q/65306562/147192.

The behavior of Clang (and GCC) is inconsistent (between compile-time and run-time), however it is unclear to me whether the inconsistency is conforming with the C++17 standard or not.

Furthermore, in `-O0` mode, Clang generates an unused symbol (`create()::I`) which causes the linker to fail, see https://godbolt.org/z/M4T7f3.

The following reduced program is expected to return 0 (invoking clang++ with `-std=c++17`), it does not (https://godbolt.org/z/6r6vK3):

```cpp
template <typename, typename>
struct is_same { static constexpr bool value = false; };

template <typename T>
struct is_same<T, T> { static constexpr bool value = true; };

template <typename T, typename U>
static constexpr bool is_same_v = is_same<T, U>::value;

using uintptr_t = unsigned long long;

template <int const* I>
struct Parameterized { int const* member; };

template <typename T>
auto create() {
    static constexpr int const I = 2;

    return Parameterized<&I>{ &I };
}

int main() {
    auto one = create<short>();
    auto two = create<int>();

    if (is_same_v<decltype(one), decltype(two)>) {
        return reinterpret_cast<uintptr_t>(one.member) == reinterpret_cast<uintptr_t>(two.member) ? 1 : 2;
    }

    return 0;
}
```

Yet, on all versions of Clang where it compiles (from 5.0.0 onwards), and for all optimization levels (from `-O1` to `-O3`), it returns 2, indicating:

- That `one` and `two` have the same type -- which according to 17.4 [temp.type] should mean that they point to the same object.
- Yet they point to different objects -- there are two instances of `create<T>()::I`, one for `T = short` and one for `T = int`.

The assembly listing clearly contains 2 different instances of `create<T>()::I`.

Notes:

- If `I` is initialized with `= sizeof(T)`, instead, then with -O1 to -O3 the program returns 0 as expected.
- Even with `I` initialized with `= sizeof(T)`, with -O0 Clang still generates an unused symbol which causes the linker to fail: `auto create<short>()::I` and `auto create<int>()::I`, whereas the declared symbols do not have the leading `auto`.