llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

IA64 ABI: RTTI name for class in anonymous namespace lacks '*', breaks dynamic_cast and type_info::operator== on GNU/Linux #34255

Open rprichard opened 6 years ago

rprichard commented 6 years ago
Bugzilla Link 34907
Version 5.0
OS Linux
CC @apolukhin,@k15tfu,@zygoloid,@rjmccall,@smeenai

Extended Description

libstdc++ apparently has a convention where the typeinfo name for a class declared in an anonymous namespace begins with an asterisk ('*'), which tells std::type_info::operator== to consider two type_info objects unequal even if their names are equal. Clang does not emit this asterisk on GNU/Linux. As a result, if I declare two classes with the same name in two different anonymous namespaces, std::type_info::operator== considers the two class types equal, and dynamic_cast lets me cast from one type to the other. G++ emits the asterisk, so the types are treated as unequal.

The asterisk is stripped off in GNU's std::type_info::name(), so it's not user visible.
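To make the convention concrete, here is a minimal model of the comparison described above. It is a sketch, not the libstdc++ source: ModelTypeInfo, raw_name, demo, and the four name arrays are hypothetical names introduced for illustration, and the logic assumes the configuration seen in the test output (__GXX_TYPEINFO_EQUALITY_INLINE = 1, __GXX_MERGED_TYPEINFO_NAMES = 0).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for std::type_info; it models only the name pointer.
struct ModelTypeInfo {
    const char *raw_name;  // string as emitted by the compiler, possibly with a leading '*'

    // Model of a GNU-style name(): the marker is skipped, so callers never see it.
    const char *name() const {
        return raw_name[0] == '*' ? raw_name + 1 : raw_name;
    }

    // Model of a GNU-style equality: identical pointers always match; otherwise
    // fall back to strcmp only when the name is not marked with '*'.
    bool operator==(const ModelTypeInfo &other) const {
        return raw_name == other.raw_name ||
               (raw_name[0] != '*' && std::strcmp(raw_name, other.raw_name) == 0);
    }
};

// Distinct arrays stand in for the per-translation-unit copies of the name string.
static const char gcc_def_A[] = "*N12_GLOBAL__N_11AE";
static const char gcc_run_A[] = "*N12_GLOBAL__N_11AE";
static const char clang_def_A[] = "N12_GLOBAL__N_11AE";
static const char clang_run_A[] = "N12_GLOBAL__N_11AE";

// Small usage example mirroring the two compilers' behavior in the test case.
inline void demo() {
    ModelTypeInfo g1{gcc_def_A}, g2{gcc_run_A};
    ModelTypeInfo c1{clang_def_A}, c2{clang_run_A};
    assert(!(g1 == g2));  // marked names: distinct objects compare unequal
    assert(c1 == c2);     // unmarked names: strcmp makes them compare equal
    assert(std::strcmp(g1.name(), c1.name()) == 0);  // marker is not user visible
}
```

In this model the only difference between the two compilers is the leading '*' in the emitted string, which is enough to flip the result of operator==.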

AFAICT, libc++ doesn't have this convention, but for ARM64 iOS, there is a different convention of setting the highest(?) bit of the type_info's __type_name pointer to indicate that string comparison should be performed. (Look for the _LIBCPP_HAS_NONUNIQUE_TYPEINFO and _LIBCPP_NONUNIQUE_RTTI_BIT flags in libc++. I wonder if ARM64 iOS also sets _LIBCXX_DYNAMIC_FALLBACK for libc++abi?)
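For comparison, a tagged-pointer scheme of the kind described for ARM64 iOS could look roughly like the sketch below. This is not libc++'s code: TaggedTypeInfo, kNonUniqueBit, and make are hypothetical names, the choice of the top bit is an assumption (the "highest(?)" above), and the policy of comparing strings only when both sides are flagged is also an assumption. It further assumes user-space addresses never have the top bit set.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: the flag lives in the top bit of a pointer-sized integer.
constexpr std::uintptr_t kNonUniqueBit =
    std::uintptr_t(1) << (sizeof(std::uintptr_t) * 8 - 1);

// Hypothetical stand-in for a type_info whose name pointer carries a flag bit.
struct TaggedTypeInfo {
    std::uintptr_t tagged_name;  // name pointer, with the flag bit optionally set

    // Builds a model object; non_unique marks names that may exist in several copies.
    static TaggedTypeInfo make(const char *name, bool non_unique) {
        std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(name);
        return TaggedTypeInfo{non_unique ? (bits | kNonUniqueBit) : bits};
    }

    // Masks the flag off to recover the real string pointer.
    const char *name() const {
        return reinterpret_cast<const char *>(tagged_name & ~kNonUniqueBit);
    }

    bool operator==(const TaggedTypeInfo &other) const {
        if (tagged_name == other.tagged_name) return true;
        // Assumed policy: fall back to strcmp only when both sides are flagged.
        if (!(tagged_name & other.tagged_name & kNonUniqueBit)) return false;
        return std::strcmp(name(), other.name()) == 0;
    }
};
```

The contrast with the asterisk convention is where the marker lives: here it is in the pointer and opts a type in to string comparison, whereas the '*' is in the string and opts a type out of it.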

I'm wondering whether there's a compatibility concern here w.r.t. previous versions of Clang. My first guess is that compatibility with G++/libstdc++/libsupc++ (and correctness) is sufficient to motivate changing Clang. I guess Clang would have to generate different code for -stdlib=libstdc++ and -stdlib=libc++?

Test case:

test.h

#include <typeinfo>
#include <stddef.h>
#include <stdio.h>

struct Base {
    virtual ~Base() {}
};

namespace def {
    Base *alloc();
    const std::type_info &type();
}

test-def.cc

#include "test.h"

namespace {
    struct A : Base {};
}

namespace def {
    Base *alloc() {
        return new A;
    }
    const std::type_info &type() {
        return typeid(A);
    }
}

test-run.cc

#include "test.h"

namespace {
    struct A : Base {
        void func() {
            printf("ERROR: run func called, field=%d\n", field);
        }
    private:
        int field = 42;
    };
}

__attribute__((noinline))
static A *do_cast(Base *b) {
    return dynamic_cast<A*>(b);
}

__attribute__((noinline))
static bool types_eq(const std::type_info &x, const std::type_info &y) {
    return x == y;
}

int main() {
    printf("def A  == run A:          %d\n", types_eq(def::type(), typeid(A)));
    printf("&def A == &run A:         %d\n", &def::type() == &typeid(A));
    printf("name of def A:            %s\n", def::type().name());
    printf("name of run A:            %s\n", typeid(A).name());
    printf("def A name == run A name: %d\n", def::type().name() == typeid(A).name());
    Base *b = def::alloc();
    auto *p = do_cast(b);
    if (p == nullptr) {
        printf("SUCCESS: dynamic_cast returned nullptr\n");
    } else {
        p->func();
    }
#ifdef __GXX_TYPEINFO_EQUALITY_INLINE
    printf("__GXX_TYPEINFO_EQUALITY_INLINE = %d\n", __GXX_TYPEINFO_EQUALITY_INLINE);
#endif
#ifdef __GXX_MERGED_TYPEINFO_NAMES
    printf("__GXX_MERGED_TYPEINFO_NAMES    = %d\n", __GXX_MERGED_TYPEINFO_NAMES);
#endif
}

$ cat /etc/issue
Ubuntu 14.04.5 LTS \n \l
$ uname -m
x86_64

$ g++ test-def.cc test-run.cc -std=c++11 && ./a.out

def A  == run A:          0
&def A == &run A:         0
name of def A:            N12_GLOBAL__N_11AE
name of run A:            N12_GLOBAL__N_11AE
def A name == run A name: 0
SUCCESS: dynamic_cast returned nullptr
__GXX_TYPEINFO_EQUALITY_INLINE = 1
__GXX_MERGED_TYPEINFO_NAMES    = 0

$ ~/clang+llvm-5.0.0-linux-x86_64-ubuntu14.04/bin/clang++ test-def.cc test-run.cc -std=c++11 && ./a.out

def A  == run A:          1
&def A == &run A:         0
name of def A:            N12_GLOBAL__N_11AE
name of run A:            N12_GLOBAL__N_11AE
def A name == run A name: 0
ERROR: run func called, field=0
__GXX_TYPEINFO_EQUALITY_INLINE = 1
__GXX_MERGED_TYPEINFO_NAMES    = 0

$ g++ test-def.cc -S && cat test-def.s
...
_ZTSN12_GLOBAL__N_11AE:
        .string "*N12_GLOBAL__N_11AE"
...

$ ~/clang+llvm-5.0.0-linux-x86_64-ubuntu14.04/bin/clang++ test-def.cc -S && cat test-def.s
...
_ZTSN12_GLOBAL__N_11AE:
        .asciz "N12_GLOBAL__N_11AE"
...

llvmbot commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#45549

rjmccall commented 4 years ago

> I guess Clang would have to generate different code for -stdlib=libstdc++ and -stdlib=libc++?

The difference is by target platform, not by target standard library. Darwin generally uses the strict Itanium ABI rule, but on arm64 it tweaks it as you observe. GCC decided years ago to diverge from the official Itanium rule in a few ways, and I think this is just another aspect of that divergence. Since GCC defines the ABI for Linux, both Clang and libc++ need to use GCC's modified rule on Linux and probably several other targets.

smeenai commented 4 years ago

Simple example showing another manifestation of this:

$ cat a.cpp
namespace {
struct S {};
} // namespace
void f() { throw S(); }

$ cat b.cpp
namespace {
struct S {};
} // namespace
void f();
int main() {
  try {
    f();
  } catch (S &) {
  }
}

$ g++ a.cpp b.cpp
$ ./a.out
terminate called after throwing an instance of '(anonymous namespace)::S'
[1]    3562076 abort (core dumped)  ./a.out

$ clang++ a.cpp b.cpp
$ ./a.out # exits successfully

libc++ gained the ability to perform string equality-based typeinfo comparisons in https://reviews.llvm.org/rL361913, but those don't take the leading asterisk into account either.