llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
27.96k stars 11.54k forks source link

asm label with the same name as static variable leads to: Assertion failed: (Symbol->isUndefined() && "Cannot define a symbol twice!") #20152

Open DimitryAndric opened 10 years ago

DimitryAndric commented 10 years ago
Bugzilla Link 19778
Version trunk
OS All
CC @asl,@emaste,@zygoloid,@rnk

Extended Description

I received a bugreport about the following assertion failure in clang, when building libglapi (a component of Mesa):

https://jenkins.freebsd.org/pci/head-i386/logs/bulk/headi386-default/753/logs/errors/libglapi-9.1.7.log

After some reducing, the testcase turns out to be very small:

asm("x86_entry_end:"); static x86_entry_end[]; entry_generate() { int *code_templ = x86_entry_end - 1; memcpy(0, code_templ, 32); }

Compiling this with clang trunk r209086 (with assertions enabled), and any optimization level above -O1, gives:

Assertion failed: (Symbol->isUndefined() && "Cannot define a symbol twice!"), function EmitLabel, file /share/dim/src/llvm/trunk/lib/MC/MCAsmStreamer.cpp, line 309. [... etc ...]

The code in question originates here in the Mesa repository:

http://cgit.freedesktop.org/mesa/mesa/tree/src/mapi/entry_x86_tls.h#n46

Now the question is: is this "legal" in any sense? E.g. defining your own inline asm label, which has the same name as a static char array? Mesa seems to have this construction since some time, so gcc apparently has no trouble with it.

In any case, I think clang should not assert on this. Either reject the construction (but break Mesa), or deal with it. :)

DimitryAndric commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#29169

DimitryAndric commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#27932

DimitryAndric commented 2 years ago

mentioned in issue llvm/llvm-bugzilla-archive#23541

DimitryAndric commented 8 years ago

Bug llvm/llvm-bugzilla-archive#29169 has been marked as a duplicate of this bug.

DimitryAndric commented 8 years ago

Is there any action for LLVM to take here? It sounds like mesa is making unreasonable assumptions about the compiler's generated code.

That's all very well, but LLVM crashes. It should print instead "you cannot do this", point out why, and ideally, suggest a workaround.

ec04fc15-fa35-46f2-80e1-5d271f2ef708 commented 8 years ago

Is there any action for LLVM to take here?

Rafael's comment#4 example still causes llc to assert.

rnk commented 8 years ago

Is there any action for LLVM to take here? It sounds like mesa is making unreasonable assumptions about the compiler's generated code.

Also, this module level asm doesn't look like it will work with LLVM, given that we emit all module asm together before emitting functions. If that's the core issue, then we should dupe it against http://llvm.org/pr6364.

DimitryAndric commented 8 years ago

Bug llvm/llvm-bugzilla-archive#27932 has been marked as a duplicate of this bug.

DimitryAndric commented 9 years ago

Bug llvm/llvm-bugzilla-archive#23541 has been marked as a duplicate of this bug.

DimitryAndric commented 10 years ago

For now, I changed the x86_entry_start/end items to extern declarations instead:

extern const char x86_entry_start[]; extern const char x86_entry_end[];

That seems to have the desired effect, e.g. the functions that use the symbols can access them, though the resulting assembly is slightly different (it uses @​GOT instead of @​GOTOFF):

Adding attribute((visibility("hidden"))) should give you back the GOTOFF access pattern.

Yes, that works fine for both clang and gcc. Thanks. :)

llvmbot commented 10 years ago

For now, I changed the x86_entry_start/end items to extern declarations instead:

extern const char x86_entry_start[]; extern const char x86_entry_end[];

That seems to have the desired effect, e.g. the functions that use the symbols can access them, though the resulting assembly is slightly different (it uses @​GOT instead of @​GOTOFF):

Adding attribute((visibility("hidden"))) should give you back the GOTOFF access pattern.

Now I only need some way to "declare" a static symbol, making it usable from C functions... :)

No idea if there is a general way to do that.

DimitryAndric commented 10 years ago

...

So the program has undefined behavior because it violates a 'shall' requirement in a 'semantics' rule. In -pedantic-errors mode at least, we should probably reject this.

I agree.

As to whether we should support this particular code as a GNU extension... I don't know, it's fairly heinous. We don't in general guarantee the names of static entities, so this code seems broken regardless of whether we choose to support this particular case.

In this particular case, the desired construction is a generated jump table of some sorts:

asm( ".text\n" ".balign 32\n" "x86_entry_start:"); [... generated stuff: ...] asm( ".hidden shared_dispatch_stub_0\n" ".globl shared_dispatch_stub_0\n" ".type shared_dispatch_stub_0, @​function\n" ".balign 32\n" "shared_dispatch_stub_0:\n" " movl _glapi_Dispatch, %eax\n" " testl %eax, %eax\n" " je 1f\n" " jmp (40)(%eax)\n" "1:\n" " call _glapi_get_dispatch\n" " jmp (40)(%eax)\n" asm( ".hidden shared_dispatch_stub_1\n" ".globl shared_dispatch_stub_1\n" ".type shared_dispatch_stub_1, @​function\n" ".balign 32\n" "shared_dispatch_stub_1:\n" " movl _glapi_Dispatch, %eax\n" " testl %eax, %eax\n" " je 1f\n" " jmp (41)(%eax)\n" "1:\n" " call _glapi_get_dispatch\n" " jmp (41)(%eax)\n" [... much more of that ...] asm( ".hidden shared_dispatch_stub_1025\n" ".globl shared_dispatch_stub_1025\n" ".type shared_dispatch_stub_1025, @​function\n" ".balign 32\n" "shared_dispatch_stub_1025:\n" " movl _glapi_Dispatch, %eax\n" " testl %eax, %eax\n" " je 1f\n" " jmp (41025)(%eax)\n" "1:\n" " call _glapi_get_dispatch\n" " jmp (41025)(%eax)\n" asm( ".balign 32\n" "x86_entry_end:"); [...] static const char x86_entry_start[]; static const char x86_entry_end[]; [...] mapi_func entry_get_public(int slot) { return (mapi_func) (x86_entry_start + slot 32); } [...] mapi_func entry_generate(int slot) { const char code_templ = x86_entry_end - 32; void *code; mapi_func entry;

code = u_execmem_alloc(32); if (!code) return ((void *)0);

memcpy(code, code_templ, 32); entry = (mapi_func) code; entry_patch(entry, slot);

return entry; }

E.g. the intent is to get at the (aligned) start and end addresses of the asm statements. If there is any easier way to get at those, I can propose it to the FreeBSD port maintainers and/or Mesa upstream.

For now, I changed the x86_entry_start/end items to extern declarations instead:

extern const char x86_entry_start[]; extern const char x86_entry_end[];

That seems to have the desired effect, e.g. the functions that use the symbols can access them, though the resulting assembly is slightly different (it uses @​GOT instead of @​GOTOFF):

@@ -12340,7 +12340,7 @@ addl $_GLOBAL_OFFSETTABLE, %ecx movl 4(%esp), %eax sall $5, %eax

Now I only need some way to "declare" a static symbol, making it usable from C functions... :)

llvmbot commented 10 years ago

A testcase for just the "don't crash" side of this bug would be

module asm "x86_entry_end:" @​x86_entry_end = constant i32 42

ec04fc15-fa35-46f2-80e1-5d271f2ef708 commented 10 years ago

C11 6.9.2/2:

"A declaration of an identifier for an object that has file scope without an initializer, and without a storage-class specifier or with the storage-class specifier static, constitutes a tentative definition. If a translation unit contains one or more tentative definitions for an identifier, and the translation unit contains no external definition for that identifier, then the behavior is exactly as if the translation unit contains a file scope declaration of that identifier, with the composite type as of the end of the translation unit, with an initializer equal to 0."

6.9.2/3:

"If the declaration of an identifier for an object is a tentative definition and has internal linkage, the declared type shall not be an incomplete type."

So the program has undefined behavior because it violates a 'shall' requirement in a 'semantics' rule. In -pedantic-errors mode at least, we should probably reject this.

As to whether we should support this particular code as a GNU extension... I don't know, it's fairly heinous. We don't in general guarantee the names of static entities, so this code seems broken regardless of whether we choose to support this particular case.

In any case, we shouldn't assert if the backend sees a symbol defined twice through such shenanigans.

llvmbot commented 10 years ago

Given just

static int x86_entry_end[]; void f(int*); void entry_generate() { f(x86_entry_end); }

clang produces

@​x86_entry_end = internal global [1 x i32] zeroinitializer, align 4

That is, it thinks x86_entry_end is a definition. It also prints the warning

test2.c:1:12: warning: tentative array definition assumed to have one element

gcc seems to think it is a declaration.

Changing the code to

static int x86_entry_end[1];

makes gcc output a definition too.

DimitryAndric commented 10 years ago

Hi Rafael, do you have any opinion about this bug? :)