Quuxplusone / LLVMBugzillaTest

0 stars 0 forks source link

#pragma weak doesn't work #4080

Open Quuxplusone opened 15 years ago

Quuxplusone commented 15 years ago
Bugzilla Link PR3679
Status CONFIRMED
Importance P normal
Reported by Pawel Worach (pawel.worach@gmail.com)
Reported on 2009-02-27 09:47:16 -0800
Last modified on 2017-05-29 14:56:27 -0700
Version unspecified
Hardware PC FreeBSD
CC ahornung@gmail.com, anton@korobeynikov.info, bagnara@cs.unipr.it, cameron.mcinally@nyu.edu, daniel@zuster.org, dgregor@apple.com, ed@80386.nl, efriedma@quicinc.com, eocallaghan@alterapraxis.com, erik@cederstrand.dk, fedor.v.sergeev@gmail.com, hfinkel@anl.gov, kwm@FreeBSD.org, llvm-bugs@lists.llvm.org, mayur.p@samsung.com, mayurthebond@gmail.com, parseerror@gmail.com, rdivacky@freebsd.org, uspoerlein@gmail.com
Fixed by commit(s)
Attachments t.c (128 bytes, text/plain)
t.s (1307 bytes, text/plain)
weak-14.c (915 bytes, text/plain)
Blocks PR10899, PR4696
Blocked by
See also
Created attachment 2619
Test case

0>root@one /tmp# gcc -o t t.c
0>root@one /tmp# ./t
spasiba
0>root@one /tmp# ccc -o t t.c
/tmp/tmpTGEXC1.o(.text+0x2b): In function `main':
: undefined reference to `t'
Quuxplusone commented 15 years ago

Attached t.c (128 bytes, text/plain): Test case

Quuxplusone commented 15 years ago

Wow... It seems like really ugly way to define symbol aliases... Pawel, could you please attach the .s file generated by gcc?

Quuxplusone commented 15 years ago

Attached t.s (1307 bytes, text/plain): Assembly from gcc

Quuxplusone commented 15 years ago
(In reply to comment #2)
> Created an attachment (id=2620) [details]
> Assembly from gcc
>
> This is in the FreeBSD ZFS code and Sun seems to use #pragma weak a lot.
Yeah, this should be lowered to weak alias:
    .weak   t
    .set    t,hello
Quuxplusone commented 15 years ago
This also affects FreeBSD's rtld-elf.

clang output:
        .file   "x.c"
        .type   x,@object
        .data
        .globl x
        .align  4
x:                              # x
        .size   x, 4
        .zero   4

gcc output:
        .file   "x.c"
        .weak   x
        .section        .bss
        .align 4
        .type   x, @object
        .size   x, 4
x:
        .zero   4
        .ident  "GCC: (GNU) 4.2.1 20070719  [FreeBSD]"

code:
extern int x;
#pragma weak x

int x = 0;
Quuxplusone commented 15 years ago

Seems to do exactly the same thing as attribute((weak)), as expected.

Quuxplusone commented 15 years ago

It seems it's also allowed to just use #pragma weak. In that case it applies to the beforementioned symbol.

Quuxplusone commented 15 years ago
Test case:
--
// RUN: clang -verify -emit-llvm -o %t %s &&
// RUN: grep '@g0 = weak global i32 0' %t &&
// RUN: grep '@g1 = weak global i32 0' %t &&
// RUN: grep '@g2 = alias i32* @g3' %t &&
// RUN: grep '@g3 = i32 0' %t &&
// RUN: grep '@g4 = weak i32 0' %t

/* expecting-warning {{malformed #pragma weak, ignored}} */ #pragma weak
int g0;
#pragma weak g0

#pragma weak g1
int g1;
#pragma weak g2 = g3
int g3 = 0;

int g4;
#pragma weak
--

The first occurrence is not hard to support, we could just add a weak attribute
to g0.

The second occurrence (g1) is annoying because the pragma can occur before the
definition, which means we will need to remember the weak pragmas until later
(perhaps only in Sema, however).

The third occurrence (g2 aliases g3) has the same problem as the second, and
also introduces a new symbol (which, unlike normal alias, is not declared). If
we solve the second problem, we can probably still get away with handling this
by introducing attributes.

The fourth occurrence (g4) has yet another problem that we need access to the
previous decl; one can think of a whole new set of painful test cases for this
particular usage. For example, what does the following do:
--
int g0,
#pragma weak
g1;
--
I think we should treat this usage as unsupported; upgrading this instance
should be relatively painless.

Another test case that should be added is the alias form of #pragma weak which
conflicts with another variable; e.g.:
--
int g0;
#pragma weak g1 = g0;
float g1;
--
Quuxplusone commented 15 years ago

Doug and I discussed this a bit today. Our overwhelming opinion is that all but the first usage (from #7) are really gross and we would lean towards not supporting them if at all possible. How much work would fixing this in the original source be?

This seems like a pretty esoteric feature and using attributes for this task is a much nicer solution, so NTBF doesn't seem out of the question. However, a little googling does turn up some other references to it, for example it looks like IBM implements it, so it may be more widespread than I think.

Quuxplusone commented 15 years ago

It isn't hard for us to change the code, but the problem we're having, is that we also have a lot of contributed vendor code in our source tree. I think I saw the first usage being used a Sun ZFS library, for example. The problem is that it's hard convince the maintainers to just edit their vendor code.

But still, supporting at least a subset of possible notations would be an improvement. As long as Clang throws errors when using unsupported constructs, it's fine by me. The problem is that they go by unnoticed right now, which leads to all sorts of miscompilations.

Quuxplusone commented 15 years ago

That makes sense (vendor contributed issues).

We will certainly at least error on the construct if we are going to try to not support it. I'll discuss this some more tomorrow with interested parties and try to get a consensus.

Quuxplusone commented 15 years ago
Just wondering if there's any progress on this bug, either as a patch or
WONTFIX . It's preventing LLVM from compiling rtld-elf, csu, libc, libutil,
librt and libzpool on FreeBSD.

I've compiled a list of the documented behaviour of this directive in other
compilers:

http://gcc.gnu.org/onlinedocs/gcc/Weak-Pragmas.html
http://docs.sun.com/app/docs/doc/819-5265/bjaby#bjacz
http://h30097.www3.hp.com/cplus/ugu_impl.html#weak_sec
http://publib.boulder.ibm.com/infocenter/comphelp/v7v91/index.jsp?topic=/com.ibm.vacpp7a.doc/compiler/ref/rnpgweak.htm
http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi/0650/bks/SGI_Developer/books/Pragmas/sgi_html/ch07.html#IG358315172

Thanks!
Quuxplusone commented 15 years ago
(In reply to comment #7)
> int g4;
> #pragma weak

This is the only form that really bothers me. The others are relatively easy to
track.
Quuxplusone commented 15 years ago
There are around 50 occurrences of #pragma weak in the FreeBSD source tree.
They are all either of the form

   #pragma weak a

or

   #pragma weak a = b
Quuxplusone commented 15 years ago

I've just committed r72907 and r72912, which are the start of an implementation; it's sufficient for writing code using #pragma weak that works with both clang and other compilers, but it's probably not enough to be compatible with a lot of already-written code.

Quuxplusone commented 15 years ago
Thanks for working on this! As far as I can see from your patches, the
following are now supported:

  int g0;
  #pragma weak g0

  int g1 = 0;
  #pragma weak g2 = g1

It seems a lot of code uses the form:

  #pragma weak g2 = long_function
  int
  long_function () { ... }

but at least it can be rearranged so it compiles.
Quuxplusone commented 15 years ago
(In reply to comment #15)
> Thanks for working on this! As far as I can see from your patches, the
> following are now supported:
>
>   int g0;
>   #pragma weak g0

Right.

>   int g1 = 0;
>   #pragma weak g2 = g1

Wrong; the supported form is the following:

extern void g2;
#pragma weak g2 = g1

Basically, the #pragma weak can't introduce a new name, but it can alias an
existing name to a name that isn't defined yet.
Quuxplusone commented 15 years ago

These are just preliminary results, but it seems this is sufficient for Solaris's libzpool, but not libzfs and libc. I'll take a closer look at this after I've finished my builds. Stay tuned.

Quuxplusone commented 15 years ago
Unfortunately in case of libc, where we mark functions as weak like this

void
func(void)
{
        ...
}
#pragma weak

it doesn't seem to have any effect. This is the diff between nm output on libc
built with gcc and clang:

--- libc.gcc    2009-06-06 18:57:53.000000000 +0200
+++ libc.clang  2009-06-06 18:57:49.000000000 +0200
@@ -949,10 +949,10 @@
 PTR T _rpc_dtablesize
 PTR W _rtld_allocate_tls
 PTR W _rtld_atfork_post
-PTR W _rtld_atfork_pre
-PTR W _rtld_error
+PTR T _rtld_atfork_pre
+PTR T _rtld_error
 PTR W _rtld_free_tls
-PTR W _rtld_thread_init
+PTR T _rtld_thread_init
 PTR W _rtprio
 PTR W _rtprio_thread
 PTR W _sched_get_priority_max
@@ -1259,16 +1259,16 @@
 PTR T digittoint
 PTR T dirname
 PTR T div
-PTR W dl_iterate_phdr
-PTR W dladdr
-PTR W dlclose
-PTR W dlerror
-PTR W dlfunc
-PTR W dlinfo
-PTR W dllockinit
-PTR W dlopen
-PTR W dlsym
-PTR W dlvsym
+PTR T dl_iterate_phdr
+PTR T dladdr
+PTR T dlclose
+PTR T dlerror
+PTR T dlfunc
+PTR T dlinfo
+PTR T dllockinit
+PTR T dlopen
+PTR T dlsym
+PTR T dlvsym
 PTR W dn_comp
 PTR W dn_count_labels
 PTR W dn_expand
Quuxplusone commented 15 years ago

Errr...

pragma weak func

that's what we use.

Quuxplusone commented 15 years ago
Ah, yes. It turns out the issue is a little more complex than I thought.

This works:

void
func1(void)
{
}

#pragma weak func1

void
func2(void)
{
}

#pragma weak func2

while this does not:

void
func1(void)
{
}

void
func2(void)
{
}

#pragma weak func1
#pragma weak func2
Quuxplusone commented 15 years ago

Ah, right... the current approach for Sema isn't really correct. I'm still not quite sure about the best way to write it.

Quuxplusone commented 15 years ago
r77660 handles the original test case and should cover the subsequent issues
brought up by the FreeBSD libraries (though I have not tested them directly).
the case mentioned in comment #20 is not covered due to the way clang's Parser
and ASTConsumer currently work. clang/test/Sema/pragma-weak.c cover all the
cases I could think of, including combinations of #pragma weak and other
attributes, including alias(...). if i've forgotten anything please let me know.

#pragma weak ok
int ok;

int ok2;
#pragma weak ok2

#pragma weak alias_ok = __alias_ok;
int __alias_ok;

int broken;
int broken2;
#pragma weak broken
Quuxplusone commented 15 years ago

Is this done enough for FreeBSD, or are important cases still missing?

Quuxplusone commented 15 years ago
(In reply to comment #23)
> Is this done enough for FreeBSD, or are important cases still missing?

I think comment 20 is an important case (according to comment 18, it shows up
in FreeBSD libc).  It's kind of nasty, though, because it's essentially
applying an attribute after the definition, which we don't generally support.
Quuxplusone commented 15 years ago
i'm sorry i wasn't more specific; my patch addresses everything except this
case from comment 20:

void
func1(void)
{
}

void
func2(void)
{
}

#pragma weak func1
#pragma weak func2

that is, it won't work if the #pragma weak does not appear before or in the
same "TopLevelDecl" as the identifier it references, due to the way Parser and
ASTConsumer currently work. i couldn't find this particular case in FreeBSD's
libc at http://fxr.watson.org/fxr/search?v=FREEBSD-LIBC&string=%23pragma+weak
but i also haven't compiled FreeBSD's libc myself so I can't be 100% certain.
Quuxplusone commented 15 years ago
I checked the uses of pragma weak in FreeBSD sources. Of the 50 occurrencies, 5
are still not covered by the latest patches:

http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/rpc/opensolaris_xdr.c?view=markup
line 50
http://svn.freebsd.org/viewvc/base/head/lib/librt/sigev_thread.c?view=markup
line 70
http://svn.freebsd.org/viewvc/base/head/lib/libc/resolv/res_comp.c?view=markup
line 257
http://svn.freebsd.org/viewvc/base/head/cddl/contrib/opensolaris/lib/libgen/common/gmatch.c?view=markup
line 36
http://svn.freebsd.org/viewvc/base/head/contrib/gcclibs/libmudflap/mf-
runtime.c?view=markup line 201

I'm really no C expert, but I expect that we can rewrite 2 and 3 to something
parseable by LLVM. 5 is less important for obvious reasons. This leaves 1 and 4
which are Solaris code.

I'll leave it to others to recommend an upstream patch, a local patchset or
simply gcc-compile this specific code for the time being.
Quuxplusone commented 15 years ago
Just a side note,

We, AuroraUX are running into some of the very same issues as the FreeBSD guys
as:
A.) We are building a kernel+userspace using clang.
B.) We have Solaris code too as we are based off the OpenSolaris kernel.

As of (B.) the affects on us are current more severe just due to the number of
occurrences of this sort of thing.

Although our porting effort is not as far along as the Clang-FreeBSD lads; I
say this as it _may_ affect the out come of if this bug gets 'fully' fixed or
code twisted to suit clang better.

As both GCC and Sun Studio handle this code, (libzfs), I vote clang should to.

Just my 2 pennies :)
Cheers,
Edward.
Quuxplusone commented 15 years ago

Thanks for the feedback guys, I'm on it.

Quuxplusone commented 15 years ago
(In reply to comment #26)
> I checked the uses of pragma weak in FreeBSD sources. Of the 50 occurrencies,
5
> are still not covered by the latest patches:
>
>
http://svn.freebsd.org/viewvc/base/head/sys/cddl/contrib/opensolaris/uts/common/rpc/opensolaris_xdr.c?view=markup
> line 50
> http://svn.freebsd.org/viewvc/base/head/lib/librt/sigev_thread.c?view=markup
> line 70
> http://svn.freebsd.org/viewvc/base/head/lib/libc/resolv/res_comp.c?view=markup
> line 257
>
http://svn.freebsd.org/viewvc/base/head/cddl/contrib/opensolaris/lib/libgen/common/gmatch.c?view=markup
> line 36
> http://svn.freebsd.org/viewvc/base/head/contrib/gcclibs/libmudflap/mf-
runtime.c?view=markup
> line 201
>
> I'm really no C expert, but I expect that we can rewrite 2 and 3 to something
> parseable by LLVM. 5 is less important for obvious reasons. This leaves 1 and
4
> which are Solaris code.
>
> I'll leave it to others to recommend an upstream patch, a local patchset or
> simply gcc-compile this specific code for the time being.
>

Erik,

As far as I can tell the first case should work, i.e.

pizza@pie:~/proj/llvm/tools/clang$ cat pragma-weak-unused.c
#pragma weak foo = bar
void bar(){}
pizza@pie:~/proj/llvm/tools/clang$ clang -c pragma-weak-unused.c && nm pragma-
weak-unused.o
00000000 T bar
00000000 W foo
pizza@pie:~/proj/llvm/tools/clang$ gcc -c pragma-weak-unused.c && nm pragma-
weak-unused.o
00000000 T bar
00000000 W foo

Could you provide a testcase?
Quuxplusone commented 15 years ago
(In reply to comment #29)
> As far as I can tell the first case should work, i.e.
>
> pizza@pie:~/proj/llvm/tools/clang$ cat pragma-weak-unused.c
> #pragma weak foo = bar
> void bar(){}

I should have added that my review was only by grep'ing, not actual compile
failures. I thought the pragma weak needed to be adjacent to the statement, but
apparently this also works:

#pragma weak foo = bar
void baz(){}
void bar(){}

In that case I agree that 1 and probably 3 in comment #26 are OK.
Quuxplusone commented 15 years ago
(In reply to comment #30)
> (In reply to comment #29)
> > As far as I can tell the first case should work, i.e.
> >
> > pizza@pie:~/proj/llvm/tools/clang$ cat pragma-weak-unused.c
> > #pragma weak foo = bar
> > void bar(){}
>
> I should have added that my review was only by grep'ing, not actual compile
> failures. I thought the pragma weak needed to be adjacent to the statement,
but
> apparently this also works:
>
> #pragma weak foo = bar
> void baz(){}
> void bar(){}
>
> In that case I agree that 1 and probably 3 in comment #26 are OK.
>

r78016 adds the test file clang/test/Sema/pragma-weak.c which outlines exactly
what is and is not supported.
Quuxplusone commented 15 years ago
Daniel and I figured out why the code in comment #20 does what it does, and
it's rather funny. The basic problem is that we cannot handle a

  #pragma weak foo

that occurs after the function foo() was defined, because CodeGen has already
seen the definition and does not know to change the definition to have weak
linkage. We'll need to add another ASTConsumer hook---say,
AttributeAddedToDecl(Attr*, Decl*)---that CodeGen can use to change the linkage
and/or add an alias.

Why does the code at the beginning of comment #20 work, then? Well, it's very
funny: when the lexer tries to consume the '}' token to finish the function
definition, it goes ahead and finds the next token. In doing so, the #pragma
weak handler fires... and the function that is being defined gets marked as
weak *before* CodeGen gets the definition. It's almost like we'd written

  void func1(void) {
    #pragma weak func1
  }
Quuxplusone commented 15 years ago

Doug, what is the status of this? I recall we had a nice discussion about it, but I forget what we concluded... :)

Quuxplusone commented 15 years ago

AFAIK, nothing has changed since comment 32.

Quuxplusone commented 15 years ago
(In reply to comment #33)
> Doug, what is the status of this? I recall we had a nice discussion about it,
> but I forget what we concluded... :)

We concluded that we need to add ASTConsumer::AttributeAddedToDecl, to cope
with cases where we see #pragma weak after we see the definition of the entity.
There hasn't been any implementation progress, though.
Quuxplusone commented 14 years ago

I would like to add that the code of several packages in Fedora (10, 11 and 12) cannot be successfully parsed because of this problem.

Quuxplusone commented 14 years ago
Another case of weakref. This time with __attibute__

> cat weak.c
void
hello(void)
{
}
static void t(void) __attribute__((weakref("hello")));
> gcc -c weak.c
> clang -c weak.c
weak.c:5:36: warning: 'weakref' attribute ignored
static void t(void) __attribute__((weakref("hello")));
                                   ^
1 diagnostic generated.
Quuxplusone commented 14 years ago
(In reply to comment #37)
> Another case of weakref.

That isn't really the same issue; could you file a separate bug?
Quuxplusone commented 14 years ago

Sure, filed comment #37 as bug 5621.

Quuxplusone commented 12 years ago
There is a bug in #pragma weak, relating to the use of static keyword along
with a variable whose alias we create through the use of #pragma weak. I have
attached the file weak-14.c, here Av1a is created as an alias to lv1. lv1 is
static here and there is a declaration of Av1a (extern unsigned long Av1a) just
after Av1a has been declared as the alias to lv1(using #pragma weak).

The problem being faced is that Av1a should have its value as the value of lv1,
which is not the case and hence the if condition is failing and the test case
is aborting(which is not the correct behaviour). For the case of function Af1a
everything works fine.

When we check for the value of Av1a using a printf statement, we see that it is
0. The test case passes when we comment out the check in the if condition, ||
Av1a != 0xdeadbeefUL.

What is interesting is that, when we remove static keyword from lv1, this case
passes and Av1a is assigned the correct value. Also when we comment extern
unsigned long Av1a(the declaration of Av1a) and dont remove the static keyword
from lv1, the case works fine, with correct value getting assigned to Av1a.

Compilation command to create executable:
clang -O2 -fno-common weak-14.c
Quuxplusone commented 12 years ago

Attached weak-14.c (915 bytes, text/plain): test file for the failure of pragma weak with use of static keyword for a variable.

Quuxplusone commented 11 years ago
It was just brought to my attention that this may be causing problems with
MPICH as well: https://trac.mpich.org/projects/mpich/ticket/1815
(but it could be that the way they're mixing weak and extern is not really
correct?)
Quuxplusone commented 11 years ago
In the file SemaDeclAttr.cpp
NamedDecl * Sema::DeclClonePragmaWeak(NamedDecl *ND, IdentifierInfo *II,
                                      SourceLocation Loc){
..
..
  } else if (VarDecl *VD = dyn_cast<VarDecl>(ND)) {
    NewD = VarDecl::Create(VD->getASTContext(), VD->getDeclContext(),
                           VD->getInnerLocStart(), VD->getLocation(), II,
                           VD->getType(), VD->getTypeSourceInfo(),
                           VD->getStorageClass(),
                           VD->getStorageClassAsWritten());
    if (VD->getQualifier()) {
      VarDecl *NewVD = cast<VarDecl>(NewD);
      NewVD->setQualifierInfo(VD->getQualifierLoc());
    }
  }
  return NewD;
}

replace VD->getStorageClass()and VD->getStorageClassAsWritten() with SC_None.

This solves the problem related to pragma weak. It might be applicable for your
problem as well.