Quuxplusone / LLVMBugzillaTest

clang ignores -fno-constant-cfstrings #3269

Closed Quuxplusone closed 14 years ago

Quuxplusone commented 14 years ago
Bugzilla Link PR6056
Status RESOLVED FIXED
Importance P normal
Reported by Jonathan Schleifer (js-llvm-bugzilla@webkeks.org)
Reported on 2010-01-16 08:11:53 -0800
Last modified on 2010-10-22 01:27:40 -0700
Version trunk
Hardware All All
CC fjahanian@apple.com, llvm-bugs@lists.llvm.org
When specifying -fno-constant-cfstrings on OS X, it seems that clang ignores it
and I get the following linker errors:

Undefined symbols:
  "___CFConstantStringClassReference", referenced from:
      cfstring= in OFArray.o
      cfstring=Allocating an object failed! in OFExceptions.o
      cfstring=Could not allocate %zu bytes in class %s! in OFExceptions.o
[…]

On Linux, it seems to work.
Quuxplusone commented 14 years ago

This bug still exists and makes clang completely unusable for everything that does not use Cocoa on MacOS X.

Quuxplusone commented 14 years ago

It looks like that flag causes gcc to reference __NSConstantStringClassReference instead of ___CFConstantStringClassReference, and emit the data slightly differently. I don't know of anyone working on this or anyone else who needs it. If you're interested in it, please send a patch to cfe-dev.

Quuxplusone commented 14 years ago
gcc does not always reference __NSConstantStringClassReference; it references
__*ClassReference when using -fconstant-string-class=*. clang instead seems to
always reference ___CFConstantStringClassReference.
This becomes an issue as soon as you want to use a different constant string
class on OS X. It works fine on Linux with clang.
Quuxplusone commented 14 years ago

This bug still exists in the latest revision. I looked at the code and it seems that CGObjCMac.cpp just does not even care to check whether -fconstant-string-class has been specified. I looked at the same code in CGObjCGNU.cpp and it seems to do a LOT more. Maybe the code from CGObjCGNU.cpp can just be copied?

Anyway, can you please fix that? It completely breaks clang on Mac for anything that is not Cocoa.

Quuxplusone commented 14 years ago
This is mostly implemented in TOT, with the following patches:

http://llvm.org/viewvc/llvm-project?view=rev&revision=102112
http://llvm.org/viewvc/llvm-project?view=rev&revision=102130
http://llvm.org/viewvc/llvm-project?view=rev&revision=102189
http://llvm.org/viewvc/llvm-project?view=rev&revision=102219
http://llvm.org/viewvc/llvm-project?view=rev&revision=102223
And a test case:
http://llvm.org/viewvc/llvm-project?view=rev&revision=102357

Currently, to get this feature, you must pass -fno-constant-cfstrings directly
to the compiler using -cc1 options, as in:
clang -cc1 -fno-constant-cfstrings .... file.m

You may ask why? Well, because I haven't figured out yet how to pass this
option directly to the driver part of the compiler. If you know how, a patch is
most welcome :).
Quuxplusone commented 14 years ago
Addition of patch: http://llvm.org/viewvc/llvm-project?view=rev&revision=102431
(by Daniel) completes this PR.
Quuxplusone commented 14 years ago

Now it always references NSConstantString and ignores what's specified by -fconstant-string-class, so the code no longer even compiles, as it can't find the interface for NSConstantString.

Quuxplusone commented 14 years ago
(In reply to comment #7)
> Now it always references NSConstantString and ignores what's specified by
> -fconstant-string-class, leading to not even compiling it now, as it can't
> find the interface for NSConstantString.

Darwin does not yet support -fconstant-string-class. Something I need to look
at when I get the chance.
Quuxplusone commented 14 years ago

Using the latest revision (compiled 5 minutes ago), this problem still exists and prevents any code that uses a different constant string class from compiling. Even worse, when NSConstantString is available, it just uses it without a warning. And it also issues a warning about -fobjc-exceptions now.

Quuxplusone commented 14 years ago

Just installed clang on a NetBSD 5.1 box where it has the same bug. So, it is now far worse than before the initial report of this bug. This makes clang not only completely unusable for ObjC on OS X, but on every platform.

Quuxplusone commented 14 years ago
Sorry, I promised a while back to finish up the feature. It is not a feature
that very many people use, so it ended up at the end of a long queue of feature
requests. I will implement it for the next DP Rev.

(In reply to comment #10)
> Just installed clang on a NetBSD 5.1 box where it has the same bug. So, it is
> now far worse than before the initial report of this bug. This makes clang not
> only completely unusable for ObjC on OS X, but on every platform.
Quuxplusone commented 14 years ago

Thanks :).

Well, basically, you need it for everything that does not use Cocoa. That might be few people on OS X, true, but as it seems it affects other systems as well, it should affect most people there, as there's no Cocoa.

Quuxplusone commented 14 years ago

Remaining work is implemented in TOT: http://llvm.org/viewvc/llvm-project?view=rev&revision=116819

Quuxplusone commented 14 years ago

It seems that compilation works now. However, it seems clang breaks it as soon as umlauts etc. are in the string.

Consider a UTF-8 encoded file containing the string @"ä". The C string generated by clang should be "ä", where a C string just copies the encoding from the source file. Thus the C string should be { 0xC3, 0xA4, 0x00 } and the length 2. However, clang seems to create garbage here now. This used to work in the past.

I'm not sure if this is caused by the recent change, but I assume so. If it's not, feel free to close and I will create a new bug report.

Quuxplusone commented 14 years ago
This is unrelated to recent changes. The following test case has the same problem:

#include <stdio.h>

#include <objc/objc.h>
#include <objc/Object.h>

int main () {
  NSLog(@"ä\n");
  return 0;
}

_.str:
        .asciz   "\344\000\n\000\000"

        .section        __DATA,__cfstring
        .align  4                       ## @_unnamed_cfstring_
L__unnamed_cfstring_:
        .quad   ___CFConstantStringClassReference
        .long   2000                    ## 0x7d0
        .space  4
        .quad   _.str
        .quad   2

Please file a new bug report. I don't think that this ever worked. I tried an
older version of clang with the same results.

(In reply to comment #14)
> It seems that compilation works now. However, it seems clang breaks it as soon
> as umlauts etc. are in the string.
>
> Consider an UTF-8 encoded file containing the string @"ä". The C string
> generated by clang should be "ä", where a C string just copies the encoding
> from the source file. Thus the C string should be { 0xC3, 0xA4, 0x00 } and the
> length 2. However, clang seems to create garbage here now. This used to work
> in the past.
>
> I'm not sure if this is caused by the recent change, but I assume so. If it's
> not, feel free to close and I will create a new bug report.
Quuxplusone commented 14 years ago

Oh, this definitely worked before, because all my tests ran with clang in the past. And @"" literals with UTF-8 were used all the time.

Anyway, I looked into this and this is a real problem. If it's an ASCII string, everything works fine. If it contains any non-ASCII characters, clang creates a Unicode string. However, it does not indicate ANYWHERE that the string is Unicode!

PS: All of the problems mentioned in this ticket are new, as some time ago, it all worked just fine. The encoding bug seems to be very recent and I'm very sure this has to do with the whole -fconstant-string-class thing being changed / rewritten.

Quuxplusone commented 14 years ago
(In reply to comment #16)
> Oh, this definitely worked before, because all my tests ran with clang in the
> past. And @"" literals with UTF-8 were used all the time.

I just showed a test case which has *nothing* to do with the
-fconstant-string-class stuff and showed the string encoding was incorrect.
Maybe it worked when targeting the GNU runtime but not the NeXT runtime.

>
> Anyway, I looked into this and this is a real problem. If it's an ASCII
> string, everything works fine. If it contains any non-ASCII characters, clang
> creates a Unicode string. However, it does not indicate ANYWHERE that the
> string is Unicode!

Yes, this is the problem. But I don't think that it ever worked.

>
> PS: All of the problems mentioned in this ticket are new, as some time ago, it
> all worked just fine. The encoding bug seems to be very recent and I'm very
> sure this has to do with the whole -fconstant-string-class thing being changed
> / rewritten.

Please show a test case and some proof that it worked before. I just showed a
test case which produced the same encoding before and after the
-fconstant-string-class change.
Quuxplusone commented 14 years ago

Please open a new bug for this new issue.