Closed Quuxplusone closed 14 years ago
Bugzilla Link | PR6056 |
Status | RESOLVED FIXED |
Importance | P normal |
Reported by | Jonathan Schleifer (js-llvm-bugzilla@webkeks.org) |
Reported on | 2010-01-16 08:11:53 -0800 |
Last modified on | 2010-10-22 01:27:40 -0700 |
Version | trunk |
Hardware | All All |
CC | fjahanian@apple.com, llvm-bugs@lists.llvm.org |
Fixed by commit(s) | |
Attachments | |
Blocks | |
Blocked by | |
See also |
This bug still exists and makes clang completely unusable for everything that does not use Cocoa on MacOS X.
It looks like that flag causes gcc to reference __NSConstantStringClassReference instead of ___CFConstantStringClassReference, and emit the data slightly differently. I don't know of anyone working on this or anyone else who needs it. If you're interested in it, please send a patch to cfe-dev.
gcc does not always reference __NSConstantStringClassReference, it references
__*ClassReference when using -fconstant-string-class=*. clang instead seems to
always reference ___CFConstantStringClassReference.
This becomes an issue as soon as you want to use a different constant string
class on OS X. It works fine on Linux with clang.
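To make the naming pattern from the comment above concrete, here is a minimal C sketch of how the referenced symbol would follow from the class name given to -fconstant-string-class. The helper `class_reference_symbol` and the class name `MyConstantString` are hypothetical and exist only for illustration; this is not code from gcc or clang, and it assumes only the `__*ClassReference` pattern described in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of gcc or clang): writes
   "__<class_name>ClassReference" into out, following the pattern
   described in the comment above.
   Returns 0 on success, -1 if the buffer is too small. */
static int class_reference_symbol(const char *class_name,
                                  char *out, size_t out_size)
{
    int n = snprintf(out, out_size, "__%sClassReference", class_name);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}

/* Exercise the helper with the default class and a made-up one. */
static void check_class_reference_symbol(void)
{
    char buf[64];

    assert(class_reference_symbol("NSConstantString", buf, sizeof buf) == 0);
    assert(strcmp(buf, "__NSConstantStringClassReference") == 0);

    /* "MyConstantString" is an invented class name. */
    assert(class_reference_symbol("MyConstantString", buf, sizeof buf) == 0);
    assert(strcmp(buf, "__MyConstantStringClassReference") == 0);
}
```

The point of the sketch is only that the symbol depends on the class name, whereas the behaviour reported here is a fixed reference to ___CFConstantStringClassReference regardless of the flag.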
This bug still exists in the latest revision. I looked at the code and it seems that CGObjCMac.cpp does not even check whether -fconstant-string-class has been specified. I looked at the same code in CGObjCGNU.cpp and it seems to do a LOT more. Maybe the code from CGObjCGNU.cpp can just be copied?
Anyway, can you please fix that? This breaks clang on Mac completely for anything that is not Cocoa.
This is mostly implemented in TOT, with the following patches:
http://llvm.org/viewvc/llvm-project?view=rev&revision=102112
http://llvm.org/viewvc/llvm-project?view=rev&revision=102130
http://llvm.org/viewvc/llvm-project?view=rev&revision=102189
http://llvm.org/viewvc/llvm-project?view=rev&revision=102219
http://llvm.org/viewvc/llvm-project?view=rev&revision=102223
And a test case:
http://llvm.org/viewvc/llvm-project?view=rev&revision=102357
Currently, to get this feature, you must pass -fno-constant-cfstrings directly to the compiler using -cc1 options, as in:
clang -cc1 -fno-constant-cfstrings .... file.m
You may ask why? Well, because I haven't figured out yet how to pass this option directly to the driver part of the compiler. If you know how, a patch is most welcome :).
Addition of patch: http://llvm.org/viewvc/llvm-project?view=rev&revision=102431
(by Daniel) completes this PR.
Now it always references NSConstantString and ignores what's specified by -fconstant-string-class, leading to not even compiling it now, as it can't find the interface for NSConstantString.
(In reply to comment #7)
> Now it always references NSConstantString and ignores what's specified by
> -fconstant-string-class, leading to not even compiling it now, as it can't
> find the interface for NSConstantString.
Darwin does not yet support -fconstant-string-class. Something I need to look at when I get the chance.
Using the latest revision (compiled 5 minutes ago), this problem still exists and prevents any code that uses a different constant string class from compiling. Even worse, when NSConstantString is available, it just uses it without a warning. And it also issues a warning about -fobjc-exceptions now.
Just installed clang on a NetBSD 5.1 box where it has the same bug. So, it is now far worse than before the initial report of this bug. This makes clang not only completely unusable for ObjC on OS X, but on every platform.
Sorry, I promised a while back to finish up the feature. It is not a feature that very many people use, so it ended up at the end of a long queue of feature requests. I will implement it for the next DP Rev.
(In reply to comment #10)
> Just installed clang on a NetBSD 5.1 box where it has the same bug. So, it is
> now far worse than before the initial report of this bug. This makes clang not
> only completely unusable for ObjC on OS X, but on every platform.
Thanks :).
Well, basically, you need it for everything that does not use Cocoa. That might be few people on OS X, true, but as it seems it affects other systems as well, it should affect most people there, as there's no Cocoa.
Remaining work is implemented in TOT: http://llvm.org/viewvc/llvm-project?view=rev&revision=116819
It seems that compilation works now. However, it seems clang breaks it as soon as umlauts etc. are in the string.
Consider a UTF-8 encoded file containing the string @"ä". The C string generated by clang should be "ä", where a C string just copies the encoding from the source file. Thus the C string should be { 0xC3, 0xA4, 0x00 } and the length 2. However, clang seems to create garbage here now. This used to work in the past.
I'm not sure if this is caused by the recent change, but I assume so. If it's not, feel free to close and I will create a new bug report.
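The expected bytes quoted above can be checked with a short C sketch. It encodes U+00E4 (ä) as UTF-8 and confirms the two payload bytes 0xC3 0xA4. The function name `utf8_encode_bmp` is made up for this illustration, and it only handles code points up to U+FFFF without validating surrogates; it is not how clang itself produces the string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative helper: encode one code point (assumed <= U+FFFF,
   surrogates not rejected) as UTF-8. Returns the number of bytes
   written to out, which must have room for at least 3 bytes. */
static size_t utf8_encode_bmp(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {                     /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                    /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* U+00E4 (a with diaeresis) should give the bytes 0xC3 0xA4, length 2. */
static void check_a_umlaut(void)
{
    unsigned char buf[3];
    size_t len = utf8_encode_bmp(0x00E4, buf);

    assert(len == 2);
    assert(buf[0] == 0xC3 && buf[1] == 0xA4);
}
```

With the terminating NUL appended, this gives the { 0xC3, 0xA4, 0x00 } sequence and the length of 2 described in the comment.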
This is unrelated to recent changes. The following test case has the same problem:
#include <stdio.h>
#include <objc/objc.h>
#include <objc/Object.h>
int main () {
NSLog(@"ä\n");
return 0;
}
_.str:
.asciz "\344\000\n\000\000"
.section __DATA,__cfstring
.align 4 ## @_unnamed_cfstring_
L__unnamed_cfstring_:
.quad ___CFConstantStringClassReference
.long 2000 ## 0x7d0
.space 4
.quad _.str
.quad 2
Please file a new bug report. I don't think that this ever worked. I tried an older version of clang with same results.
(In reply to comment #14)
> It seems that compilation works now. However, it seems clang breaks it as soon
> as umlauts etc. are in the string.
>
> Consider an UTF-8 encoded file containing the string @"ä". The C string
> generated by clang should be "ä", where a C string just copies the encoding
> from the source file. Thus the C string should be { 0xC3, 0xA4, 0x00 } and the
> length 2. However, clang seems to create garbage here now. This used to work
> in the past.
>
> I'm not sure if this is caused by the recent change, but I assume so. If it's
> not, feel free to close and I will create a new bug report.
Oh, this definitely worked before, because all my tests ran with clang in the past. And @"" literals with UTF-8 were used all the time.
Anyway, I looked into this and this is a real problem. If it's an ASCII string, everything works fine. If it contains any non-ASCII characters, clang creates a Unicode string. However, it does not indicate ANYWHERE that the string is Unicode!
PS: All of the problems mentioned in this ticket are new, as some time ago, it all worked just fine. The encoding bug seems to be very recent and I'm very sure this has to do with the whole -fconstant-string-class thing being changed / rewritten.
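As a rough illustration of the "Unicode string" observation, the bytes from the .asciz directive in the earlier listing ("\344\000\n\000\000", where octal 344 is 0xE4) can be read as 16-bit little-endian code units. The sketch below is only a way of inspecting those bytes; the helper name `utf16le_unit` is invented, and the interpretation as UTF-16 is an inference from the byte pattern and the length of 2, not something stated by the compiler output itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes copied from the .asciz directive in the listing:
   "\344\000\n\000\000" (the assembler adds one more NUL after these). */
static const unsigned char emitted[] = { 0xE4, 0x00, 0x0A, 0x00, 0x00 };

/* Read the i-th 16-bit little-endian code unit from a byte buffer.
   The caller must ensure bytes[2*i + 1] is in range. */
static uint16_t utf16le_unit(const unsigned char *bytes, size_t i)
{
    return (uint16_t)(bytes[2 * i] | (bytes[2 * i + 1] << 8));
}

/* Two code units, matching the ".quad 2" length field:
   U+00E4 (a with diaeresis) followed by U+000A (newline). */
static void check_emitted_units(void)
{
    assert(utf16le_unit(emitted, 0) == 0x00E4);
    assert(utf16le_unit(emitted, 1) == 0x000A);
}
```

Read this way, the emitted data holds two 16-bit units rather than the three UTF-8 bytes 0xC3 0xA4 0x0A, which is consistent with a runtime that expects a plain C string seeing garbage.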
(In reply to comment #16)
> Oh, this definitely worked before, because all my tests ran with clang in the
> past. And @"" literals with UTF-8 were used all the time.
I just showed a test case which has *nothing* to do with -fconstant-string-class
stuff and showed the string encoding was incorrect. Maybe it worked when targeting
the GNU runtime but not the NeXT runtime.
>
> Anyway, I looked into this and this is a real problem. If it's an ASCII
> string, everything works fine. If it contains any non-ASCII characters, clang
> creates a Unicode string. However, it does not indicate ANYWHERE that the
> string is Unicode!
Yes, this is the problem. But I don't think that it ever worked.
>
> PS: All of the problems mentioned in this ticket are new, as some time ago, it
> all worked just fine. The encoding bug seems to be very recent and I'm very
> sure this has to do with the whole -fconstant-string-class thing being changed
> / rewritten.
Please show a test case and some proof that it worked before. I just showed a test case which produced the same encoding before and after -fconstant-string-class change.
Please open a new bug for this new issue.