llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

`__CHAR16_TYPE__` and `__CHAR32_TYPE__` are defined on MacOS but they don't actually work #41443

Open MarcusJohnson91 opened 5 years ago

MarcusJohnson91 commented 5 years ago
Bugzilla Link 42098
Version unspecified
OS MacOS X
CC @DougGregor,@zygoloid

Extended Description

The macOS SDK does not include the <uchar.h> header, but the compiler itself still defines these macros. An #ifdef check on them therefore succeeds even though char16_t and char32_t are never declared, which breaks my code.

I filed a bug report with Apple about it (bug #51322492), but they don't seem to care.

Example code to get the bug to appear:

#include <stdint.h>

#ifdef __CHAR32_TYPE__
typedef char32_t UTF32; /* error on macOS: char32_t is never declared without <uchar.h> */
#else
typedef uint_least32_t UTF32;
#endif /* __CHAR32_TYPE__ */

ec04fc15-fa35-46f2-80e1-5d271f2ef708 commented 5 years ago

So I guess the issue is Apple not providing the uchar.h header.

Ah, OK, yes I guess so.

Is there [a feature test macro] for Unicode types as well?

The C standard only provides feature test macros for features that it considers to be optional. It considers char16_t, char32_t, and <uchar.h> to be mandatory features required for conformance, so unfortunately there's no standard feature test macro. (In an ideal world you could test __STDC_VERSION__, but that will just parrot back to you that you said -std=c11 and won't take into account that <uchar.h> is missing.)
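
To illustrate the distinction, here is a minimal sketch of how the optional-feature macros can be tested. The helper names (threads_claimed, complex_claimed, claimed_c_version) are made up for this example; only the __STDC_* macros come from the C standard, and there is no counterpart macro for <uchar.h>.

```c
/* Hypothetical helpers; only the __STDC_* macros are standard. */

/* 1 if the implementation does not opt out of <threads.h>, else 0. */
static int threads_claimed(void) {
#ifdef __STDC_NO_THREADS__
    return 0;
#else
    return 1;
#endif
}

/* 1 if the implementation does not opt out of <complex.h>, else 0. */
static int complex_claimed(void) {
#ifdef __STDC_NO_COMPLEX__
    return 0;
#else
    return 1;
#endif
}

/* The language version the compiler claims, or 0 if it predates C95. */
static long claimed_c_version(void) {
#ifdef __STDC_VERSION__
    return __STDC_VERSION__;
#else
    return 0L;
#endif
}
```

With -std=c11, claimed_c_version() reports 201112L whether or not <uchar.h> is actually installed, which is the limitation described above.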

Do you know of a more direct way to check if those types exist instead of just excluding an entire class of OSes?

Perhaps:

#ifdef __has_include
#if __has_include(<uchar.h>)
#include <uchar.h>
typedef char32_t UTF32;
#define HAVE_UTF32
#endif
#endif

#if !defined(HAVE_UTF32) && defined(__CHAR32_TYPE__)
typedef __CHAR32_TYPE__ UTF32;
#define HAVE_UTF32
#endif

#if !defined(HAVE_UTF32)
#include <stdint.h>
typedef uint_least32_t UTF32;
#define HAVE_UTF32
#endif

#if !defined(HAVE_UTF32)
#error "could not determine a suitable char32_t type"
#endif

... or something like that? (Though just using uint_least32_t unconditionally would be a lot simpler, and it's required to be the same type as char32_t on any conforming C11 implementation.)
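
For comparison, the unconditional approach could be sketched like this. It assumes a C11 compiler (for _Static_assert and _Generic) and is meant to be compiled as C, not C++. The helper utf32_holds_max_code_point is hypothetical and only exercises the typedef; the __has_include part is optional and merely cross-checks the claim about char32_t on systems where <uchar.h> exists.

```c
#include <limits.h>
#include <stdint.h>

/* No header probing needed: <stdint.h> is enough. */
typedef uint_least32_t UTF32;

/* UTF32 must be wide enough for every code point up to U+10FFFF. */
_Static_assert(sizeof(UTF32) * CHAR_BIT >= 32, "UTF32 is narrower than 32 bits");

/* Where <uchar.h> does exist, confirm that char32_t and UTF32 are one type. */
#ifdef __has_include
#if __has_include(<uchar.h>)
#include <uchar.h>
_Static_assert(_Generic((char32_t)0, UTF32: 1, default: 0),
               "char32_t is not the same type as uint_least32_t");
#endif
#endif

/* Hypothetical helper: stores the largest code point and reads it back. */
static int utf32_holds_max_code_point(void) {
    UTF32 cp = 0x10FFFF;
    return cp == 0x10FFFF;
}
```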

MarcusJohnson91 commented 5 years ago

Hey Richard, thanks for responding.

__CHAR16_TYPE__ and __CHAR32_TYPE__ are intended to be used by the implementation of <uchar.h> in order to define char16_t and char32_t. They do not indicate whether the types char16_t and char32_t are already defined

Makes sense

it would make sense to not define them in language modes where those types don't exist. GCC only defines them in C11 and C++11 onwards and Clang should do the same.

That's not the issue; I'm passing -std=c11 in all of my projects.

So I guess the issue is Apple not providing the uchar.h header.

I was looking over the C standard yesterday and there are a few feature test macros like __STDC_NO_COMPLEX__, __STDC_NO_THREADS__, etc.

Is there one for Unicode types as well?

Also, I tried playing around with using the preprocessor to detect if those types existed but that doesn't work either.

Do you know of a more direct way to check if those types exist instead of just excluding an entire class of OSes?

ec04fc15-fa35-46f2-80e1-5d271f2ef708 commented 5 years ago

Regarding comment#0:

__CHAR16_TYPE__ and __CHAR32_TYPE__ are intended to be used by the implementation of <uchar.h> in order to define char16_t and char32_t. They do not indicate whether the types char16_t and char32_t are already defined; instead, they specify what unsigned integer type should be used for 16- / 32-bit characters. (By definition, they are the same types as uint_least16_t and uint_least32_t, respectively, per C11 7.28/2.)

If for whatever reason you don't want to (or can't) use <uchar.h> to obtain those types, and don't want to unconditionally use uint_least16_t / uint_least32_t, the appropriate usage would be:

#ifdef __CHAR32_TYPE__
typedef __CHAR32_TYPE__ UTF32;
#else
typedef uint_least32_t UTF32;
#endif
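
Whether the two spellings really name the same type can be checked at compile time. A minimal sketch, assuming a C11 compiler for _Generic and compilation as C; the UTF16 typedef and the two function names are hypothetical additions for illustration:

```c
#include <stdint.h>

/* Use the compiler's underlying types when it advertises them. */
#ifdef __CHAR16_TYPE__
typedef __CHAR16_TYPE__ UTF16;
#else
typedef uint_least16_t UTF16;
#endif

#ifdef __CHAR32_TYPE__
typedef __CHAR32_TYPE__ UTF32;
#else
typedef uint_least32_t UTF32;
#endif

/* 1 if UTF16 is the very same type as uint_least16_t, else 0. */
static int utf16_matches_uint_least16(void) {
    return _Generic((UTF16)0, uint_least16_t: 1, default: 0);
}

/* 1 if UTF32 is the very same type as uint_least32_t, else 0. */
static int utf32_matches_uint_least32(void) {
    return _Generic((UTF32)0, uint_least32_t: 1, default: 0);
}
```

On an implementation where the statement above holds, both functions return 1, so either branch of the #ifdef yields the same type.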

Regarding comment#1:

I think this is a bug; __STDC_UTF_16__ and __STDC_UTF_32__ describe the behavior of char16_t and char32_t, and it would make sense to not define them in language modes where those types don't exist. GCC only defines them in C11 and C++11 onwards and Clang should do the same.
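
For reference, a small sketch of how code might consult these two macros. They say something about the encoding of char16_t / char32_t values, not about whether <uchar.h> is present; the function names here are hypothetical.

```c
/* 1 if the implementation states that char16_t values are UTF-16 encoded. */
static int char16_is_utf16(void) {
#ifdef __STDC_UTF_16__
    return 1;
#else
    return 0;
#endif
}

/* 1 if the implementation states that char32_t values are UTF-32 encoded. */
static int char32_is_utf32(void) {
#ifdef __STDC_UTF_32__
    return 1;
#else
    return 0;
#endif
}
```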

MarcusJohnson91 commented 5 years ago

Forgot to mention:

__STDC_UTF_16__ and __STDC_UTF_32__ are also defined when they shouldn't be