AbsInt / CompCert

The CompCert formally-verified C compiler
https://compcert.org
Other
1.88k stars 228 forks source link

Incompatibility of wide string literals with unsigned versions of type `wchar_t` #475

Closed monniaux closed 1 year ago

monniaux commented 1 year ago

version 8a558403 configured for aarch64-linux

#include <wchar.h>
wchar_t toto[] = L"Höhe";

fails

wchar.c:2: error: type mismatch between array destination and wide string literal
xavierleroy commented 1 year ago

Can you attach the preprocessed file? (The output of ccomp -E wchar.c.)

monniaux commented 1 year ago

Certainly. For what it's worth it's on Ubuntu 20.04 AArch64, but the same occurs also if cross-compiled from x86 Linux.

wchar.txt

The relevant part is

typedef signed int wchar_t;
m-schmidt commented 1 year ago

I'm currently not able to reproduce this behaviour with the current version of CompCert.

monniaux commented 1 year ago
$ ./ccomp -S wchar.c 
wchar.c:2: error: type mismatch between array destination and wide string literal
1 error detected.
$ git rev-parse --short HEAD
fc79b13a
m-schmidt commented 1 year ago

Ah, the code that produces the error is (in the attached wchar.txt):

wchar_t toto[]=U"Hähnchen";

while the example in Comment 1 above:

wchar_t toto[] = L"Höhe";

produces no error.

CompCert treats U literals as unsigned int. Therefore it clashes with a signed wchar_t. ISO C 6.4.4.4 paragraph 11 has the details:

A wide character constant prefixed by the letter L has type wchar_t, an integer
type defined in the <stddef.h> header; a wide character constant prefixed by the
letter u or U has type char16_t or char32_t, respectively, unsigned integer
types defined in the <uchar.h> header.

Can you test whether the following works on Ubuntu:

#include <uchar.h>
char32_t toto[] = U"Höhe";
monniaux commented 1 year ago
$ cat > toto.c
#include <uchar.h>
char32_t toto[] = U"Höhe";

$ ./chamois/ccomp -S toto.c 
$

Ok. This is AArch64 Ubuntu 20.04 LTS

m-schmidt commented 1 year ago

Then we can close this issue?

monniaux commented 1 year ago

Well, isn't

wchar_t toto[]=L"xxx";

supposed to work?

m-schmidt commented 1 year ago

Yes, and it works for me. Your original code in comment 1 is not the one from the attached wchar.txt.

monniaux commented 1 year ago

Here is what happens on a Raspberry Pi :

$~/CompCert$ cat > toto.c
#include <wchar.h>
wchar_t toto[]=L"xxx";
~/CompCert$ ./ccomp -S toto.c
toto.c:2: error: type mismatch between array destination and wide string literal
1 error detected.
monniaux@miete:~/CompCert$ cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.5 LTS"
~/CompCert$ uname -a
Linux XXXXX 5.4.0-1056-raspi #63-Ubuntu SMP PREEMPT Mon Mar 14 07:48:06 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
~/CompCert$ git rev-parse --short HEAD
fc79b13a
$ ./ccomp -E toto.c -o toto.txt

toto.txt

m-schmidt commented 1 year ago

The problem is not specific for AArch64 or Linux. It's simply the signedness of wchar_t which is platform specific. All our test systems so far seem to have a signed wchar_t. Unlike comment 3 indicates, the compile error occurs when wchar_t is unsigned.

I will adapt the issue title.

xavierleroy commented 1 year ago

Was closed by mistake, reopening.

xavierleroy commented 1 year ago

This should be fixed by 2db601dc7c8055d2b666b14d57dc6d5d1c7cc077 . Note, however, that this particular issue didn't show up in normal use of CompCert, since CompCert's own <stddef.h> header defined wchar_t consistently with what the compiler expects for wide string literals. (The definition was not always compliant with the ABI, however.) The issue triggers because of

~/CompCert$ ./ccomp -S toto.c

When invoking ccomp before installation, always give the -stdlib option, so that standard headers and libraries can be found:

~/CompCert$ ./ccomp -stdlib ./runtime  -S toto.c