Problem:
On Windows, wchar_t is 2 bytes by default.
On Linux, wchar_t is 4 bytes by default.
The size can be overridden at compile time: the -fshort-wchar flag forces wchar_t to be 2 bytes. However, this has serious consequences. Hardware vendors warn that all linked objects, including libraries, must use the same wchar_t size. Linking an object file compiled with -fshort-wchar against one compiled without it is therefore either impossible or, at the very least, unsafe. It is not clear what happens when a library is loaded dynamically, but the question matters because dynamic linking is a real scenario for C# with NativeAOT.
This makes wchar_t, by default, a poor fit for single-source cross-platform bindings.
Worse, C# strings are UTF-16, where char is 2 bytes. This means that some marshalling, either by hand or otherwise, is needed to pass wchar_t strings correctly between C and C# on Linux.
Microsoft has discussed introducing a UTF8String type, but this would only help with the interoperability of char*, not wchar_t*.
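To make the size mismatch concrete, here is a minimal C sketch that reports the wchar_t width a translation unit was actually compiled with. The function names `wchar_width_bytes` and `wchar_matches_csharp_char` are hypothetical and are not part of C2CS; they only illustrate the kind of check a native library could expose to its bindings.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper (not part of C2CS): reports the width of wchar_t,
   in bytes, that this translation unit was compiled with. */
size_t wchar_width_bytes(void)
{
    return sizeof(wchar_t);
}

/* Returns 1 when wchar_t has the same width as a C# char (2 bytes), so a
   UTF-16 buffer could be passed through unchanged; returns 0 otherwise. */
int wchar_matches_csharp_char(void)
{
    return sizeof(wchar_t) == 2;
}
```

A bindings layer could call such a function once at startup and refuse to pass wchar_t* strings directly when the widths differ. Because the result depends on the flags each object file was built with, the check only describes the library that exports it.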
Options:
Enforce the -fshort-wchar compiler flag for all users of C2CS so that wchar_t is guaranteed to be 2 bytes. The consequence is that users will need to re-compile their C code to be compliant.
Use hand-written marshalling or an ICustomMarshaler with different implementations for Windows and Linux, so that a single C# .cs bindings file works correctly on both platforms when passing wchar_t* between C# and C.
Warn users of C2CS that wchar_t* falls into the same category as pointers, and thus a different .cs bindings file needs to be generated for each ABI. For example, separate .cs files would be generated for Windows and Linux, each with the correct wchar_t usage.
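For the marshalling option, the core of the work is widening UTF-16 code units into whatever wchar_t is on the target platform. The sketch below shows that conversion in C under stated assumptions: `utf16_to_wchar` is a hypothetical helper, not something C2CS provides, and a real marshaller would live on the C# side and would also need the reverse direction. Unpaired surrogates are copied through unchanged here rather than rejected.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical helper (not part of C2CS): converts `src_len` UTF-16 code
   units into wchar_t units. Returns the number of wchar_t units written,
   or (size_t)-1 if `dst` is too small. */
size_t utf16_to_wchar(const uint16_t *src, size_t src_len,
                      wchar_t *dst, size_t dst_cap)
{
    size_t out = 0;
    for (size_t i = 0; i < src_len; i++) {
        uint32_t unit = src[i];
        /* With a 4-byte wchar_t, merge a valid surrogate pair into a single
           code point. With a 2-byte wchar_t, code units are copied as-is. */
        if (sizeof(wchar_t) >= 4 &&
            unit >= 0xD800 && unit <= 0xDBFF && i + 1 < src_len &&
            src[i + 1] >= 0xDC00 && src[i + 1] <= 0xDFFF) {
            unit = 0x10000 + ((unit - 0xD800) << 10) + (src[i + 1] - 0xDC00);
            i++; /* the low surrogate has been consumed */
        }
        if (out >= dst_cap) {
            return (size_t)-1;
        }
        dst[out++] = (wchar_t)unit;
    }
    return out;
}
```

The same source compiles to a plain copy where wchar_t is 2 bytes and to a UTF-16 to UTF-32 conversion where it is 4 bytes, which is the behaviour a per-platform marshaller would have to reproduce.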