llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

Add documentation specifying the position of the libc project with respect to wide-character and multibyte-character support #59290

Open sivachandra opened 1 year ago

sivachandra commented 1 year ago

We want to start with the position that:

  1. The wide-char encoding will be whatever the compiler uses. On most platforms that is UTF-32; on Windows it is UTF-16.
  2. Multi-byte to wide-char conversions will assume that multi-byte character strings are UTF-8 encoded, irrespective of the current locale.
  3. Wide-char to multi-byte conversions will produce UTF-8 encoded multi-byte strings, irrespective of the current locale.
  4. Explicit conversions will do exactly what their names state. For example, a wide-char to char16_t string conversion will convert from the wide-char encoding to UTF-16.
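
To make points 2 and 4 concrete, here is a minimal sketch in C of a locale-independent UTF-8 decode step and a UTF-16 encode step. The helper names `utf8_decode` and `utf16_encode` are hypothetical and exist only for illustration; this is not the libc implementation, only one way the stated behavior could look.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an LLVM libc API): decode one UTF-8 sequence of
   at most n bytes from s into *cp. The current locale is never consulted.
   Returns the number of bytes consumed, or 0 if the sequence is invalid
   or truncated. */
static size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp) {
  assert(s != NULL && cp != NULL);
  if (n == 0)
    return 0;

  unsigned char b0 = s[0];
  size_t len;
  uint32_t v, min; /* min is the smallest code point allowed for this length */

  if (b0 < 0x80) {
    *cp = b0;
    return 1;
  } else if ((b0 & 0xE0) == 0xC0) {
    len = 2; v = b0 & 0x1F; min = 0x80;
  } else if ((b0 & 0xF0) == 0xE0) {
    len = 3; v = b0 & 0x0F; min = 0x800;
  } else if ((b0 & 0xF8) == 0xF0) {
    len = 4; v = b0 & 0x07; min = 0x10000;
  } else {
    return 0; /* stray continuation byte or invalid lead byte */
  }

  if (n < len)
    return 0;
  for (size_t i = 1; i < len; ++i) {
    if ((s[i] & 0xC0) != 0x80)
      return 0; /* not a continuation byte */
    v = (v << 6) | (uint32_t)(s[i] & 0x3F);
  }

  /* Reject overlong forms, surrogates, and values beyond U+10FFFF. */
  if (v < min || v > 0x10FFFF || (v >= 0xD800 && v <= 0xDFFF))
    return 0;

  *cp = v;
  return len;
}

/* Hypothetical helper: encode one code point as UTF-16, as needed when the
   target is char16_t or a 16-bit wchar_t. Returns the number of 16-bit
   units written (1 or 2), or 0 if cp is not a valid scalar value. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2]) {
  if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return 0;
  if (cp < 0x10000) {
    out[0] = (uint16_t)cp;
    return 1;
  }
  cp -= 0x10000;
  out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate */
  out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate */
  return 2;
}
```

With a 32-bit wchar_t the decoded code point could be stored directly; with a 16-bit wchar_t, a code point outside the Basic Multilingual Plane would need the surrogate pair produced by the second helper.
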
llvmbot commented 1 year ago

@llvm/issue-subscribers-libc