microsoft / STL

MSVC's implementation of the C++ Standard Library.

<format>: Grapheme clusterization #1945

Closed barcharcraz closed 2 years ago

barcharcraz commented 3 years ago

<format> should do grapheme clusterization when determining the width of a string for padding and alignment purposes. This is a quality-of-implementation issue, and something we should fix sometime in the C++20 timeframe.

We could implement the clusterization ourselves, or use ICU. Using ICU is perhaps easier, but it means "real" function calls all over the place, whereas doing it ourselves could mean quite a lot of code in user executables.

fsb4000 commented 3 years ago

As a user, I would prefer "doing it ourselves" because otherwise we lose support for Windows 7 and Windows 8.1...

barcharcraz commented 3 years ago

Yeah, and we'd then have three different possible behaviors.

Unfortunately, doing it ourselves means shipping the Unicode properties with the STL, either in a header, a satellite DLL, or the import lib. This is rough.

barcharcraz commented 3 years ago

We've decided we'll just do it ourselves, compiling the Unicode properties file (from Unicode 13.0) into a static data structure and shipping that data structure as an inline constexpr literal in <format>. This way things will work on old versions of Windows, and we don't have to deal with the extremely slow behavior of ICU's C iterators.
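As a rough illustration of what "a static data structure shipped as an inline constexpr literal" could look like, here is a minimal sketch: a sorted table of code point ranges with a binary-search lookup. Every name here (`break_property`, `property_range`, `break_ranges`, `lookup_break_property`) is hypothetical, the table is a tiny hand-picked excerpt rather than the full Unicode 13.0 data, and nothing in this thread says the real table in <format> is laid out this way.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>

// Hypothetical names; this is an illustration, not the layout <format> ships.
enum class break_property : std::uint8_t { other, cr, lf, control, extend, zwj };

struct property_range {
    char32_t first;      // first code point of the range (inclusive)
    char32_t last;       // last code point of the range (inclusive)
    break_property prop; // property shared by every code point in the range
};

// A small excerpt of Grapheme_Cluster_Break ranges, sorted by `first`.
// A real table would be generated from the Unicode properties file.
inline constexpr property_range break_ranges[] = {
    {0x0000, 0x0009, break_property::control},
    {0x000A, 0x000A, break_property::lf},
    {0x000B, 0x000C, break_property::control},
    {0x000D, 0x000D, break_property::cr},
    {0x000E, 0x001F, break_property::control},
    {0x007F, 0x009F, break_property::control},
    {0x0300, 0x036F, break_property::extend},
    {0x200D, 0x200D, break_property::zwj},
};

// Binary search over the sorted, non-overlapping ranges.
// Code points not covered by any range get `other`.
constexpr break_property lookup_break_property(char32_t cp) {
    std::size_t lo = 0;
    std::size_t hi = std::size(break_ranges);
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (cp < break_ranges[mid].first) {
            hi = mid;
        } else if (cp > break_ranges[mid].last) {
            lo = mid + 1;
        } else {
            return break_ranges[mid].prop;
        }
    }
    return break_property::other;
}

// Because the table is constexpr, lookups can be checked at compile time.
static_assert(lookup_break_property(0x0301) == break_property::extend);
```

Since everything lives in a header as constexpr data, there is no dependency on an OS-provided ICU, which is the property that matters for older versions of Windows.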

The "compiler" from the text properties file to our data structure will be in the STL repo, but won't be run as part of normal builds since the Unicode properties update so rarely.
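To give a feel for the first step of such a "compiler", here is a minimal sketch that parses one line of a properties file. It assumes the line format used by Unicode Character Database files such as GraphemeBreakProperty.txt (`XXXX..YYYY ; Property # comment`, or a single code point instead of a range); the thread does not say which file or which language the actual tool uses, and `parsed_range`, `trim`, `parse_hex`, and `parse_line` are hypothetical names.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type for one data line of a properties file.
struct parsed_range {
    std::uint32_t first;  // first code point (inclusive)
    std::uint32_t last;   // last code point (inclusive)
    std::string property; // property name, e.g. "Extend"
};

// Strip leading and trailing spaces, tabs, and carriage returns.
inline std::string_view trim(std::string_view s) {
    const auto begin = s.find_first_not_of(" \t\r");
    if (begin == std::string_view::npos) {
        return {};
    }
    const auto end = s.find_last_not_of(" \t\r");
    return s.substr(begin, end - begin + 1);
}

// Parse 1 to 6 hexadecimal digits; reject anything else.
inline std::optional<std::uint32_t> parse_hex(std::string_view s) {
    if (s.empty() || s.size() > 6) {
        return std::nullopt;
    }
    std::uint32_t value = 0;
    for (const char c : s) {
        std::uint32_t digit = 0;
        if (c >= '0' && c <= '9') {
            digit = static_cast<std::uint32_t>(c - '0');
        } else if (c >= 'A' && c <= 'F') {
            digit = static_cast<std::uint32_t>(c - 'A') + 10;
        } else if (c >= 'a' && c <= 'f') {
            digit = static_cast<std::uint32_t>(c - 'a') + 10;
        } else {
            return std::nullopt;
        }
        value = value * 16 + digit;
    }
    return value;
}

// Returns nullopt for blank lines, comment-only lines, and malformed lines.
inline std::optional<parsed_range> parse_line(std::string_view line) {
    // Drop the trailing "# ..." comment first; it may itself contain "..".
    line = trim(line.substr(0, line.find('#')));
    if (line.empty()) {
        return std::nullopt;
    }
    const auto semi = line.find(';');
    if (semi == std::string_view::npos) {
        return std::nullopt;
    }
    const std::string_view code_points = trim(line.substr(0, semi));
    const std::string_view property = trim(line.substr(semi + 1));
    if (property.empty()) {
        return std::nullopt;
    }
    // Either "XXXX..YYYY" or a single "XXXX".
    const auto dots = code_points.find("..");
    const auto first = parse_hex(code_points.substr(0, dots));
    const auto last = dots == std::string_view::npos
                          ? first
                          : parse_hex(code_points.substr(dots + 2));
    if (!first || !last || *first > *last) {
        return std::nullopt;
    }
    return parsed_range{*first, *last, std::string(property)};
}
```

A generator along these lines would collect the parsed ranges, sort and merge them, and print them out as a constexpr table to be checked into the repo by hand whenever the Unicode data changes.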