P2728R0: Unicode in the Library, Part 1: UTF Transcoding
P2729R0: Unicode in the Library, Part 2: Normalization
2023-02-07 19:30 to 22:00 UTC-8 Issaquah Library Evolution Minutes
Champion: Zach Laine (IP)
Chair: Bryce Adelstein Lelbach (IP) & Ben Craig (IP)
Minute Taker: Robert Leahy (IP)
Start: 2023-02-07 19:41 UTC-8
Does this paper have:
Open Questions:
Typo in P2728 section 2: "3 UTF-8 code units in sequence may encode a particular code unit" -> the second "code unit" should be "code point".
Typo in P2729 section 4.2: `is_normalized` calls in the examples should take the format.
Typo in P2729 section 5.2: Unicode versions should have types.
Why `utf_8_to_16_iterator` instead of `utf8_to_16_iterator`? Why not use a template parameter for the sizes?
Should formats be enumerators, or should each be its own trivial type?
Maybe the fast but verbose code example shouldn't be the first one in the paper.
Transcoding iterators should model the iterator category of the underlying iterator.
Unicode version should be queried with runtime functions, not constexpr variables.
Why use template parameters for normalization forms but not UTFs? I'd prefer consistency.
End: 21:56
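The open question above about whether formats should be enumerators or individual trivial types can be made concrete with a small sketch. Everything in it is hypothetical: the `demo` namespace, `format`, `unit_size`, `unit_size_of`, and the tag types are illustrative names only and are not taken from P2728 or P2729.

```cpp
#include <cassert>

namespace demo {  // hypothetical namespace; none of these names come from the papers

// Design A: one enumeration, with the format passed as a non-type template parameter.
// The enumerator values are chosen here to equal the code unit size in bytes.
enum class format { utf8 = 1, utf16 = 2, utf32 = 4 };

template <format F>
constexpr int unit_size() { return static_cast<int>(F); }

// Design B: each format is its own trivial (empty) tag type, selected by overloading.
struct utf8_t {};
struct utf16_t {};
struct utf32_t {};

constexpr int unit_size_of(utf8_t) { return 1; }
constexpr int unit_size_of(utf16_t) { return 2; }
constexpr int unit_size_of(utf32_t) { return 4; }

}  // namespace demo

static_assert(demo::unit_size<demo::format::utf16>() == 2, "enumerator design");
static_assert(demo::unit_size_of(demo::utf16_t{}) == 2, "tag type design");
```

With enumerators, the format is a value that a template parameter can carry, which is also one way a "template parameter for the sizes" could look. With tag types, each format can be extended or overloaded on independently.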
We took an early look at P2728 and P2729, which propose Unicode facilities for the C++ Standard Library. The proposal includes both low-level facilities that should have speed-of-light performance and higher-level facilities that are composable and easy to use (such as views and ranges).
Proceed with review and incubation in the Text and Unicode study group.
@tahonermann please send this to Library Evolution when it's ready.
SG16 reviewed P2728R0 (Unicode in the Library, Part 1: UTF Transcoding) during its 2023-03-22 and 2023-04-12 meetings. The following polls were taken during the latter meeting.
Attendees: 10 (3 abstentions)
SF | F | N | A | SA |
---|---|---|---|---|
4 | 2 | 0 | 1 | 0 |
`charN_t` types, with support for other types provided by adapters, possibly with a special case for `char` and `wchar_t` when their associated literal encodings are UTF.
Attendees: 9 (2 abstentions)
SF | F | N | A | SA |
---|---|---|---|---|
5 | 1 | 0 | 0 | 1 |
`char32_t` should be used as the Unicode code point type within the C++ standard library implementations of Unicode algorithms.
Attendees: 9 (2 abstentions)
SF | F | N | A | SA |
---|---|---|---|---|
6 | 0 | 1 | 0 | 0 |
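As a minimal sketch of what using `char32_t` as the code point type means in practice, the function below combines three UTF-8 code units into one code point, which is also the case behind the "code unit" versus "code point" correction noted in the Issaquah minutes. The `demo` namespace and `decode3` are hypothetical names, `unsigned char` stands in for a UTF-8 code unit type, and the function performs no validation. It is not the transcoding interface proposed in P2728.

```cpp
#include <cassert>

namespace demo {  // hypothetical namespace and function, for illustration only

// Combines one three-code-unit UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx)
// into a single code point. Assumes the input is well-formed; a real decoder
// must also reject overlong forms, surrogates, and bad continuation units.
constexpr char32_t decode3(unsigned char a, unsigned char b, unsigned char c) {
    return (static_cast<char32_t>(a & 0x0F) << 12) |
           (static_cast<char32_t>(b & 0x3F) << 6) |
           static_cast<char32_t>(c & 0x3F);
}

}  // namespace demo

// U+20AC EURO SIGN is encoded in UTF-8 as the code units E2 82 AC.
static_assert(demo::decode3(0xE2, 0x82, 0xAC) == U'\u20AC',
              "three code units, one code point");
```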
Further action on this paper is now pending an updated revision.
SG9 (Ranges) reviewed D2728R4 during the Varna meeting on 2023-06-12 (Full Minutes).
POLL: Move null_sentinel_t to std:: namespace
SF | F | N | A | SA |
---|---|---|---|---|
1 | 3 | 1 | 0 | 0 |
# Of Authors: 1
Author’s Position: F
Attendance: 9
Outcome: Consensus in Favor
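For readers unfamiliar with the idea behind `null_sentinel_t`, the following is a rough sketch of a sentinel that compares equal to a pointer once the pointer reaches a null terminator. It is an assumption-laden illustration placed in a hypothetical `demo` namespace: the comparison operators and the `length` helper are not the paper's definition, which may differ in constraints and members.

```cpp
#include <cassert>
#include <cstddef>

namespace demo {  // hypothetical namespace; not the definition from P2728

// A sentinel that marks the end of a null-terminated sequence.
struct null_sentinel_t {
    // True when the pointer refers to a value-initialized (zero) code unit.
    template <class CharT>
    friend constexpr bool operator==(const CharT* p, null_sentinel_t) {
        return *p == CharT{};
    }
    template <class CharT>
    friend constexpr bool operator!=(const CharT* p, null_sentinel_t s) {
        return !(p == s);
    }
};

// Hypothetical helper: counts code units up to, but not including, the terminator.
template <class CharT>
std::size_t length(const CharT* first) {
    std::size_t n = 0;
    for (; first != null_sentinel_t{}; ++first) {
        ++n;
    }
    return n;
}

}  // namespace demo
```

A sentinel like this lets an algorithm walk a null-terminated string in one pass without first computing its length.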
POLL: Remove null_sentinel_t::base member function from the proposal
SF | F | N | A | SA |
---|---|---|---|---|
0 | 4 | 1 | 0 | 0 |
# Of Authors: 1
Author’s Position: F
Attendance: 8
Outcome: Consensus in Favor
POLL: utf_iterator should be a separate type and not nested within utf_view
SF | F | N | A | SA |
---|---|---|---|---|
1 | 2 | 1 | 0 | 1 |
# Of Authors: 1
Author’s Position: F
Attendance: 8
Outcome: Weak consensus in favor
SA: Having a separate type complexifies the API
SG9 will continue reviewing the paper during upcoming telecons.
SG16 reviewed P2728R6 during its 2023-08-23 meeting and its 2023-09-13 meeting. Changes are now expected before review continues, so I am adding the needs-revision label.