cplusplus / papers

ISO/IEC JTC1 SC22 WG21 paper scheduling and management

LWG2959 char_traits<char16_t>::eof is a valid UTF-16 code unit #1572

Open jwakely opened 1 year ago

jwakely commented 1 year ago

https://cplusplus.github.io/LWG/issue2959

SG16 should consider this issue that's been open for some time.

tahonermann commented 1 year ago

Be careful what you wish for; when it comes to char_traits, SG16 might just vote to burn it all to the ground!

SG16 has had its own tracking issue for this problem at https://github.com/sg16-unicode/sg16/issues/32 since 2018, and it has a fair amount of discussion. Unfortunately, it doesn’t look like there is a solution that wouldn’t require changing the type of int_type, which would be an ABI-breaking change.

I can schedule a discussion in SG16, but perhaps it would be worthwhile to discuss this with the ABI review group first.
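For readers who want to see the problem concretely, here is a minimal sketch that probes whatever standard library it is compiled against. It assumes only the standard `std::char_traits<char16_t>` members; the names `eof_audit` and `audit_char16_eof` are hypothetical helpers invented for this illustration. When `int_type` has exactly 16 bits, at least one of the two flags has to come back true: either some code unit converts to `eof()`, or two code units share an `int_type` value and one of them fails to round-trip.

```cpp
#include <cassert>
#include <climits>
#include <string>

using traits16 = std::char_traits<char16_t>;

// Hypothetical helper type for this sketch, not part of any library.
struct eof_audit {
    bool collides_with_eof = false;  // some code unit converts to eof()
    bool lossy_round_trip = false;   // some code unit does not survive
                                     // to_int_type followed by to_char_type
};

// Walk all 65536 UTF-16 code unit values and record how the
// implementation's to_int_type treats them relative to eof().
eof_audit audit_char16_eof() {
    eof_audit result;
    for (unsigned long u = 0; u <= 0xFFFFul; ++u) {
        const char16_t c = static_cast<char16_t>(u);
        const traits16::int_type i = traits16::to_int_type(c);
        if (traits16::eq_int_type(i, traits16::eof()))
            result.collides_with_eof = true;
        if (!traits16::eq(traits16::to_char_type(i), c))
            result.lossy_round_trip = true;
    }
    return result;
}
```

Which flag is set depends on the implementation, so the sketch deliberately does not predict it. A caller could check `r.collides_with_eof || r.lossy_round_trip` on the result `r` to confirm that the end-of-file sentinel is not cleanly separable from the code units.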

tahonermann commented 2 months ago

SG16 discussed this issue, along with several others, during the 2023-10-25 SG16 meeting.

The WG21 ABI group was consulted to determine if any ABI tricks could be deployed to allow for the char_traits<char16_t>::int_type type to be changed to a larger type without creating binary compatibility problems. No such tricks were identified. However, an audit of uses of the type in the C++ standard revealed that there is unlikely to be any significant use of the type or the functions that depend on it in the wild. This perspective is summarized nicely by a comment on the related SG16 issue.

I recently reached out to standard library implementors to get their perspective on simply changing char_traits<char16_t>::int_type as a DR. @jwakely conducted some tests involving packages built for the Red Hat ecosystem and was unable to identify any code that would be affected. I'll leave it to him to summarize his approach and findings. I'm going to work with libc++ maintainers to conduct some similar tests and will report back on those once available.
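To illustrate what a wider `int_type` buys, here is a rough sketch using a user-defined traits class. Everything in it is an assumption made for illustration: `wide_char16_traits`, the choice of `std::uint_least32_t`, and the `eof()` value are hypothetical and are not proposed wording or any implementation's actual change. With at least 32 bits available, `eof()` can sit outside the range of code unit values, so no code unit compares equal to it and every code unit round-trips.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical traits class for illustration only. It reuses the standard
// char16_t traits and replaces just the int_type-related members.
struct wide_char16_traits : std::char_traits<char16_t> {
    using int_type = std::uint_least32_t;  // assumption: any wider type would do

    static constexpr int_type to_int_type(char_type c) noexcept {
        return static_cast<int_type>(c);
    }
    static constexpr char_type to_char_type(int_type i) noexcept {
        return static_cast<char_type>(i);
    }
    static constexpr bool eq_int_type(int_type a, int_type b) noexcept {
        return a == b;
    }
    // All bits set: larger than any 16-bit code unit value.
    static constexpr int_type eof() noexcept {
        return static_cast<int_type>(-1);
    }
    static constexpr int_type not_eof(int_type i) noexcept {
        return eq_int_type(i, eof()) ? int_type(0) : i;
    }
};

// True if no UTF-16 code unit converts to eof() and all of them round-trip.
bool every_code_unit_is_distinct_from_eof() {
    for (unsigned long u = 0; u <= 0xFFFFul; ++u) {
        const char16_t c = static_cast<char16_t>(u);
        const wide_char16_traits::int_type i = wide_char16_traits::to_int_type(c);
        if (wide_char16_traits::eq_int_type(i, wide_char16_traits::eof()))
            return false;
        if (wide_char16_traits::to_char_type(i) != c)
            return false;
    }
    return true;
}
```

The ABI concern is that making the same change inside `std::char_traits<char16_t>` itself would alter the signature of every function that takes or returns `int_type`, which is why the question of real-world usage matters.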