Open LB-- opened 9 years ago
There is some good discussion in https://what.thedailywtf.com/t/so-about-strings-and-unicode/52250/
Also, Perl 6 seems to be handling things in the ideal way: https://6guts.wordpress.com/2015/04/12/this-week-unicode-normalization-many-rts/
Separate types and treatments for bytes, code points, and graphemes.
Certainly, UTF-8 will only be used via byte buffers (which are possibly behind a strong type alias) and signed types will be nowhere nearby. Signed code units can die in a fire.
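To make the bytes / code points / graphemes split a bit more concrete, here is a rough sketch of what the separate types could look like. Everything in it is an assumption for illustration: `Utf8Unit`, `Utf8Buffer`, `CodePoint`, `Grapheme`, `utf8_from_bytes`, and `decode` are hypothetical names, not an existing API, and grapheme segmentation itself (UAX #29) is left out.

```cpp
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <stdexcept>
#include <type_traits>
#include <vector>

// All names below are hypothetical and exist only for this sketch.

// One UTF-8 code unit: a strong type over an unsigned byte, with no
// implicit conversion to or from char or int.
enum class Utf8Unit : std::uint8_t {};
static_assert(std::is_unsigned<std::underlying_type<Utf8Unit>::type>::value,
              "code units must never be signed");

// A UTF-8 byte buffer, kept distinct from any other byte buffer.
struct Utf8Buffer {
    std::vector<Utf8Unit> units;
};

// One Unicode scalar value, kept distinct from plain integers.
struct CodePoint {
    std::uint32_t value;
};

// One user-perceived character: a run of code points.
// Producing these requires UAX #29 segmentation, which is not shown here.
struct Grapheme {
    std::vector<CodePoint> code_points;
};

// Convenience constructor for the example.
inline Utf8Buffer utf8_from_bytes(std::initializer_list<std::uint8_t> bytes) {
    Utf8Buffer buf;
    for (std::uint8_t b : bytes) buf.units.push_back(static_cast<Utf8Unit>(b));
    return buf;
}

// Crossing from bytes to code points is an explicit, validating step.
inline std::vector<CodePoint> decode(const Utf8Buffer& buf) {
    std::vector<CodePoint> out;
    const std::vector<Utf8Unit>& u = buf.units;
    for (std::size_t i = 0; i < u.size();) {
        const std::uint32_t lead = static_cast<std::uint8_t>(u[i]);
        std::size_t len = 0;
        std::uint32_t cp = 0;
        if (lead < 0x80)                { len = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (u.size() - i < len) throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k) {
            const std::uint32_t c = static_cast<std::uint8_t>(u[i + k]);
            if ((c & 0xC0) != 0x80) throw std::invalid_argument("invalid continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong encodings, surrogates, and values past U+10FFFF.
        static constexpr std::uint32_t min_for_len[] = {0, 0, 0x80, 0x800, 0x10000};
        if (cp < min_for_len[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("ill-formed UTF-8 sequence");

        out.push_back(CodePoint{cp});
        i += len;
    }
    return out;
}
```

For example, `decode(utf8_from_bytes({0x65, 0xCC, 0x81}))` takes three bytes and yields two code points (U+0065 and U+0301), which a grapheme-aware layer would then treat as a single grapheme. Three different counts for the same text is the reason for wanting three different types.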