Closed stepancheg closed 11 years ago
This looks good. What do you think about making Unicode characters and UTF-8 strings the default, so that UniChar becomes Char, UniString? becomes String?, String becomes an alias to UTF8[Vector[Byte]], StringLiteralRef becomes an alias to UTF8[ByteStringLiteralRef], etc.? You'd lose O(1) random access and size, but the world runs on UTF-8 anyway, and it'd be nice for strings to be Unicode out of the box.
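To make the trade-off concrete, here is a minimal Python sketch (not Clay, and not code from this pull request) of why a UTF-8 byte vector gives up O(1) indexing and size by code point. The helper names `code_point_at` and `code_point_count` are hypothetical and exist only for illustration.

```python
def code_point_count(utf8: bytes) -> int:
    """Count code points by scanning every byte: O(n), not O(1)."""
    # Continuation bytes look like 0b10xxxxxx; every other byte starts a code point.
    return sum(1 for byte in utf8 if byte & 0xC0 != 0x80)


def code_point_at(utf8: bytes, index: int) -> str:
    """Return the index-th code point by scanning from the start: O(n)."""
    if index < 0:
        raise IndexError(index)
    seen = -1      # index of the most recently seen code point
    start = None   # byte offset where the requested code point begins
    for pos, byte in enumerate(utf8):
        if byte & 0xC0 != 0x80:  # lead byte or ASCII byte
            seen += 1
            if seen == index:
                start = pos
            elif seen == index + 1:
                return utf8[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(index)
    return utf8[start:].decode("utf-8")  # requested code point was the last one


text = "naïve €".encode("utf-8")
byte_length = len(text)              # O(1): size of the byte vector
char_length = code_point_count(text)  # O(n): requires a full scan
third = code_point_at(text, 2)
```

The byte length stays cheap, which is why exposing the underlying byte vector alongside the Unicode view keeps O(1) operations available to code that needs them.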
@jckarter what should String? be renamed to?
Maybe ByteString (and ByteChar)?
@jckarter I need to think about it. May I push these patches in the meantime?
That seems reasonable. I didn't mean to suggest that you migrate everything to Unicode now; that's obviously going to need more work and design.
Pull request contains three commits:
- String? protocols (it is no longer overloadable; a string is a Sequence? of Char and nothing more). A lot of code assumes that String? is a sequence (has iterator).
- UniString? protocol. UniString? is like String?, but over UniChar.
- printTo for UniString?: a sequence of UniChar is now printed encoded as UTF-8, instead of as a comma-separated sequence of characters.